NetBSD-Bugs archive
Re: bin/57179 (occasional pkg_add core dumps)
The following reply was made to PR bin/57179; it has been noted by GNATS.
From: Christof Meerwald <cmeerw%cmeerw.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, riastradh%netbsd.org@localhost
Subject: Re: bin/57179 (occasional pkg_add core dumps)
Date: Mon, 11 Dec 2023 20:31:30 +0100
Ok, just did some debugging on this now.
To reproduce the issue it seems to be important that PKG_PATH is
actually set to
http://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/aarch64/10.0/All
A consequence of this is that this URL actually redirects to
http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/aarch64/10.0/All/
So we'll actually see two different hosts in the connection cache:
cdn.NetBSD.org and cdn.netbsd.org
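That the two spellings end up as separate cache entries suggests the
host names are compared byte-for-byte. A minimal sketch of the
difference, assuming nothing about libfetch beyond that; the helper
names same_host_exact and same_host_nocase are made up for this
example:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: byte-for-byte host comparison. */
static bool
same_host_exact(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return strcmp(a, b) == 0;
}

/* Hypothetical helper: case-insensitive comparison, as for DNS names. */
static bool
same_host_nocase(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return strcasecmp(a, b) == 0;
}
```

With the exact comparison, cdn.NetBSD.org and cdn.netbsd.org count as
different hosts; with the case-insensitive one they would share a
single per-host count.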
Source code I am referring to is
http://cvsweb.netbsd.org/bsdweb.cgi/src/external/bsd/fetch/dist/libfetch/common.c?annotate=1.4&only_with_tag=MAIN
First bug I noticed is in "fetch_cache_get", where last_conn is
initialized to NULL but never updated. This is probably just a
resource/memory leak, as we'll always take the "else" branch in
line 382 (and so drop every cached connection ahead of the match
from connection_cache).
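For comparison, a rough sketch of how the unlink in a lookup would
look with the trailing pointer actually advanced. This is not the
libfetch code: struct conn and cache_get are simplified stand-ins,
and only the host name is matched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for libfetch's conn_t; only the fields needed here. */
struct conn {
	const char *host;
	struct conn *next_cached;
};

/*
 * Remove and return the first cached connection for `host`, or NULL.
 * `last` trails `c` by one node so the predecessor can be relinked.
 */
static struct conn *
cache_get(struct conn **head, const char *host)
{
	struct conn *c, *last = NULL;

	assert(head != NULL && host != NULL);
	for (c = *head; c != NULL; last = c, c = c->next_cached) {
		if (strcmp(c->host, host) != 0)
			continue;
		if (last != NULL)
			last->next_cached = c->next_cached;	/* unlink mid-list */
		else
			*head = c->next_cached;			/* unlink at head */
		c->next_cached = NULL;
		return c;
	}
	return NULL;
}
```

Because "last" follows the iterator, a match in the middle of the list
leaves the connections in front of it cached instead of dropping them.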
But the main issue (the one that then results in the core dumps)
is in fetch_cache_put. There are actually two parts to it.
First part (the one that leads to the memory corruption) is that after
closing a connection in line 421, on the next iteration, "last" will
be set to that closed connection. So if we then also close the next
connection on that next iteration, the "next_cached" link in the list
isn't updated correctly (as we are updating the "next_cached" of the
closed connection). This then leads to the core dump on the next call
to "fetch_cache_put".
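The unlink problem can be sketched in isolation. In the simplified
loop below (not the libfetch code; struct conn, its evict flag,
conn_new and cache_evict are hypothetical), "last" only advances past
connections that stay in the list, so closing two neighbours in a row
still relinks a live node:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct conn {
	int id;
	bool evict;	/* hypothetical flag standing in for the limit checks */
	struct conn *next_cached;
};

/* Allocate a node in front of `next`; aborts if out of memory. */
static struct conn *
conn_new(int id, bool evict, struct conn *next)
{
	struct conn *c = malloc(sizeof(*c));

	if (c == NULL)
		abort();
	c->id = id;
	c->evict = evict;
	c->next_cached = next;
	return c;
}

/* Close (free) every flagged connection; returns how many were closed. */
static int
cache_evict(struct conn **head)
{
	struct conn *iter, *next, *last = NULL;
	int closed = 0;

	assert(head != NULL);
	for (iter = *head; iter != NULL; iter = next) {
		next = iter->next_cached;	/* read before iter is freed */
		if (!iter->evict) {
			last = iter;		/* only survivors become "last" */
			continue;
		}
		if (last != NULL)
			last->next_cached = next;
		else
			*head = next;
		free(iter);			/* stands in for closing */
		closed++;
	}
	return closed;
}
```

The saved "next" pointer also avoids reading from a connection after it
has been closed.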
Now the remaining issue is, why are we even closing two connections
from the connection_cache in one single call to fetch_cache_put? The
issue here is that once we reach the host_count limit, we don't reset
the "host_count" and ultimately close all remaining connections in
connection_cache (even if those connections are for different hosts
that haven't reached the host_count limit). In my case
connection_cache contained 4 connections for "cdn.NetBSD.org",
followed by a connection for "cdn.netbsd.org". When trying to put
another "cdn.NetBSD.org" connection into the cache, it realised that
the fourth connection in the cache is over the host limit, closed it,
and continued to the last "cdn.netbsd.org" connection. But as
host_count wasn't decremented, it then proceeded to also close that
"cdn.netbsd.org" connection (resulting in the linked-list corruption).
Hope that helps.