Actually it happened that modifiying pthread_atfork() to stop malloc()ing is enough to address the problem. I have landed the changes and removed '#if 0' kludge. Thanks! On 01.02.2020 13:59, Kamil Rytarowski wrote: > On 31.01.2020 22:10, Andrew Doran wrote: >> On Fri, Jan 31, 2020 at 06:55:00PM -0000, Christos Zoulas wrote: >> >>> In article <724af477-010b-9ddf-6ece-e23d7cf59079%gmx.com@localhost>, >>> Kamil Rytarowski <n54%gmx.com@localhost> wrote: >>>> -=-=-=-=-=- >>>> -=-=-=-=-=- >>>> >>>> On 31.01.2020 03:38, Christos Zoulas wrote: >>>>> And it is fixed now. >>>>> >>>>> christos >>>>> >>>> >>>> OK. I am going to submit a bug report upstream and get some feedback >>>> what is the way forward here, delaying initialization. >>> >>> I think that the way forward (on our side) is to do away with libpthread, >>> merge it with libc and kill all the stub nonsense. >> >> Agreed. >> >> pthread__init() does some expensive stuff like _lwp_ctl(). I think we can >> safely & without hacks defer a lot of that till the first pthread_create(). >> >> Andrew >> > > This libc-libpthread split/merge is a red herring. > > The problem here is with a mutual dependencies between POSIX threads > library and malloc library. > > I did some investigation and here are my findings: > > 1. jemalloc abuses initialization and initializes self very early, with > a constructor: > > /* > * If an application creates a thread before doing any allocation in the > main > * thread, then calls fork(2) in the main thread followed by memory > allocation > * in the child process, a race can occur that results in deadlock > within the > * child: the main thread may have forked while the created thread had > * partially initialized the allocator. Ordinarily jemalloc prevents > * fork/malloc races via the following functions it registers during > * initialization using pthread_atfork(), but of course that does no good if > * the allocator isn't fully initialized at fork time. The following > library > * constructor is a partial solution to this problem. It may still be > possible > * to trigger the deadlock described above, but doing so would involve > forking > * via a library constructor that runs before jemalloc's runs. > */ > #ifndef JEMALLOC_JET > JEMALLOC_ATTR(constructor) > static void > > jemalloc_constructor(void) { > malloc_init(); > } > #endif > > Relevant commit: > > commit 20f1fc95adb35ea63dc61f47f2b0ffbd37d39f32 > Author: Jason Evans <je%fb.com@localhost> > Date: Tue Oct 9 14:46:22 2012 -0700 > > Fix fork(2)-related deadlocks. > > Add a library constructor for jemalloc that initializes the allocator. > This fixes a race that could occur if threads were created by the main > thread prior to any memory allocation, followed by fork(2), and then > memory allocation in the child process. > > Fix the prefork/postfork functions to acquire/release the ctl, prof, and > rtree mutexes. This fixes various fork() child process deadlocks, but > one possible deadlock remains (intentionally) unaddressed: prof > backtracing can acquire runtime library mutexes, so deadlock is still > possible if heap profiling is enabled during fork(). This deadlock is > known to be a real issue in at least the case of libgcc-based > backtracing. > > Reported by tfengjun. > > 2. FreeBSD added a hack and an internal pthread_mutex_init() version > called: _pthread_mutex_init_calloc_cb().. it passes a callback pointer > to jemalloc's tiny calloc(). > > This is very ugly and I consider it as the wrong way of boostraping malloc. > > 3. There is a problem inside libpthread. It as designed to not malloc() > early to not trigger malloc initialization, however it was broken as it > calls at_fork functions: > > #0 malloc (size=size@entry=16) at > /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2052 > #1 0x0000776b71d71a44 in af_alloc () at > /usr/src/lib/libc/gen/pthread_atfork.c:80 > #2 af_alloc () at /usr/src/lib/libc/gen/pthread_atfork.c:74 > #3 _pthread_atfork (prepare=prepare@entry=0x0, parent=parent@entry=0x0, > child=child@entry=0x776b7220b3e5 <pthread__fork_callback>) > at /usr/src/lib/libc/gen/pthread_atfork.c:121 > #4 0x0000776b7220cacd in pthread__init () at > /usr/src/lib/libpthread/pthread.c:260 > #5 0x0000776b71d7a585 in _libc_init () at > /usr/src/lib/libc/misc/initfini.c:128 > > These at_fork routines caused also issues with false-positives in Leak > Sanitizer. I had to pacify the sanitizer and disable tracking of its > allocations. > > > This patch removes '#if 0' hack from src/lib/libpthread and switches > at_fork to mmap()+munmap(). > > http://netbsd.org/~kamil/patch-00219-libpthread-libc-jemalloc.txt > > This test disabled the constructor hack: > > http://netbsd.org/~kamil/patch-00220-jemalloc-disable-constructor.txt > > With these changes everything seems to work. > > In order to avoid the FreeBSD specific hack with the constructor and > initialize jemalloc always during libc bootstrap I propose the following > approach: > > - add __libc_malloc_init() call in _libc_init() > - redirect __libc_malloc_init() to jemalloc__init() with jemalloc > - otherwise redirect to an empty stub > > Here is a patch that does everything and works fine for me. > > http://netbsd.org/~kamil/patch-00222-jemalloc-enhancements.txt > > There are no longer jemalloc calls before being ready and jemalloc is > still initialized always but in its proper time. > > (gdb) r > The program being debugged has been started already. > Start it from the beginning? (y or n) y > Starting program: /tmp/a.out > > Breakpoint 2, jemalloc__init () at > /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:3209 > 3209 malloc_init(); > (gdb) c > Continuing. > > Breakpoint 3, malloc_init_hard () at > /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1533 > 1533 malloc_init_hard(void) { > (gdb) bt > #0 malloc_init_hard () at > /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1533 > #1 0x00007f7ff763f9c4 in ?? () from /usr/lib/libc.so.12 > #2 0x00007f7ff7ef9400 in ?? () > #3 0x00007f7ff763ac99 in _init () from /usr/lib/libc.so.12 > #4 0x0000000000000000 in ?? () >
Attachment:
signature.asc
Description: OpenPGP digital signature