Should we use _malloc_prefork and _malloc_postfork?

To: tech-userlevel%NetBSD.org@localhost
Subject: Should we use _malloc_prefork and _malloc_postfork?
From: Beverly Schwartz <bschwart%bbn.com@localhost>
Date: Mon, 7 Jan 2013 15:25:43 -0500

We had a problem with a Python2.6 program running on NetBSD-6
where the main application would fork() and then exec() some
other NetBSD program.  We would occassionally observe the
child process hang.  Running on a single core, this would
occur maybe once a week.  When run on multi-core, this would
occur once every few hours.  Our Python program is multi-threaded
(about 15 threads), and we use Python 2.6 built from pkgsrc,
so it is using native pthreads.

After a lot of debugging I traced it down to an
interaction between pthreads, the malloc library, and
the malloc library's use of semaphores and locks and the
rules for what can and can not be done after forking.

The IEEE Std 1003.1 Specification says that only async-signal-safe
calls may be called in an after-fork child before exec is called.
http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html
The number of async-signal-safe calls are few, and malloc and free
are not included in this list.
https://www.securecoding.cert.org/confluence/display/seccode/SIG30-C.+Call+only+asynchronous-safe+functions+within+signal+handlers

In practice, this requirement is only necessary if the program
is threaded, and if there is more than 1 thread running.  Thus,
daemonization (using double fork) at the beginning of a program
before it launches its threads is okay, even though it is technically
not compliant with the spec.

Python does *not* register handlers with pthread_atfork, but it
does call its own function PyOS_AfterFork after calling fork.
This is in the C code that implements Python, so calling fork in
Python (or anything that calls fork such as the functions in the
subprocess module) guarantees a call to PyOS_AfterFork.
PyOS_AfterFork calls functions which are not async-signal-safe.
This is true for both Python2.* and Python3.*.

Visual inspection of PyOS_AfterFork uncovered a number of calls
to functions that are *not* async-signal-safe.  However, the
vast majority of the problem calls are to malloc and free.

When a child is forked, it gets a copy of the executing thread
only.  If another thread happens to hold a mutex in the malloc
module, and then a child tries to do a malloc or free, the child
will hang, because the thread which holds the mutex will never
come alive and free it.

I have attached a simple non-compliant program to illustrate the
problem with malloc and free. In this program, there are threads
(four by default, configurabale via command line) which just call
malloc and free in a loop. The main thread is a loop which forks a
child and waits for it.  The child does a single malloc and then exits.
The parent process will print the pid of the child forked, and then
print the pid and status after waiting for the child. Without the
changes recommended below, the program will eventually hang while
waiting for a child.  When this happens, if the child is subsequently
killed with a signal, the wait call in the parent will return a
non-zero status, and execution of the parent will resume.

A simple workaround is to take all the malloc module locks before
calling fork, then release them after the fork. This guarantees that
the locks are held in the executing thread, and thus the child will
be able to release them. This is what the _malloc_prefork and
_malloc_postfork functions do.

In both malloc.c and jemalloc.c, there are _malloc_prefork and
_malloc_postfork functions defined, but they are unused. They are
also not documented, and I am guessing not intended for general use.

Taking the malloc module locks must be the last thing done before
calling fork, and releasing the locks must be the first thing done
after calling fork. This is so other functions registered with
pthread_atfork can use malloc and free in their functions. This can
be guaranteed by registering _malloc_prefork and _malloc_postfork in
pthread_init. pthread_init is called before main, so no user program
can possibly get a pthread_atfork handler registered before the ones
in pthread_init. Here is the diff for pthread.c:

diff --git a/netbsd/src/lib/libpthread/pthread.c 
b/netbsd/src/lib/libpthread/pthread.c
index 4880957..6ba7a00 100644
--- a/netbsd/src/lib/libpthread/pthread.c
+++ b/netbsd/src/lib/libpthread/pthread.c
@@ -55,6 +55,9 @@ __RCSID("$NetBSD: pthread.c,v 1.125.4.1 2012/05/07 03:12:33 
riz Exp $");
 #include "pthread.h"
 #include "pthread_int.h"
 
+/* Need to use libc-private names for malloc pre/post fork operations. */
+#include "../libc/include/extern.h"
+
 pthread_rwlock_t pthread__alltree_lock = PTHREAD_RWLOCK_INITIALIZER;
 RB_HEAD(__pthread__alltree, __pthread_st) pthread__alltree;
 
@@ -230,6 +233,7 @@ pthread__init(void)
        }
 
        /* Tell libc that we're here and it should role-play accordingly. */
+       pthread_atfork(_malloc_prefork, _malloc_postfork, _malloc_postfork);
        pthread_atfork(NULL, NULL, pthread__fork_callback);
        __isthreaded = 1;
 }


I did find one subtle bug in the _malloc_prefork and
_malloc_postfork implementation in jemalloc.c. In both the prefork
and postfork functions, the arenas_mtx is taken and released. I am
wondering why it isn't just held in the prefork call to be released
in the postfork call. Here's the diff for jemalloc.c:

diff --git a/netbsd/src/lib/libc/stdlib/jemalloc.c 
b/netbsd/src/lib/libc/stdlib/jemalloc.c
index 81dbf45..7905d9d 100644
--- a/netbsd/src/lib/libc/stdlib/jemalloc.c
+++ b/netbsd/src/lib/libc/stdlib/jemalloc.c
@@ -3915,7 +3915,6 @@ _malloc_prefork(void)
                if (arenas[i] != NULL)
                        malloc_mutex_lock(&arenas[i]->mtx);
        }
-       malloc_mutex_unlock(&arenas_mtx);
 
        malloc_mutex_lock(&base_mtx);
 
@@ -3933,7 +3932,6 @@ _malloc_postfork(void)
 
        malloc_mutex_unlock(&base_mtx);
 
-       malloc_mutex_lock(&arenas_mtx);
        for (i = 0; i < narenas; i++) {
                if (arenas[i] != NULL)
                        malloc_mutex_unlock(&arenas[i]->mtx);


Has anyone had experience using _malloc_prefork and
_malloc_postfork? Is there any reason why they are not registered in
pthread_init?  While the Posix specification doesn't require that we
add these handlers, in practice, they are very useful because there
is so much non-compliant software out there. (Python is only one
example; I have found others.)  It's a simple and reasonable change
that avoids the vast majority of bugs caused by non-compliant programs.

I have tried running my non-compliant program with the modifications
described above. My program has not hung once.

My non-compliant program runs correctly on my Mac (OS X 10.8.2) and on
Fedora 15 Linux installed on an x86 host.  I haven't looked into whether
that's because of an explicit locking strategy or not.

The key hard question is whether a tiny performance hit for fork is a
good trade for making buggy programs work better.  Given that fork is
relatively rare, and many uses of fork are likely to be buggy, it seems
like this change would be a net win.

Bev Schwartz
BBN Technologies
bschwart%bbn.com@localhost

Attachment: thread_fork.c
Description: Binary data

Follow-Ups:
- Re: Should we use _malloc_prefork and _malloc_postfork?
  - From: Christos Zoulas

Prev by Date: Re: fsck SIGINFO
Next by Date: Re: Should we use _malloc_prefork and _malloc_postfork?
Previous by Thread: the zen of SIGPIPE
Next by Thread: Re: Should we use _malloc_prefork and _malloc_postfork?
Indexes:

Home | Main Index | Thread Index | Old Index