Subject: better signal delivery to threads
To: None <tech-kern@netbsd.org, tech-userlevel@netbsd.org>
From: Matthias Drochner <M.Drochner@fz-juelich.de>
List: tech-kern
Date: 08/22/2005 21:04:32
This is a multipart MIME message.

--==_Exmh_95003003205120
Content-Type: text/plain; charset=us-ascii


Hi -
I have found out why the Python-2.4 selftest fails (hangs);
this problem has made me stick with 2.3 as the default all this time.

Signals are delivered to the wrong thread.
There are 2 threads in the Python interpreter (one is
the main thread of the test script, the other apparently a
leftover of another script executed previously, or perhaps
related to logging) which don't block the signal in question,
and it happens reproducibly that the other one gets the signal
while the intended target is left in sigsuspend().

(Technically, this might not be the fault of NetBSD's libpthread.
Mixing threads with signals sent to the process is just a bad idea.
But in cases like script interpreters, or unsuspecting libraries
pulled in by threaded programs, we have to live with a mix
of these worlds.)
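
For comparison, one common way for an application to stay out of
this situation is to block the signal in every thread and collect it
synchronously with sigwait() in a single dedicated thread. The
following is only a minimal sketch of that idea, not what Python does;
struct waiter, waiter_main() and deliver_to_dedicated_thread() are
made-up names, and it assumes no other thread in the process has the
signal unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper type: the set to wait for and the result. */
struct waiter {
	sigset_t set;
	int got;
};

/* Thread body: synchronously collect one signal from the set. */
static void *
waiter_main(void *arg)
{
	struct waiter *w = arg;

	if (sigwait(&w->set, &w->got) != 0)
		w->got = -1;
	return NULL;
}

/*
 * Block signo, start a dedicated thread, send signo to the process
 * and return the signal number the dedicated thread received
 * (or -1 on error).
 */
int
deliver_to_dedicated_thread(int signo)
{
	struct waiter w;
	sigset_t old;
	pthread_t tid;

	sigemptyset(&w.set);
	sigaddset(&w.set, signo);
	w.got = 0;

	/* Block first: threads created later inherit this mask. */
	if (pthread_sigmask(SIG_BLOCK, &w.set, &old) != 0)
		return -1;
	if (pthread_create(&tid, NULL, waiter_main, &w) != 0) {
		pthread_sigmask(SIG_SETMASK, &old, NULL);
		return -1;
	}

	/*
	 * Process-directed signal. Every thread blocks it, so it
	 * stays pending until the sigwait() above accepts it.
	 */
	kill(getpid(), signo);

	pthread_join(tid, NULL);
	pthread_sigmask(SIG_SETMASK, &old, NULL);
	return w.got;
}
```

Calling deliver_to_dedicated_thread(SIGUSR1) from an otherwise
single-threaded program should hand SIGUSR1 to the waiter thread,
whichever thread the library would have picked otherwise.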

We could be smarter about selecting the thread to deliver a signal
to. Currently, libpthread takes a random thread not blocking the
signal, just preferring threads which are not blocked in a system
call (as I understand the code). The appended patch modifies this
to prefer threads which are in sigsuspend(). This makes the Python
selftest succeed. (It is just a proof of concept, probably not exact
enough.)
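
The preference order described above can be sketched as a small
standalone model. This is not the libpthread code: struct thr,
enum thr_state, thr_rank() and pick_target() are made-up names, and
the signal mask is reduced to a plain bit mask, which limits the
model to signals 1..31.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread states (not libpthread's). */
enum thr_state { THR_RUNNABLE, THR_IN_SYSCALL, THR_SIGSUSPENDED };

struct thr {
	int id;
	unsigned long blocked;	/* bit n set: signal n is blocked */
	enum thr_state state;
};

/* Rank a candidate: higher is better, 0 means not eligible. */
static int
thr_rank(const struct thr *t, int signo)
{
	assert(signo > 0 && signo < 32);

	if (t->blocked & (1UL << signo))
		return 0;		/* blocks the signal */
	switch (t->state) {
	case THR_SIGSUSPENDED:
		return 3;		/* waiting for a signal */
	case THR_RUNNABLE:
		return 2;		/* not blocked in a system call */
	case THR_IN_SYSCALL:
		return 1;		/* last resort */
	}
	return 0;
}

/* Return the preferred thread, or NULL if all threads block signo. */
const struct thr *
pick_target(const struct thr *thr, size_t n, int signo)
{
	const struct thr *best = NULL;
	int best_rank = 0;

	for (size_t i = 0; i < n; i++) {
		int r = thr_rank(&thr[i], signo);

		if (r > best_rank) {
			best = &thr[i];
			best_rank = r;
		}
		if (best_rank == 3)
			break;		/* cannot do better */
	}
	return best;
}
```

With one thread in a system call, one runnable and one in
sigsuspend(), none of them blocking the signal, pick_target() returns
the sigsuspend()ed one; before the patch the runnable one could have
been chosen just as well.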

Thinking about this, would there be anything wrong with waking
up multiple threads?
Only one of the threads waiting in sig*wait*() may be woken, because
it is specified so, but otherwise all blocked threads (modulo sigmask)
could be awoken.
There is a difference between "delivering a signal" and "waking
a thread", of course -- a signal handler is to be called only once.

In that case we should probably make sure that the threads
are not awoken before the signal handler has finished, to avoid
deadlock situations.
But what happens if a signal handler is left via longjmp()?

Can anyone provide some insight?

best regards
Matthias



--==_Exmh_95003003205120
Content-Type: text/plain ; name="thrsel.txt"; charset=us-ascii
Content-Description: thrsel.txt
Content-Disposition: attachment; filename="thrsel.txt"

Index: pthread_sig.c
===================================================================
RCS file: /cvsroot/src/lib/libpthread/pthread_sig.c,v
retrieving revision 1.41
diff -u -p -r1.41 pthread_sig.c
--- pthread_sig.c	26 Jul 2005 20:16:07 -0000	1.41
+++ pthread_sig.c	22 Aug 2005 15:33:05 -0000
@@ -693,9 +693,18 @@ pthread__signal(pthread_t self, pthread_
 			if (!__sigismember14(&target->pt_sigmask,
 			    si->si_signo)) {
 				if (target->pt_blockgen == target->pt_unblockgen) {
-					good = target;
-					/* Leave target locked */
-					break;
+					if (target->pt_state == PT_STATE_BLOCKED_QUEUE 
+					    && target->pt_sleepq == &pt_sigsuspended) {
+						if (good)
+							pthread_spinunlock(self, &good->pt_siglock);
+						good = target;
+						/* Leave target locked */
+						break;
+					} else if (good == NULL) {
+						good = target;
+						/* Leave target locked */
+						continue;
+					}
 				} else if (okay == NULL) {
 					okay = target;
 					/* Leave target locked */

--==_Exmh_95003003205120--