Subject: kern/22972: signal related problem
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <manu@netbsd.org>
List: netbsd-bugs
Date: 09/27/2003 10:11:15
>Number: 22972
>Category: kern
>Synopsis: signal related problem
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Sep 27 10:12:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator: Emmanuel Dreyfus
>Release: NetBSD-current/macppc
>Organization:
The NetBSD Project
>Environment:
Not at hand yet.
>Description:
I observed the problem with pkgsrc/mail/jchkmail. On NetBSD-1.6.1/macppc it works fine. On NetBSD-current/macppc, it forks into two threads (which is the normal behaviour) and then one of the two threads dies. In the log, you can find a "SUPERVISOR DIED?" message.
I have not been able to reduce the problem to a simple program yet, so here is the story with the jchkmail sources:
in src/j-main.c:709 is the following function call:
sleep (DT_ALARM);
If we add a syslog(LOG_DEBUG, "before sleep") and a syslog(LOG_DEBUG, "after sleep") around this sleep() call, we discover that after a short time, the program enters sleep() but never leaves it.
When this happens, the program has received a SIGALRM and entered j_father_sig_handler() in the same file, line 328. After this signal handler returns, ps -axl shows the program sleeping in "select", whereas we would expect "nanosleep".
In my opinion, execution resumed somewhere other than in sleep() after the signal handler returned. I see no code in the signal handler that could corrupt the stack: it only calls signal() and syslog() when it receives a SIGALRM.
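For illustration, the pattern described above can be sketched as the following C fragment. This is not the jchkmail code, and it has not been confirmed to reproduce the problem. The names sketch_sig_handler, sketch_sleep_loop and SKETCH_DT_ALARM are hypothetical, the value 2 is only a stand-in for DT_ALARM, and the use of alarm(1) to deliver SIGALRM during the sleep is an assumption (the report does not say how jchkmail arms its alarm).

```c
#include <signal.h>
#include <syslog.h>
#include <unistd.h>

/* Hypothetical stand-in for DT_ALARM; the real value lives in jchkmail. */
#define SKETCH_DT_ALARM 2

/* Counts how many SIGALRMs the handler has seen. */
volatile sig_atomic_t alarms_seen = 0;

/*
 * Does what the report says the real handler does on SIGALRM:
 * call signal() and syslog(). Note that syslog() is not
 * async-signal-safe; it is kept only to match the description.
 */
void sketch_sig_handler(int signo)
{
    signal(signo, sketch_sig_handler);
    syslog(LOG_DEBUG, "got signal %d", signo);
    alarms_seen++;
}

/*
 * Brackets sleep() with syslog() calls, as in the debugging step above.
 * Assumption: SIGALRM is made to arrive during the sleep via alarm(1).
 * Returns the number of iterations in which sleep() returned.
 */
int sketch_sleep_loop(int rounds)
{
    int completed = 0;
    int i;

    signal(SIGALRM, sketch_sig_handler);
    for (i = 0; i < rounds; i++) {
        alarm(1);                          /* fires while we are asleep */
        syslog(LOG_DEBUG, "before sleep");
        sleep(SKETCH_DT_ALARM);            /* expected to return early */
        syslog(LOG_DEBUG, "after sleep");
        completed++;
    }
    return completed;
}
```

Called from a main() as sketch_sleep_loop(n), the function should return n when sleep() comes back after every signal. A process that instead stays blocked after logging "before sleep" would match the symptom described here.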
>How-To-Repeat:
cd /usr/pkgsrc/mail/jchkmail
make install
/usr/pkg/etc/rc.d/jchkmail start
>Fix:
None known yet.
>Release-Note:
>Audit-Trail:
>Unformatted: