Subject: Re: lib/35969 (ghc-6.4.2 from pkgsrc fails to compile)
To: None <ad@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,>
From: Andrew Doran <ad@NetBSD.org>
List: netbsd-bugs
Date: 03/11/2007 18:45:02
The following reply was made to PR lib/35969; it has been noted by GNATS.
From: Andrew Doran <ad@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/35969 (ghc-6.4.2 from pkgsrc fails to compile)
Date: Sun, 11 Mar 2007 18:43:57 +0000
  PID LID S FLAGS STRUCT LWP * UAREA *    WAIT
12719   3 3  0x84 0xcc1aee20   0xcc118ce0 parked
        2 3  0x84 0xcc1ae820   0xcc144ce0 select
        1 3  0x84 0xcc1b8000   0xcc1e8ce0 parked
So there are two parked threads with no pending wakeup (LW_UNPARKED),
which is expected. User-level backtraces from gcore:
#0 0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
#1 0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
#2 0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
#3 0x08f6bacc in waitCondition ()
#4 0x0931500c in ?? ()
#5 0x093058c8 in cached_trec_headers ()
#6 0x00000000 in ?? ()

#0 0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
#1 0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
#2 0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
#3 0x08f6bacc in waitCondition ()

#0 0xbbad3fbf in select () from /usr/lib/libc.so.12
#1 0xbbb9a585 in select () from /usr/lib/libpthread.so.0
#2 0x08df41d1 in s5hN_ret ()
#3 0x08f707b9 in StgRun ()
Looking at thread 0:
(gdb) frame 1
#1 0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
(gdb) frame 2
#2 0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
(gdb) info frame
Stack level 2, frame at 0xbfbfd8d0:
eip = 0xbbb9b45d in pthread_cond_wait; saved eip 0x8f6bacc
called by frame at 0xbfbfd8d4, caller of frame at 0xbfbfd890
Arglist at 0xbfbfd8c8, args:
Locals at 0xbfbfd8c8, Previous frame's sp is 0xbfbfd8d0
Saved registers:
ebx at 0xbfbfd8bc, ebp at 0xbfbfd8c8, esi at 0xbfbfd8c0, edi at 0xbfbfd8c4, eip at 0xbfbfd8cc
Arguments:
(gdb) x/20a 0xbfbfd8c8
0xbfbfd8c8: 0x9315000 0x8f6bacc <waitCondition+16> 0x931500c 0x93058c8 <sched_mutex>
0xbfbfd8d8: 0x0 0x0 0x0 0x93058cc <sched_mutex+4>
0xbfbfd8e8: 0x10000 0x8f64c90 <waitForCapability+29> 0x931500c 0x93058c8 <sched_mutex>
0xbfbfd8f8: 0x0 0x9306208 <stg_END_TSO_QUEUE_closure> 0xbaab5000 0x9306208 <stg_END_TSO_QUEUE_closure>
0xbfbfd908: 0xbaef0038 0x8f6e8a6 <schedule+84> 0x93058c8 <sched_mutex> 0xbfbfd958
First arg is a condition variable (magic 0x55550005):
(gdb) x/20a 0x931500c
0x931500c: 0x55550005 0x0 0xbc000000 0xbc000034
0x931501c: 0x93058c8 <sched_mutex> 0x0 0x0 0x0
0x931502c: 0x0 0x0 0x0 0x0
0x931503c: 0x0 0x0 0x0 0x0
0x931504c: 0x0 0x0 0x0 0x0
And the pthread noted in it (magic 0x11110001):
(gdb) x/50a 0xbc000000
0xbc000000: 0x11110001 0x0 0x1 0x1
0xbc000010: 0x0 0x0 0x0 0x0
0xbc000020: 0x0 0x1 0x19 0x0
^ pt_sleeponq
0xbc000030: 0xb400002c 0x0 0x9315014 0x9315014
^ pt_sleepobj ^ pt_sleepq
It appears to be happily asleep and has not been awoken, so nothing is
wrong here. The mutex is noted in the CV, so no wakeup has occurred.
Looking at thread 2:
(gdb) thread 2
[Switching to thread 2 (process 209327)]#0 0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
(gdb) info frame
Stack level 1, frame at 0xb3fffec8:
eip = 0xbbb9befa in pthread__park; saved eip 0xbbb9b45d
called by frame at 0xb3ffff08, caller of frame at 0xb3fffe98
Arglist at 0xb3fffec0, args:
Locals at 0xb3fffec0, Previous frame's sp is 0xb3fffec8
Saved registers:
ebx at 0xb3fffeb4, ebp at 0xb3fffec0, esi at 0xb3fffeb8, edi at 0xb3fffebc, eip at 0xb3fffec4
(gdb) x/10a 0xb3fffec0
0xb3fffec0: 0xb3ffff00 0xbbb9b45d <pthread_cond_wait+249> 0xb0000000 0x9304ed0 <thread_ready_cond+4>
0xb3fffed0: 0x9304ed4 <thread_ready_cond+8> 0x0 0x1 0x9304ed0 <thread_ready_cond+4>
Different CV; however it seems that no thread is asleep on its queue
in this case.
(gdb) x/10a 0x9304ecc
0x9304ecc <thread_ready_cond>: 0x55550005 0x0 0x0 0x9304ed4 <thread_ready_cond+8>
0x9304edc <thread_ready_cond+16>: 0x0 0x0 0x16 0x0
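For anyone following along, the way the two CV dumps are being read can be
sketched in C. This is only an illustration: the struct and function names
below are made up for this note and are not libpthread's own declarations,
and the layout is simply the five 32-bit words as they appear in the i386
dumps (magic, lock, first waiter, tail pointer, mutex).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the first five words of a dumped CV (i386). */
struct cv_words {
    uint32_t magic;  /* 0x55550005 in both dumps */
    uint32_t lock;   /* lock word, 0 in both dumps */
    uint32_t first;  /* first waiting pthread, 0 if none */
    uint32_t last;   /* address of the last "next" link in the queue */
    uint32_t mutex;  /* mutex noted while the CV has waiters */
};

#define CV_MAGIC       0x55550005u
#define CV_FIRST_OFFS  8u  /* byte offset of "first" within the CV */

/* True if the dumped words carry the condition variable magic. */
bool cv_valid(const struct cv_words *cv)
{
    return cv->magic == CV_MAGIC;
}

/*
 * An empty tail queue has no first element and its tail pointer refers
 * back to the head's own "first" slot. addr is the address of the CV in
 * the dumped process.
 */
bool cv_queue_empty(const struct cv_words *cv, uint32_t addr)
{
    return cv->first == 0 && cv->last == addr + CV_FIRST_OFFS;
}
```

Fed the words from 0x931500c, this reports a non-empty queue with
sched_mutex noted; fed the words from thread_ready_cond, it reports an
empty queue and a NULL mutex.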
Find the thread from the locals and see what it is doing:
(gdb) x/10a 0xb3fffec0
0xb3fffec0: 0xb3ffff00 0xbbb9b45d <pthread_cond_wait+249> 0xb0000000 0x9304ed0 <thread_ready_cond+4>
0xb3fffed0: 0x9304ed4 <thread_ready_cond+8> 0x0 0x1 0x9304ed0 <thread_ready_cond+4>
0xb3fffee0: 0x0 0xbbb9b372 <pthread_cond_wait+14>
Here's the thread:
(gdb) x/50a 0xb0000000
0xb0000000: 0x11110001 0x2 0x3 0x1
0xb0000010: 0x0 0x1 0x0 0x0
0xb0000020: 0x0 0x0 0x0 0xb4000000
^ pt_sleeponq
0xb0000030: 0xbbb9fca8 <pthread__allqueue> 0x0 0x9304ed4 <thread_ready_cond+8> 0x0
^ pt_sleepobj ^ pt_sleepq
The thread is not sleeping and is not on a sleep queue, yet it's parked.
pt_sleepq indicates that it was last waiting on the CV from the stack
trace. From the CV above, the mutex pointer is NULL, meaning that a
wakeup has occurred: there are no more waiters on the CV, so the mutex
pointer has been cleared. So there is some synchronization failure
occurring between removing the LWP from its sleep queue and unparking it.
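
The reasoning applied to the two parked threads can be written down as a
small check. This is a sketch of the diagnosis only, not code from
libpthread or the kernel; the type and field names are invented here and
stand for facts read off the kernel listing and the pthread dumps.

```c
#include <stdbool.h>

/* Hypothetical summary of what the dumps say about one LWP. */
struct lwp_view {
    bool parked;          /* kernel shows the LWP waiting in _lwp_park() */
    bool unpark_pending;  /* LW_UNPARKED is set for the LWP */
    bool on_sleepq;       /* pt_sleeponq is non-zero in the pthread */
};

enum verdict {
    NOT_STUCK,    /* not parked, or a wakeup is already pending */
    ASLEEP_OK,    /* parked and still queued: a waker can find it */
    LOST_WAKEUP   /* parked, but already taken off its sleep queue */
};

/* Classify one LWP from the facts above. */
enum verdict classify(const struct lwp_view *v)
{
    if (!v->parked || v->unpark_pending)
        return NOT_STUCK;
    if (v->on_sleepq)
        return ASLEEP_OK;
    /* Dequeued by a waker, yet never unparked: nobody will wake it now. */
    return LOST_WAKEUP;
}
```

With the values seen here, the first thread (parked, no pending unpark,
on the queue) comes out as ASLEEP_OK and the second (parked, no pending
unpark, not on any queue) as LOST_WAKEUP.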