NetBSD-Bugs archive


Re: kern/58871: Stuck processes



The following reply was made to PR kern/58871; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Benny Siegert <bsiegert%gmail.com@localhost>
Cc: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost,
	gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/58871: Stuck processes
Date: Sat, 4 Jan 2025 04:18:11 +0000

 > Date: Tue, 31 Dec 2024 21:50:35 +0100
 > From: Benny Siegert <bsiegert%gmail.com@localhost>
 > 
 > Am 28.12.24 um 16:13 schrieb Taylor R Campbell:
 > > Just to be absolutely sure, can you send output of the following on
 > > your kernel?
 > > 
 > > $ ident /netbsd | grep -F cpuswitch.S
 > 
 > This is the NetBSD 10.1 release, so:
 > 
 >       $NetBSD: cpuswitch.S,v 1.105.12.1 2023/07/31 13:36:32 martin Exp $
 > 
 > > And can you share the disassembly of the function cpu_switchto in your
 > > kernel, with objdump or gdb?
 > 
 > (gdb) disassemble cpu_switchto
 > Dump of assembler code for function cpu_switchto:
 > [...]
 >     0x8009eaac <+48>:    mcr     15, 0, r6, cr13, cr0, {4}
 >     0x8009eab0 <+52>:    dmb     sy
 >     0x8009eab4 <+56>:    str     r6, [r5, #896]  ; 0x380
 >     0x8009eab8 <+60>:    dmb     sy
 
 So this has the barriers I added in pullup-10 #264 for PR kern/57240:
 Missing store-before-load barriers in cpu_switchto
 <https://gnats.NetBSD.org/57240>.  (Actually, the store-before-load
 barrier (second dmb above) was already there; what was missing was the
 store-before-store barrier (first dmb above).  In any case, both are
 there now, in cpu_switchto and in softint_switch.)  That rules out the
 hypothesis of a missing barrier for arm32 locks.
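 
 For readers following along, here is a rough C11 analogy of what the
 two dmb instructions around the ci_curlwp store are for.  It is only a
 sketch: the struct layouts and the name publish_curlwp are made up for
 illustration, and C11 fences are not an exact match for dmb.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct lwp { int l_id; };
struct cpu_info { _Atomic(struct lwp *) ci_curlwp; };

/* Publish `next' as the thread running on `ci'. */
static void
publish_curlwp(struct cpu_info *ci, struct lwp *next)
{
	/*
	 * Store-before-store (first dmb): everything the old thread
	 * stored must be visible before the new curlwp is.
	 */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&ci->ci_curlwp, next, memory_order_relaxed);

	/*
	 * Store-before-load (second dmb): the new curlwp must be
	 * visible before the new thread loads anything, such as a
	 * lock's waiters bit.
	 */
	atomic_thread_fence(memory_order_seq_cst);
}
```

 If either fence were missing, another CPU examining a lock owner and
 ci_curlwp could see them in an order that never happened on this CPU.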
 
 Now it occurs to me that the lock in question is a reader/writer lock,
 not a mutex.  So probably what's going on here is that some thread
 holds (or leaked) a reader lock on the vnode, which doesn't show up in
 `show all tstiles' -- tracking a list of readers is costly (I'm not
 even sure whether LOCKDEBUG does that).  But maybe we can make a guess
 by working out which file find is stuck on from its syscall arguments.
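 
 To illustrate why a read hold is anonymous, here is a minimal sketch
 of a one-word rwlock state.  It assumes a layout only loosely modeled
 on a typical kernel rwlock; the flag values and helper names are
 hypothetical, not the real kern_rwlock.c definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low bits are flags, the rest is payload. */
#define RW_WRITE_LOCKED	((uintptr_t)0x04)
#define RW_FLAGMASK	((uintptr_t)0x1f)
#define RW_READ_INCR	((uintptr_t)0x20)	/* one reader */

/* Write hold: payload is the (32-byte aligned) owner thread address. */
static uintptr_t
rw_word_write_held(uintptr_t owner)
{
	return owner | RW_WRITE_LOCKED;
}

/* Read hold: payload is only a count; no thread identity is stored. */
static uintptr_t
rw_word_read_enter(uintptr_t word)
{
	return word + RW_READ_INCR;
}

/* A debugger can name the holder only when the lock is write-held. */
static bool
rw_writer(uintptr_t word, uintptr_t *owner)
{
	if ((word & RW_WRITE_LOCKED) == 0)
		return false;
	*owner = word & ~RW_FLAGMASK;
	return true;
}

static uintptr_t
rw_reader_count(uintptr_t word)
{
	if (word & RW_WRITE_LOCKED)
		return 0;
	return (word & ~RW_FLAGMASK) / RW_READ_INCR;
}
```

 With a word like this, a leaked read hold leaves only a nonzero count
 behind, so the holder has to be inferred some other way.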
 
 > > Also, if you have netbsd.gdb for the kernel, can you force a crash
 > > dump (enter ddb and run `sync'), so we can see the arguments and
 > > locals in the stack trace?  Curious to see what file find is stuck on.
 > 
 > Is netbsd.gdb part of the release? I haven't found it on cdn.netbsd.org. 
 > The "debug" set contains a file /usr/libdata/debug/netbsd-GENERIC.debug, 
 > but that's all I have.
 
 That's the debug data part of netbsd.gdb; combined with the netbsd
 kernel itself, it is good enough for a debugger.
 

