NetBSD-Bugs archive


Re: kern/58871: Stuck processes



The following reply was made to PR kern/58871; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Benny Siegert <bsiegert%gmail.com@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: kern/58871: Stuck processes
Date: Tue, 10 Dec 2024 20:56:26 +0000

 > Date: Tue, 10 Dec 2024 20:54:45 +0100
 > From: Benny Siegert <bsiegert%gmail.com@localhost>
 > 
 > This happened again, so I did some of the things you said at least:
 > 
 > On 03.12.24 at 21:03, Taylor R Campbell wrote:
 > > Can you start crash(8) and get output from `ps', `ps/w', and `show all
 > > tstiles'?
 
 Would really like `ps' and `show all tstiles' too next time!
 
 > Is it normal that ps/w prints output continuously until you press Ctrl+C?
 
 Is it just taking a long time or is it going in a loop?  If there's a
 lot of processes/threads and the serial console is slow or the CPUs
 are busy running other threads it might just take a long time.
 
 > > Can you start crash(8) and get stack traces from the processes not in RUN
 > > state, like the tstile one with `bt 0t18154'?
 
 Sorry, I meant `bt/t 0t18154'.
 
 > I tried looking at the "find" process in tstile (3770) and the 
 > "pipeline.test" process (2534) but got the following. Is this a bug in 
 > crash(8)?
 > 
 > According to htop, process 2534 was hogging 100% of one core. It looks 
 > like it was actually spinning on the CPU?
 > 
 > crash> bt/t 2534
 > trace: pid 9524 not found
 > crash> bt/t 3770
 > trace: pid 14192 not found
 
 You need to use `0t2534' for pid 2534 in decimal; otherwise the input
 is read in hexadecimal.  So these should have been:
 
 bt/t 0t2534
 bt/t 0t3770
 
 (Whether this is a bug in crash(8) is a matter of perspective...
 Certainly it is confusing that the input is interpreted as hexadecimal
 and the output is presented in decimal here!)
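 
 To see where 9524 and 14192 in the error messages came from, here is
 a small sketch of the radix rule described above.  It is only an
 illustration, not crash(8)'s actual parser: the function name
 `as_crash_reads' is made up, and it assumes nothing beyond "bare
 digits are read as hexadecimal, a `0t' prefix means decimal".

```python
def as_crash_reads(typed: str) -> int:
    """Model of the rule in this message (hypothetical helper, not crash(8) code).

    A bare number is parsed as hexadecimal; a leading `0t' forces decimal.
    """
    if typed.lower().startswith("0t"):
        return int(typed[2:], 10)
    return int(typed, 16)


# The two commands from the report, then the corrected forms.
for typed in ("2534", "3770", "0t2534", "0t3770"):
    print(f"bt/t {typed:<7} looks up pid {as_crash_reads(typed)}")
```

 The bare forms come out as pid 9524 and pid 14192, matching the "not
 found" messages quoted above; the `0t' forms give 2534 and 3770.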
 
 > Sorry, this is probably not the most helpful answer. When this 
 > invariably happens again, I will try the other debugging techniques.
 
 OK, thanks!
 
 > Would it be useful to run with a LOCKDEBUG kernel?
 
 Sure, that could help, but it'll run muuuuuch slower and might paper
 over the symptoms by having less concurrency.
 

