NetBSD-Bugs archive


re: kern/59339: heartbeat watchdog fires since 10.99.14



The following reply was made to PR kern/59339; it has been noted by GNATS.

From: matthew green <mrg%eterna23.net@localhost>
To: gnats-bugs%netbsd.org@localhost, prlw1%cam.ac.uk@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
    netbsd-bugs%netbsd.org@localhost
Subject: re: kern/59339: heartbeat watchdog fires since 10.99.14
Date: Tue, 22 Apr 2025 07:54:55 +1000

 > System panicked: cpu0: softints stuck for 16 seconds
 
 this means cpu0 is locked up, and some other cpu detected it and
 panicked.  the stack below is not from the interesting cpu, but you
 found the relevant LWPs to inspect:
 
 > crash> bt
 > end() at 0
 > kern_reboot() at kern_reboot+0x93
 > vpanic() at vpanic+0x16b
 > panic() at vprintf
 > heartbeat() at heartbeat+0x1f2
 > hardclock() at hardclock+0x9c
 > Xresume_lapic_ltimer() at Xresume_lapic_ltimer+0x1e
 > --- interrupt ---
 > mutex_spin_exit() at mutex_spin_exit+0x5a
 > callout_softclock() at callout_softclock+0xad
 > softint_dispatch() at softint_dispatch+0x8f
 > crash> ps
 > PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 > 2917 > 2917 7   0   8060000   ffff8052e4a14000                tar
 > 0    >    5 7   0       200   ffff8055abee1c00          softclk/0
 
 can you do "bt/a ffff8052e4a14000" and "bt/a ffff8055abee1c00"?
 
 or, for the other crash, do the same for any LWP on the reported cpu
 (always cpu0, i think?) that is in the ">" state like above (ie,
 running.)
 
 i expect the above will show that softclk/0 was fast-switched in on
 top of the tar process (ie, the softclk/0 bt may end up being the
 same as tar's with some additional frames.)  normally, there should
 only be one active LWP per cpu, but fast softints are the exception.
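 
 for anyone unfamiliar with the idea behind the "softints stuck"
 check, here is a rough toy model in C.  it is only a sketch and not
 the kernel's heartbeat code: the toy_* names are hypothetical, and
 the 16-tick threshold is just chosen to echo the panic message
 above.  each cpu bumps a counter whenever its softints run, and an
 observer declares it stuck once the counter has not moved for too
 many consecutive ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical threshold, in ticks; not taken from the kernel source */
#define TOY_MAX_STUCK_TICKS 16

/* toy per-cpu state; all names here are made up for illustration */
struct toy_cpu {
	uint64_t softint_beats;	/* bumped each time this cpu's softints run */
	uint64_t last_seen;	/* counter value the observer saw last time */
	unsigned stale_ticks;	/* consecutive ticks without progress */
};

/* called from the toy cpu's own soft interrupt handler */
static void
toy_beat(struct toy_cpu *c)
{
	assert(c != NULL);
	c->softint_beats++;
}

/* called by the observer once per tick; true means "softints stuck" */
static bool
toy_check(struct toy_cpu *c)
{
	assert(c != NULL);
	if (c->softint_beats != c->last_seen) {
		/* progress was made since the last check: reset */
		c->last_seen = c->softint_beats;
		c->stale_ticks = 0;
		return false;
	}
	return ++c->stale_ticks >= TOY_MAX_STUCK_TICKS;
}

/*
 * simulate a cpu whose softints run for the first `healthy` ticks and
 * then stop.  returns the tick at which the observer notices, or -1
 * if it never does within `total` ticks.
 */
static int
toy_simulate(unsigned healthy, unsigned total)
{
	struct toy_cpu c = { 0, 0, 0 };

	for (unsigned t = 0; t < total; t++) {
		if (t < healthy)
			toy_beat(&c);
		if (toy_check(&c))
			return (int)t;
	}
	return -1;
}
```

 with these assumptions, toy_simulate(5, 100) returns 20: the last
 beat is seen at tick 4, and the 16th stale tick is tick 20.  a cpu
 that keeps beating for the whole run is never flagged.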
 
 thanks.
 
 
 .mrg.
 

