NetBSD-Bugs archive
Re: kern/58871: Stuck processes
The following reply was made to PR kern/58871; it has been noted by GNATS.
From: Benny Siegert <bsiegert%gmail.com@localhost>
To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: kern/58871: Stuck processes
Date: Tue, 10 Dec 2024 20:54:45 +0100
This happened again, so I tried at least some of the things you suggested:
On 03.12.24 at 21:03, Taylor R Campbell wrote:
> Can you start crash(8) and get output from `ps', `ps/w', and `show all
> tstiles'?
Crash version 10.0_STABLE, image version 10.0_STABLE.
Kernel compiled without options LOCKDEBUG.
Output from a running system is unreliable.
crash> ps/w
PID LID COMMAND EMUL PRI WAIT-MSG WAIT-CHANNEL
21931>21931 crash netbsd 37 0
24382 24382 sh netbsd 36 wait 9262b88c
15343>15343 screen netbsd 37 0
10543 10543 screen netbsd 38 pause 92c32c80
8677 8677 sh netbsd 35 wait 957b854c
3770 3770 find netbsd 26 tstile 92b31f80
28839 28839 postdrop netbsd 43 netio 929e5150
21098 21098 sendmail netbsd 43 pipe_rd 92d0c814
13750 13750 tee netbsd 43 pipe_rd 91d21df4
7775 7775 sh netbsd 42 wait 95b6bacc
24218 24218 sh netbsd 43 wait 9170ea4c
28302 28302 cron netbsd 43 pipe_rd 92d0c8bc
2374 2374 go netbsd 40 tstile 95430f80
8089 > 8089 go netbsd 25 0
2534 17497 pipeline.test netbsd 43 lwpwait 92a53a54
2534 28139 pipeline.test netbsd 43 0
24155 24155 sh netbsd 43 0
24215 18770 go netbsd 43 parked 95594c80
24215 8412 go netbsd 43 parked 92bdb3c0
24215 24883 go netbsd 43 parked 92c40a00
24215 8759 go netbsd 43 parked 9519e940
24215 19943 go netbsd 40 pipe_rd 9558aabc
24215 23618 go netbsd 43 wait 91ec930c
24215 9006 go netbsd 40 parked 92bdb680
24215 5455 go netbsd 41 wait 91ec930c
24215 26737 go netbsd 43 kqueue 95776eb8
24215 13345 go netbsd 43 parked 92d18d00
24215 11347 go netbsd 42 parked 9519d900
24215 7163 go netbsd 43 parked 92a678c0
24215 24215 go netbsd 38 parked 95531040
4239 12336 result_adapter netbsd 43 parked 955de1c0
4239 26804 result_adapter netbsd 43 kqueue 94980678
4239 7242 result_adapter netbsd 43 parked 92b5d900
4239 10307 result_adapter netbsd 42 parked 91f28780
4239 10406 result_adapter netbsd 43 parked 92b5d380
4239 20061 result_adapter netbsd 43 parked 91d153c0
4239 4430 result_adapter netbsd 43 parked 925377c0
4239 19249 result_adapter netbsd 43 wait 955d40cc
4239 2513 result_adapter netbsd 43 parked 951b36c0
4239 4239 result_adapter netbsd 42 parked 95538b80
2274 27130 rdb netbsd 43 parked 9557ec40
2274 27888 rdb netbsd 43 0
2274 18470 rdb netbsd 43 parked 955949c0
2274 2725 rdb netbsd 43 parked 9557bc00
2274 22316 rdb netbsd 43 parked 95538340
2274 9005 rdb netbsd 43 wait 955d484c
2274 8849 rdb netbsd 43 parked 9519d0c0
2274 22878 rdb netbsd 43 parked 95461180
2274 15929 rdb netbsd 43 0
2274 2274 rdb netbsd 43 0
19250 13439 bbagent netbsd 43 0
19250 6648 bbagent netbsd 43 0
19250 15191 bbagent netbsd 43 lwpwait 95b6b0d4
19250 10220 bbagent netbsd 43 0
20545 20545 python3.11 netbsd 43 wait 957b804c
3732 1996 python3.11 netbsd 43 lwpwait 9170e554
3732 3732 python3.11 netbsd 43 0
3053 3053 getty netbsd 39 ttyraw 915e0c28
3187 3187 getty netbsd 39 ttyraw 915e0a28
3497 3497 getty netbsd 39 ttyraw 915e0828
2344 2344 login netbsd 42 wait 9170e2cc
662 662 cron netbsd 43 nanoslp 929b1300
658 658 estd netbsd 43 nanoslp 92537a80
648 648 inetd netbsd 40 kqueue 9295dab8
2892 2979 node_exporter netbsd 41 parked 929b1b40
2892 671 node_exporter netbsd 43 parked 91f284c0
2892 669 node_exporter netbsd 43 parked 929b1880
2892 668 node_exporter netbsd 43 kqueue 92a09d38
2892 664 node_exporter netbsd 43 parked 929b15c0
2892 2892 node_exporter netbsd 43 parked 918bb300
2853 2853 qmgr netbsd 43 kqueue 9295d1b8
2859 2859 master netbsd 43 0
333 29950 python3.11 netbsd 43 parked 91808d00
333 333 python3.11 netbsd 43 wait 91d1a2cc
411 411 sshd netbsd 43 0
394 394 ntpd netbsd 43 pause 91f28200
2358 328 bootstrapswarm netbsd 43 parked 91f0f480
2358 396 bootstrapswarm netbsd 43 parked 91d15100
2358 389 bootstrapswarm netbsd 40 parked 91f0f740
2358 385 bootstrapswarm netbsd 42 parked 91f0f1c0
2358 384 bootstrapswarm netbsd 41 wait 91ec908c
2358 1960 bootstrapswarm netbsd 43 0
2358 2641 bootstrapswarm netbsd 43 nanoslp 91ee4700
2358 2358 bootstrapswarm netbsd 43 parked 91ec16c0
1998 1998 multilog netbsd 43 pipe_rd 91d212cc
2674 2674 multilog netbsd 41 pipe_rd 91d21224
1501 1501 sh netbsd 40 wait 918ecacc
2339 2339 sh netbsd 41 wait 91d1aa4c
2235 2235 sh netbsd 41 wait 91d1a54c
2100 2100 supervise netbsd 43 poll 915d0280
2128 2128 supervise netbsd 43 0
2238 2238 supervise netbsd 43 poll 915d0280
1952 1952 supervise netbsd 43 poll 913c5340
2236 2236 multilog netbsd 43 pipe_rd 918438ac
2303 2303 svscan netbsd 43 nanoslp 918bb5c0
1884 1884 syslogd netbsd 43 kqueue 91835f38
2033 2033 mdnsd netbsd 43 select 915c6cc0
932 932 dhcpcd netbsd 43 poll 915d0280
990 990 dhcpcd netbsd 36 poll 915c6f40
879 879 dhcpcd netbsd 43 0
991 991 dhcpcd netbsd 43 poll 915c6cc0
570 570 devpubd netbsd 33 devmon 80aad06c
1 1 init netbsd 41 wait 9170e04c
0 366 system netbsd 123 physiod 9180fe04
0 218 system netbsd 43 bwfm0 9180f444
0 217 system netbsd 96 lnxcmplt 917c2a18
0 216 system netbsd 125 pooldrain 80abe900
0 215 system netbsd 124 syncer 91808200
0 214 system netbsd 126 pgdaemon 80abdcf0
0 213 system netbsd 123 data 917bdad4
0 212 system netbsd 96 semacv 80a86408
0 211 system netbsd 96 semacv 80a863f4
0 210 system netbsd 96 semacv 80a863e0
0 207 system netbsd 43 swwreboot 917dea44
0 205 system netbsd 96 sccomp 917c23dc
0 203 system netbsd 96 npfgcw 917bf144
0 202 system netbsd 222 rt_free 91709444
0 201 system netbsd 96 unpgc 80b274c8
0 200 system netbsd 222 key_timehandler 91709384
0 199 system netbsd 222 icmp6_wqinput 91707484
0 198 system netbsd 222 icmp6_wqinput 91707444
0 197 system netbsd 222 icmp6_wqinput 91707404
0 196 system netbsd 222 icmp6_wqinput 917073c4
0 195 system netbsd 96 usbevt 913cbb98
0 194 system netbsd 222 nd6_timer 916e5744
0 193 system netbsd 222 carp6_wqinput 913cad44
0 192 system netbsd 222 carp6_wqinput 913cad04
0 170 system netbsd 222 carp6_wqinput 913cacc4
0 177 system netbsd 222 carp6_wqinput 913cac84
0 174 system netbsd 222 carp_wqinput 913cabc4
0 179 system netbsd 222 carp_wqinput 913cab84
0 176 system netbsd 222 carp_wqinput 913cab44
0 31 system netbsd 222 carp_wqinput 913cab04
0 63 system netbsd 222 icmp_wqinput 913caa44
0 126 system netbsd 222 icmp_wqinput 913caa04
0 125 system netbsd 222 icmp_wqinput 913ca9c4
0 124 system netbsd 222 icmp_wqinput 913ca984
0 123 system netbsd 222 rt_timer 916e5684
0 122 system netbsd 125 vmem_rehash 915ebc84
0 121 system netbsd 43 vcmbox0 916e5444
0 120 system netbsd 96 usbtsk 80aacdac
0 119 system netbsd 96 usbtsk 80aacd8c
0 118 system netbsd 43 dwc2 916d58c4
0 117 system netbsd 221 mmctaskq 916e438c
0 116 system netbsd 221 mmctaskq 916e408c
0 107 system netbsd 43 xclocv 809e0c04
0 105 system netbsd 127 xcall 807c2448
0 104 system netbsd 223 0
0 103 system netbsd 220 0
0 102 system netbsd 221 0
0 101 system netbsd 222 0
0 > 100 system netbsd 0 0
0 99 system netbsd 127 0
0 98 system netbsd 223 0
0 97 system netbsd 220 0
0 96 system netbsd 221 0
0 30 system netbsd 222 0
0 29 system netbsd 0 0
0 28 system netbsd 127 xcall 807c15c8
0 27 system netbsd 223 0
0 26 system netbsd 220 0
0 25 system netbsd 221 0
0 24 system netbsd 222 0
0 23 system netbsd 0 0
0 22 system netbsd 43 lnxsyswq 913c5844
0 21 system netbsd 43 lnxubdwq 913c57c4
0 20 system netbsd 43 lnxpwrwq 913c5744
0 19 system netbsd 43 lnxlngwq 913c56c4
0 18 system netbsd 43 lnxhipwq 913c5644
0 17 system netbsd 43 lnxrcugc 809d7fc4
0 16 system netbsd 96 smtaskq 80aaf33c
0 15 system netbsd 43 pmfsuspend 913a2a44
0 14 system netbsd 43 pmfevent 913a2984
0 13 system netbsd 96 sopendfr 80b27484
0 12 system netbsd 222 ifwdog 913a28c4
0 11 system netbsd 222 iflnkst 913a2804
0 10 system netbsd 43 nfssilly 913a2744
0 9 system netbsd 125 vdrain 80b27f9c
0 8 system netbsd 125 mod_unld 80b1fd7c
0 7 system netbsd 127 xcall 807c0e88
0 6 system netbsd 223 0
0 5 system netbsd 220 0
0 4 system netbsd 221 0
0 3 system netbsd 222 0
0 2 system netbsd 0 0
0 0 system netbsd 125 uvm 807fa9c0
Is it normal that ps/w prints output continuously until you press Ctrl+C?
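For anyone who wants to pick the tstile entries out of a long listing like the one above, here is a minimal Python sketch. It assumes the column layout shown in the header line (PID, LID, COMMAND, EMUL, PRI, WAIT-MSG, WAIT-CHANNEL), with an optional `>` between PID and LID and a lone `0` where no wait message is printed. The names `parse` and `tstile_lwps` are made up for illustration, and the sample rows are copied from the output above.

```python
import re

# A few rows copied verbatim from the ps/w output above.
SAMPLE = """\
21931>21931 crash netbsd 37 0
3770 3770 find netbsd 26 tstile 92b31f80
2374 2374 go netbsd 40 tstile 95430f80
8089 > 8089 go netbsd 25 0
2534 17497 pipeline.test netbsd 43 lwpwait 92a53a54
2534 28139 pipeline.test netbsd 43 0
24215 18770 go netbsd 43 parked 95594c80
"""

# PID, optional ">" marker, LID, COMMAND, EMUL, PRI, then either
# WAIT-MSG followed by WAIT-CHANNEL, or a single trailing field.
ROW = re.compile(
    r"^\s*(\d+)\s*(>?)\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+"
    r"(?:(\S+)\s+([0-9a-f]+)|\S+)\s*$"
)


def parse(text):
    """Turn ps/w rows into dicts; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        pid, marker, lid, command, _emul, _pri, wmesg, wchan = m.groups()
        rows.append({
            "pid": int(pid),
            "lid": int(lid),
            "command": command,
            "marked": marker == ">",  # row carried the ">" marker
            "wmesg": wmesg,           # None when no wait message is shown
            "wchan": wchan,
        })
    return rows


def tstile_lwps(rows):
    """Return (pid, lid, command) for every row waiting on tstile."""
    return [(r["pid"], r["lid"], r["command"])
            for r in rows if r["wmesg"] == "tstile"]


for pid, lid, command in tstile_lwps(parse(SAMPLE)):
    print(f"pid {pid} lid {lid} ({command}) is waiting on tstile")
```

On the sample rows this reports the find process (3770) and one go process (2374), which are the two tstile entries in the listing.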
> Can you start crash(8) and stack traces from the processes not in RUN
> state, like the tstile one with `bt 0t18154'?
I tried looking at the "find" process in tstile (3770) and the
"pipeline.test" process (2534) but got the following. Is this a bug in
crash(8)?
According to htop, process 2534 was hogging 100% of one core. It looks
like it was actually spinning on the CPU?
crash> bt/t 2534
trace: pid 9524 not found
crash> bt/t 3770
trace: pid 14192 not found
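A note on the two error messages above: the pids they report are exactly what the typed numbers become when read as hexadecimal. That would fit the `0t` prefix in the quoted `bt 0t18154` example being a decimal marker, but this is an inference from the numbers alone and is not confirmed anywhere in this thread. A minimal Python sketch of the arithmetic (the helper name `as_hex` is made up for illustration):

```python
def as_hex(typed: str) -> int:
    """Interpret the typed digits as hexadecimal.

    Assumption: the debugger reads bare numbers with an input radix of 16.
    """
    return int(typed, 16)


for typed in ("2534", "3770"):
    print(f"typed {typed} -> read as pid {as_hex(typed)}")
# typed 2534 -> read as pid 9524
# typed 3770 -> read as pid 14192
```

If that is what is happening, `bt/t 0t2534` and `bt/t 0t3770` may be worth trying next time.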
> Can you run dtrace to sample what's happening?
>
> dtrace -n 'profile:::profile-97 { @[stack()] = count() }'
Next time :) I had to enable dtrace in modules.conf.
Sorry, this is probably not the most helpful answer. When this
inevitably happens again, I will try the other debugging techniques.
FWIW, typing "sync" made the whole machine hang, so it could also be
storage-related. The go processes are running off USB storage, and there
is also a swap partition on the USB storage. dmesg did not contain
anything relevant though.
Would it be useful to run with a LOCKDEBUG kernel?
--
Benny