NetBSD-Bugs archive
kern/58871: Stuck processes
>Number: 58871
>Category: kern
>Synopsis: Stuck processes
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Dec 03 19:55:00 +0000 2024
>Originator: Benny Siegert
>Release: NetBSD 10_STABLE from 2024-11-17
>Organization:
The NetBSD Foundation
>Environment:
NetBSD rpi3.bentsukun.ch 10.0_STABLE NetBSD 10.0_STABLE (GENERIC) #0: Sun Nov 17 16:18:04 UTC 2024 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/GENERIC evbarm
>Description:
I have two machines running evbarm-earmv7hf as Go CI builders. Both regularly get into a wedged state. I think this mostly happens during the golang.org/x/tools tests.
When this happens, I can still log into the machine with ssh and execute commands. However, "sudo reboot" does nothing. If it worked, the CI infrastructure would recover on its own, since it tries to reboot the machine when a test has problems.
Right now, the top output looks like this:
  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
16334 swarming  95    0   601M  189M RUN/2      1:57 56.40% 56.40% completion.test
 1598 swarming  95    0   600M  178M RUN/2      0:30 44.74% 43.46% misc.test
16304 swarming  63    0   720M   86M RUN/2      2:31 35.25% 35.25% diagnostics.test
18154 swarming 127    0   593M   44M tstile/1   0:12 11.81% 11.47% modfile.test
These percentages have not changed in several minutes. None of these tasks can be killed, even with "kill -9".
This happens on two machines, a Raspberry Pi 3 and an OrangePi Plus 2E.
>How-To-Repeat:
git clone https://go.googlesource.com/tools go-tools
cd go-tools
go test ./... # may have to run multiple times
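Since the test run may have to be repeated before the machine wedges, a small wrapper could automate the retries. This is only a sketch: the function name repeat_until_fail and the attempt limit of 10 are illustrative assumptions, not part of the original report. Note that a wedged run would simply never return, so the loop only stops by itself on a failing run or when the limit is reached.

```sh
#!/bin/sh
# Hypothetical helper (not from the original report): run a command
# repeatedly until it fails or the attempt limit is reached.
# Usage: repeat_until_fail MAX_ATTEMPTS COMMAND [ARGS...]
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Progress goes to stderr so it does not mix with test output.
        echo "attempt $i of $max" >&2
        "$@" || return 1
        i=$((i + 1))
    done
    return 0
}

# Example, run inside the go-tools checkout (limit of 10 is arbitrary):
#   repeat_until_fail 10 go test ./...
```

Watching the attempt counter on the console shows on which iteration the hang occurred.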
>Fix:
Any ideas on how to debug this further? If I attach gdb to one of these processes, it hangs.