NetBSD-Bugs archive


kern/59175: posix_spawn hang, hanging other process too



>Number:         59175
>Category:       kern
>Synopsis:       posix_spawn hang, hanging other process too
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 13 21:30:00 +0000 2025
>Originator:     Thomas Klausner
>Release:        NetBSD 10.99.12
>Organization:

>Environment:
	
	
Architecture: x86_64
Machine: amd64
>Description:
sysutils/htop didn't exit, was eating 100% CPU, and was not killable, not
even with kill -9, not even as root.

It seems to be related to a second process, a ninja process that's dying
and hanging in posix_spawn.

wiz           11928 99.3  0.0    15720   2484 pts/7- ON   12:38nachm. 559:26.17 htop
2000          12110  0.0  0.0    15720   2484 ?      R    12:38nachm. 559:26.17 (ninja)

htop's backtrace nearly always looks like this:

crash> trace/t 2E98
trace: pid 11928 lid 11928 at 0xffffc1a49a32f8d0
pmap_update() at pmap_update+0x1b
uvm_fault_upper_enter() at uvm_fault_upper_enter+0x1d2
uvm_fault_internal() at uvm_fault_internal+0xa95
trap() at trap+0x457
--- trap (number 6) ---
kcopy() at kcopy+0x15
sysctl_kern_proc_args() at sysctl_kern_proc_args+0x46f
sysctl_dispatch() at sysctl_dispatch+0xae
sys___sysctl() at sys___sysctl+0xc5
syscall() at syscall+0x112
--- syscall (number 202) ---
syscall+0x112:

The ninja process information:

crash> show proc 0t12110
ninja: pid 12110 proc ffffa7781a4bb4c0 vmspace/map ffffa780efa93900 flags 4000
  lwp 12110 ffffa77c23c78800 pcb ffffc1a48c5c0000
    stat 3 flags 0 cpu 13 pri 26 ref 0
    wmesg pspawn wchan ffffa77e5d16e270
(gdb) print *(struct proc *)0xffffa7781a4bb4c0
$2 = {p_list = {le_next = 0xffffa7735bb983c0, le_prev = 0xffffa7652b39fb40},
  p_lock = 0xffffa76f1e9837c0, p_waitcv = {cv_opaque = {0x0, 0xffffffff8141e952}},
  p_lwpcv = {cv_opaque = {0x0, 0xffffffff81413c3c}}, p_cred = 0xffffa780f8e9f040,
  p_fd = 0xffffa77384cb2b00, p_cwdi = 0xffffa7797ee61400, p_stats = 0xffffa775cad8f280,
  p_limit = 0xffffa7765cfecb80, p_vmspace = 0xffffa780efa93900, p_sigacts = 0xffffa770ad48d310,
  p_aio = 0x0, p_mqueue_cnt = 0,
  p_specdataref = {specdataref_container = 0x0, specdataref_lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}},
  p_exitsig = 20, p_flag = 16384, p_sflag = 0, p_stflag = 0, p_slflag = 0, p_stat = 2 '\002',
  p_lflag = 0 '\000', p_trace_enabled = 0 '\000', p_pad1 = "\000\000", p_pid = 12110,
  p_pglist = {le_next = 0xffffa77795303c40, le_prev = 0xffffa773f5c1d9c0}, p_pptr = 0xffffa7735bb98040,
  p_sibling = {le_next = 0xffffa7652b39fb40, le_prev = 0xffffa7735bb98118},
  p_children = {lh_first = 0xffffa7764380c400}, p_lwps = {lh_first = 0xffffa77c23c78800},
  p_raslist = 0x0, p_nlwps = 1, p_nzlwps = 0, p_nrlwps = 1, p_nlwpwait = 0, p_ndlwps = 0,
  p_nstopchild = 15, p_waited = 0, p_zomblwp = 0x0, p_vforklwp = 0x0, p_sched_info = 0x0,
  p_estcpu = 0, p_estcpu_inherited = 36864, p_forktime = 348042, p_pctcpu = 0, p_opptr = 0x0,
  p_timers = 0x0, p_rtime = {sec = 0, frac = 0}, p_uticks = 97, p_sticks = 53, p_iticks = 0,
  p_xutime = 1121300, p_xstime = 607370, p_traceflag = 0, p_tracep = 0x0,
  p_textvp = 0xffffa76612be4540, p_emul = 0xffffffff8188afc0 <emul_netbsd>, p_emuldata = 0x0,
  p_execsw = 0xffffffff81887560 <exec_elf64_execsw>, p_klist = {slh_first = 0x0},
  p_sigwaiters = {lh_first = 0x0},
  p_sigpend = {sp_info = {tqh_first = 0x0, tqh_last = 0xffffa7781a4bb680}, sp_set = {__bits = {0, 0, 0, 0}}},
  p_lwpctl = 0x0, p_ppid = 1, p_oppid = 27349, p_path = 0xffffa76fcf38cec0 "/usr/pkg/bin/ninja",
  p_sigctx = {ps_info = {_signo = 0, _code = 0, _errno = 0, _pad = 0, _reason = {_rt = {_pid = 0, _uid = 0, _value = {sival_int = 0, sival_ptr = 0x0}}, _child = {_pid = 0, _uid = 0, _status = 0, _utime = 0, _stime = 0}, _fault = {_addr = 0x0, _trap = 0, _trap2 = 0, _trap3 = 0}, _poll = {_band = 0, _fd = 0}, _syscall = {_sysnum = 0, _retval = {0, 0}, _error = 0, _args = {0, 0, 0, 0, 0, 0, 0, 0}}, _ptrace_state = {_pe_report_event = 0, _option = {_pe_other_pid = 0, _pe_lwp = 0}}}}, ps_lwp = 0, ps_faked = false, ps_sigcode = 0x0, ps_sigignore = {__bits = {2554888192, 0, 0, 0}}, ps_sigcatch = {__bits = {16387, 0, 0, 0}}, ps_sigpass = {__bits = {0, 0, 0, 0}}},
  p_nice = 20 '\024', p_comm = "ninja\000build\000\000\000\000\000", p_pgrp = 0xffffa773f5c1d9c0,
  p_psstrp = 140187723116512, p_pax = 3, p_xexit = 0, p_xsig = 0, p_acflag = 0,
  p_md = {md_flags = 0, md_syscall = 0xffffffff805b7504 <syscall>}, p_stackbase = 140187720364032,
  p_dtrace = 0xffffa76726712b80,
  p_auxlock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}},
  p_stmutex = {u = {mtxa_owner = 2049, s = {mtxs_dummy = 1 '\001', mtxs_ipl = {_ipl = 8 '\b'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}},
  p_reflock = {rw_owner = 18446646750350903300}}
crash> bt/t 0t12110
trace: pid 12110 lid 12110 at 0xffffc1a48c5c4e20
sleepq_block() at sleepq_block+0xee
cv_wait() at cv_wait+0xd4
do_posix_spawn() at do_posix_spawn+0x660
sys_posix_spawn() at sys_posix_spawn+0x1f2
syscall() at syscall+0x112
--- syscall (number 474) ---
syscall+0x112:

>How-To-Repeat:
Start htop, get unlucky.
>Fix:
Fix posix_spawn().

>Unformatted:
 	
 	

