NetBSD-Bugs archive


Re: kern/58871: Stuck processes



The following reply was made to PR kern/58871; it has been noted by GNATS.

From: Benny Siegert <bsiegert%gmail.com@localhost>
To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost,
 gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/58871: Stuck processes
Date: Tue, 31 Dec 2024 21:50:35 +0100

 On 28.12.24 at 16:13, Taylor R Campbell wrote:
 >> Did that for the one process that was in tstile, the "find" one.
 > 
 > Thanks, that's pretty weird: find is waiting for a lock but nobody is
 > holding it!
 > 
 >    PID   LID          COMMAND      WAITING-FOR     WAIT-CHANNEL
 > 11160 11160             find                0         95e9ea40
 
 I have another hanging find process. Again, this is the only process in 
 tstile, same symptoms.
 
 Here's a backtrace of where it is stuck:
 
 db{0}> trace/t 0t15932
 trace: pid 15932 lid 15932 at 0xbced4b3c
 0xbced4b3c: netbsd:mi_switch+0xc
 0xbced4b64: netbsd:sleepq_block+0xac
 0xbced4bd4: netbsd:turnstile_block+0x32c
 0xbced4c44: netbsd:rw_enter+0x148
 0xbced4c5c: netbsd:genfs_lock+0x7c
 0xbced4c9c: netbsd:VOP_LOCK+0x84
 0xbced4cb4: netbsd:vn_lock+0x18
 0xbced4d9c: netbsd:namei_tryemulroot.constprop.0+0x6f0
 0xbced4dcc: netbsd:namei+0x58
 0xbced4e34: netbsd:do_sys_statat+0xd0
 0xbced4eec: netbsd:sys___lstat50+0x2c
 0xbced4fac: netbsd:syscall+0x188
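
As a reading aid for traces like the one above, here is a minimal sketch that splits each frame line into its leading address, symbol, and offset. The line pattern is an assumption inferred from this output alone, not a documented ddb format, and `parse_frames` is a hypothetical helper name. Frames are returned in the order printed, innermost (`mi_switch`) first.

```python
import re

# Matches frame lines such as "0xbced4b3c: netbsd:mi_switch+0xc".
# Assumption: this pattern is inferred from the trace in this message,
# not taken from any ddb specification.
FRAME_RE = re.compile(
    r"^\s*(0x[0-9a-f]+):\s+(\w+):([\w.]+)\+(0x[0-9a-f]+)\s*$"
)


def parse_frames(text):
    """Return (leading_address, symbol, offset) tuples in printed order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:  # header lines like "trace: pid ..." do not match and are skipped
            frames.append((int(m.group(1), 16), m.group(3), int(m.group(4), 16)))
    return frames


# First few lines of the trace above, including the header line.
TRACE = """\
trace: pid 15932 lid 15932 at 0xbced4b3c
0xbced4b3c: netbsd:mi_switch+0xc
0xbced4b64: netbsd:sleepq_block+0xac
0xbced4bd4: netbsd:turnstile_block+0x32c
0xbced4c44: netbsd:rw_enter+0x148
0xbced4c5c: netbsd:genfs_lock+0x7c
"""

if __name__ == "__main__":
    for addr, sym, off in parse_frames(TRACE):
        print(f"{addr:#x} {sym}+{off:#x}")
```

Read top to bottom, the parsed frames show the thread sleeping in `turnstile_block` on behalf of `rw_enter`, which `genfs_lock` called to take the vnode lock.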
 
 
 > Just to be absolutely sure, can you send output of the following on
 > your kernel?
 > 
 > $ ident /netbsd | grep -F cpuswitch.S
 
 This is the NetBSD 10.1 release, so:
 
       $NetBSD: cpuswitch.S,v 1.105.12.1 2023/07/31 13:36:32 martin Exp $
 
 > And can you share the disassembly of the function cpu_switchto in your
 > kernel, with objdump or gdb?
 
 
 (gdb) disassemble cpu_switchto
 Dump of assembler code for function cpu_switchto:
     0x8009ea7c <+0>:     mov     r12, sp
     0x8009ea80 <+4>:     push    {r4, r5, r6, r7, r12, lr}
     0x8009ea84 <+8>:     mov     r6, r1
     0x8009ea88 <+12>:    mov     r4, r0
     0x8009ea8c <+16>:    ldr     r5, [r6]
     0x8009ea90 <+20>:    ldr     r7, [r4, #32]
     0x8009ea94 <+24>:    strd    r8, [r7]
     0x8009ea98 <+28>:    strd    r10, [r7, #8]
     0x8009ea9c <+32>:    strd    r12, [r7, #16]
     0x8009eaa0 <+36>:    mrc     15, 0, r0, cr13, cr0, {2}
     0x8009eaa4 <+40>:    str     r0, [r7, #32]
     0x8009eaa8 <+44>:    cpsid   i
     0x8009eaac <+48>:    mcr     15, 0, r6, cr13, cr0, {4}
     0x8009eab0 <+52>:    dmb     sy
     0x8009eab4 <+56>:    str     r6, [r5, #896]  ; 0x380
     0x8009eab8 <+60>:    dmb     sy
     0x8009eabc <+64>:    ldr     r7, [r6, #32]
     0x8009eac0 <+68>:    ldr     sp, [r7, #20]
     0x8009eac4 <+72>:    cpsie   i
     0x8009eac8 <+76>:    ldr     r0, [r6, #80]   ; 0x50
     0x8009eacc <+80>:    tst     r0, #512        ; 0x200
     0x8009ead0 <+84>:    bne     0x8009eb18 <cpu_switchto+156>
     0x8009ead4 <+88>:    ldr     r0, [r7, #32]
     0x8009ead8 <+92>:    mcr     15, 0, r0, cr13, cr0, {2}
     0x8009eadc <+96>:    ldr     r0, [r6, #504]  ; 0x1f8
     0x8009eae0 <+100>:   mcr     15, 0, r0, cr13, cr0, {3}
     0x8009eae4 <+104>:   ldr     r0, [r5, #1068] ; 0x42c
     0x8009eae8 <+108>:   cmp     r0, #0
     0x8009eaec <+112>:   ldrne   r0, [r7, #48]   ; 0x30
     0x8009eaf0 <+116>:   vmsrne  fpexc, r0
     0x8009eaf4 <+120>:   ldr     r0, [r6, #308]  ; 0x134
     0x8009eaf8 <+124>:   ldr     r2, [r0, #132]  ; 0x84
     0x8009eafc <+128>:   cmp     r2, #0
     0x8009eb00 <+132>:   beq     0x8009eb18 <cpu_switchto+156>
     0x8009eb04 <+136>:   ldr     r8, [r6, #36]   ; 0x24
     0x8009eb08 <+140>:   ldr     r1, [r8, #76]   ; 0x4c
     0x8009eb0c <+144>:   bl      0x8043c368 <ras_lookup>
     0x8009eb10 <+148>:   cmn     r0, #1
     0x8009eb14 <+152>:   strne   r0, [r8, #76]   ; 0x4c
     0x8009eb18 <+156>:   ldrd    r8, [r7]
     0x8009eb1c <+160>:   ldrd    r10, [r7, #8]
     0x8009eb20 <+164>:   ldr     r12, [r7, #16]
     0x8009eb24 <+168>:   mov     r0, r4
     0x8009eb28 <+172>:   mov     r1, r6
     0x8009eb2c <+176>:   clrex
     0x8009eb30 <+180>:   pop     {r4, r5, r6, r7, r12, pc}
 End of assembler dump.
 
 > Also, if you have netbsd.gdb for the kernel, can you force a crash
 > dump (enter ddb and run `sync'), so we can see the arguments and
 > locals in the stack trace?  Curious to see what file find is stuck on.
 
 Is netbsd.gdb part of the release? I haven't found it on cdn.netbsd.org. 
 The "debug" set contains a file /usr/libdata/debug/netbsd-GENERIC.debug, 
 but that's all I have.
 
 -- 
 Benny
 

