Hello,

As there is a regression in the recent -10.x kernels with the iscsi initiator, I have built a 10.99.12 kernel. The iscsi initiator now runs as expected, and I haven't seen a hard lock for a while. Nevertheless, the system randomly panics with:

[ 347098.708000] ccb_timeout: num=1 total=0 disp=0 invalid ccb=0xffff860021cec3e0
[ 417509.006761] panic: cpu0: softints stuck for 16 seconds
[ 417509.006761] cpu0: Begin traceback...
[ 417509.006761] vpanic() at netbsd:vpanic+0x171
[ 417509.006761] panic() at netbsd:panic+0x3c
[ 417509.006761] heartbeat() at netbsd:heartbeat+0x34c
[ 417509.006761] hardclock() at netbsd:hardclock+0x8b
[ 417509.006761] Xresume_lapic_ltimer() at netbsd:Xresume_lapic_ltimer+0x1e
[ 417509.006761] --- interrupt ---
[ 417509.006761] mutex_vector_enter() at netbsd:mutex_vector_enter+0x354
[ 417509.006761] pool_get() at netbsd:pool_get+0x69
[ 417509.006761] pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x136
[ 417509.006761] pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x254
[ 417509.006761] m_get() at netbsd:m_get+0x36
[ 417509.006761] m_gethdr() at netbsd:m_gethdr+0x9
[ 417509.006761] tcp_output() at netbsd:tcp_output+0x1284
[ 417509.006761] tcp_input() at netbsd:tcp_input+0xeec
[ 417509.006761] ipintr() at netbsd:ipintr+0x88c
[ 417509.006761] softint_dispatch() at netbsd:softint_dispatch+0x112
[ 417509.006761] DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffff8604a84ad0f0
[ 417509.006761] Xsoftintr() at netbsd:Xsoftintr+0x4c
[ 417509.006761] --- interrupt ---
[ 417509.006761] 4edca1a77f5082bf:
[ 417509.006761] cpu0: End traceback...
[ 417509.006761] dumping to dev 18,1 (offset=253015, size=4162677):
[ 417509.006761] dump device bad

I have checked: it is always the same panic message. I would provide more information, but as you can see, "dump device 'is' bad".

For a while now, my system has not been able to write a kernel dump. With -9, it worked. With -10, it never worked (and failed without any message). -10.99 complains about a bad device. My swap is on raid0b. Maybe the kernel cannot write a dump to a raid slice.

Best regards,

JB