Subject: kern/23554: STABLE system locks
To: None <gnats-bugs@gnats.netbsd.org>
From: None <kefren@netbastards.org>
List: netbsd-bugs
Date: 11/24/2003 10:15:10
>Number: 23554
>Category: kern
>Synopsis: lock in -STABLE (1.6.2_RC1)
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Nov 24 08:16:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator: Mihai Chelaru
>Release: NetBSD 1.6.2_RC1
>Organization:
None.
>Environment:
System: NetBSD xxx.xxx.xxx 1.6.2_RC1 NetBSD 1.6.2_RC1 (Kefren) #6: Sat Nov 22 15:24:03 EET 2003 root@xxx.xxx.xxx:/usr/src/sys/arch/i386/compile/Kefren i386
Architecture: i386
Machine: i386
>Description:
About once per day the system locks up. The only thing I can do from the console is enter ddb. This machine acts as a web proxy and also does NAT and IPsec (relatively high traffic).
Unusual settings: I bumped kern.mbuf.nmbclusters to 4096. (With 1024 I did not have problems of this kind, only the usual "mbufs exceeded" message.)
Here is the output from ddb:
Stopped at cpu_Debugger+0x4: leave
db> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
4387 4376 4360 0 3 0x4084 chatel netio
4376 4373 4360 0 3 0x4084 bash wait
4375 3363 264 0 3 0x4184 sendmail select
4373 4372 4360 0 3 0x4084 bash wait
4372 1 4360 0 3 0x84 runospf nanosle
4354 264 264 0 3 0x84 cron piperd
3391 3378 3378 1004 3 0x4084 mail piperd
3389 3378 3378 1004 3 0x4084 cvsup select
3378 3370 3378 1004 3 0x4084 sh wait
3370 264 264 0 3 0x84 cron piperd
3363 264 264 0 3 0x84 cron wait
270 187 187 1003 3 0x184 httpd netio
269 1 269 0 3 0x4086 getty ttyin
268 1 268 0 3 0x4086 getty ttyin
267 1 267 0 3 0x4086 getty ttyin
266 1 266 0 2 0x4086 getty
264 1 264 0 3 0x84 cron nanosle
259 1 259 0 3 0x84 inetd select
246 1 246 0 3 0x84 sshd2 select
227 1 227 0 3 0x84 ntpd pause
215 1 10 1002 3 0x86 postNetServer netcon
214 199 10 1006 3 0x86 postgres netio
208 187 187 1003 3 0x184 httpd lockf
207 187 187 1003 3 0x184 httpd lockf
206 187 187 1003 3 0x184 httpd lockf
205 187 187 1003 3 0x184 httpd lockf
204 187 187 1003 3 0x184 httpd select
202 201 10 1006 3 0x86 postgres select
201 199 10 1006 3 0x86 postgres select
199 1 10 1006 3 0x4086 postgres select
187 1 187 0 3 0x84 httpd select
185 174 185 32767 3 0x4084 unlinkd piperd
184 174 184 32767 3 0x4084 squidGuard netio
183 174 183 32767 3 0x4084 squidGuard netio
182 174 182 32767 3 0x4084 squidGuard netio
181 174 181 32767 3 0x4084 squidGuard netio
180 174 180 32767 3 0x4084 dnsserver select
179 174 179 32767 3 0x4084 dnsserver select
178 174 178 32767 3 0x4084 dnsserver select
177 174 177 32767 3 0x4084 dnsserver select
176 174 176 32767 3 0x4084 dnsserver select
174 171 171 0 3 0x4184 squid select
171 1 171 0 3 0x84 squid wait
169 1 169 0 3 0x84 named select
167 1 167 0 3 0x84 racoon select
122 0 0 0 3 0x20204 acctwatch actwat
91 1 91 0 3 0x84 syslogd select
9 0 0 0 3 0x20204 aiodoned aiodone
8 0 0 0 3 0x20204 ioflush syncer
7 0 0 0 3 0x20204 reaper reaper
6 0 0 0 3 0x20204 pagedaemon pgdaemo
5 0 0 0 3 0x20204 pms0 pmsrese
4 0 0 0 3 0x20204 atapibus0 sccomp
3 0 0 0 3 0x20204 scsibus1 sccomp
2 0 0 0 3 0x20204 scsibus0 sccomp
1 0 1 0 3 0x4084 init wait
0 -1 0 0 3 0x20204 swapper schedul
4360 4354 4360 0 5 0x6000 sh
db> cont
Stopped at cpu_Debugger+0x4: leave
db> reboot
syncing disks... fatal page fault in supervisor mode
trap type 6 code 0 eip c01c496d cs 8 eflags 10202 cr2 fc cpl 0
panic: trap
Begin traceback...
trap() at trap+0x202
--- trap (number 6) ---
genfs_putpages(e4bd5440,c155e8c4,c01afc16,c155e8c4,e58f5c3c) at genfs_putpages+0x239
ffs_putpages(e4bd5440,c14e4500,c1309300,e4bd5450) at ffs_putpages+0x11d
ffs_full_fsync(e4bd5538,0,e4bd548c,c01c33f8,e58f5c3c) at ffs_full_fsync+0xc6
ffs_fsync(e4bd5538,10012,10,1) at ffs_fsync+0x3c
ffs_sync(c15b5e00,2,c1309f00,c031adc0) at ffs_sync+0x10a
sys_sync(c031adc0,0,0,c01bc160,0) at sys_sync+0x5a
vfs_shutdown(0,10,e4bd560c,c0181c49,74) at vfs_shutdown+0x6a
cpu_reboot(0,0,e4bd561c,c0180a75,c02a3d00) at cpu_reboot+0x3b
db_reboot_cmd(1,0,e4bd5670,e4bd5654,0) at db_reboot_cmd+0x51
db_command(c02e5a34,c02a3d00,e4bd571c,c0180369,c02a3f8b,e4bd5718,e4bd571c,c0180339) at db_command+0x214
db_command_loop(c022bc1c,e4bd5748,e4bd575c,c0235cb6) at db_command_loop+0x8b
db_trap(1,0,e4bd578c,c022bb46,1,0,c1b6a600,c1505098) at db_trap+0x11c
kdb_trap(1,0,e4bd57e4,c1b6a600) at kdb_trap+0x116
trap() at trap+0x177
--- trap (number 1) ---
cpu_Debugger(c14cdd40,4b0,c1652f00,c0258278,c14cdc80) at cpu_Debugger+0x4
comintr(c1304800) at comintr+0xf4
Xintr4() at Xintr4+0x7e
--- interrupt ---
ip_natout(c16bbb28,e4bd596c,e4bd596c,c16bbb00,c16bbb28) at ip_natout+0x562
fr_check(c16bbb28,14,c14d802c,1,e4bd5a38) at fr_check+0x5f7
gcc2_compiled.(0,e4bd5a38,c14d802c,2,3c) at gcc2_compiled.+0x72
pfil_run_hooks(c031f8a0,e4bd5abc,c14d802c,2,c155e948) at pfil_run_hooks+0x4c
ip_output(c16bbb00,0,c031f8c4,1,0,e4bd5c88,e4bd5cac,c01f0147,c16bbb00,0,3c,1) at ip_output+0x708
ip_forward(c16bbb00,0,33,1,c16bbb00) at ip_forward+0x200
ip_input(c16bbb00,c02636e5,c14cdf60,e4bbf56c) at ip_input+0x3d1
ipintr(10,c17b0010,c1530010,10,e4bbf56c) at ipintr+0x6b
Xsoftnet() at Xsoftnet+0x2c
--- interrupt ---
idle(e4bbf56c,3e9,c0197948,e4bbf56c) at idle+0x1b
bpendtsleep(c031d5ac,118,c02a6b80,3e9,0) at bpendtsleep
sys_poll(e4bbf56c,e4bd5f80,e4bd5f78,c0235283) at sys_poll+0x229
syscall_plain(1f,bfbf001f,4807001f,bfbf001f,804a808) at syscall_plain+0xa7
End traceback...
dumping to dev 4,1 offset 2100389
dump 1023 1022 1021 1020 1019 1018 1017 1016 1015 1014 1013 1012 1011 1010 1009
1008 1007 1006 1005 1004 1003 1002 1001 1000 999 998 997 996 995 994 993 992 991
Here is dmesg:
NetBSD 1.6.2_RC1 (Kefren) #6: Sat Nov 22 15:24:03 EET 2003
root@xxx.xxx.xxx:/usr/src/sys/arch/i386/compile/Kefren
cpu0: Intel Pentium 4 (686-class), 1993.81 MHz
cpu0: D-cache 8 KB 64b/line 4-way
cpu0: L2 cache 512 KB 64b/line 8-way
cpu0: features bfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features bfebfbff<PGE,MCA,CMOV,FGPAT,PSE36,CFLUSH,DS,ACPI,MMX>
cpu0: features bfebfbff<FXSR,SSE,SSE2,SS,HTT,TM,B31>
total memory = 1023 MB
avail memory = 947 MB
using 6144 buffers containing 52508 KB of memory
BIOS32 rev. 0 found at 0xf0000
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: ServerWorks CMIC_LE Host (rev. 0x13)
pchb1 at pci0 dev 0 function 1
pchb1: ServerWorks CMIC_LE Host (rev. 0x00)
pci1 at pchb1 bus 128
pci1: no spaces enabled!
pchb2 at pci0 dev 0 function 2
pchb2: ServerWorks product 0x0000 (rev. 0x00)
pci2 at pchb2 bus 2
pci2: no spaces enabled!
ahc0 at pci0 dev 2 function 0
ahc0: interrupting at irq 3
ahc0: aic7899 Wide Channel A, SCSI Id=7, 16/255 SCBs
scsibus0 at ahc0: 16 targets, 8 luns per target
ahc1 at pci0 dev 2 function 1
ahc1: interrupting at irq 3
ahc1: aic7899 Wide Channel B, SCSI Id=7, 16/255 SCBs
scsibus1 at ahc1: 16 targets, 8 luns per target
vga1 at pci0 dev 3 function 0: ATI Technologies Rage XL (rev. 0x27)
pci_mem_find: void region
pci_mem_find: void region
pci_mem_find: void region
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
bge0 at pci0 dev 4 function 0: Broadcom BCM5702X Gigabit Ethernet
bge0: interrupting at irq 5
bge0: ASIC BCM5703 A2, Ethernet address 00:0b:cd:1b:8a:3f
brgphy0 at bge0 phy 1: BCM5703 1000BASE-T media interface, rev. 2
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
Compaq product 0xa0f0 (miscellaneous system) at pci0 dev 5 function 0 not configured
pcib0 at pci0 dev 15 function 0
pcib0: ServerWorks CSB5 SouthBridge (rev. 0x93)
pciide0 at pci0 dev 15 function 1: ServerWorks CSB5 IDE Controller (rev. 0x93)
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
atapibus0 at pciide0 channel 0: 2 targets
cd0 at atapibus0 drive 0: <LTN486S, , YQSK> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4
pciide0: primary channel interrupting at irq 14
cd0(pciide0:0:0): using PIO mode 4
pciide0: secondary channel wired to compatibility mode
pciide0: secondary channel interrupting at irq 15
ServerWorks OSB4/CSB5 USB (USB serial bus, interface 0x10, revision 0x05) at pci0 dev 15 function 2 not configured
pchb3 at pci0 dev 15 function 3
pchb3: ServerWorks product 0x0225 (rev. 0x00)
pchb4 at pci0 dev 17 function 0
pchb4: ServerWorks product 0x0101 (rev. 0x03)
pci3 at pchb4 bus 2
pci3: memory space enabled
pchb5 at pci0 dev 17 function 2
pchb5: ServerWorks product 0x0101 (rev. 0x03)
pci4 at pchb5 bus 5
pci4: memory space enabled
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 mux 1
wskbd0: connecting to wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask efcd netmask efed ttymask ffef
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <COMPAQ, BD03685A24, HPB3> SCSI3 0/direct fixed
sd0: 34732 MB, 49855 cyl, 2 head, 713 sec, 512 bytes/sect x 71132000 sectors
sd0: sync (25.0ns offset 63), 16-bit (80.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <COMPAQ, BD03685A24, HPB3> SCSI3 0/direct fixed
sd1: 34732 MB, 49855 cyl, 2 head, 713 sec, 512 bytes/sect x 71132000 sectors
sd1: sync (25.0ns offset 63), 16-bit (80.000MB/s) transfers, tagged queueing
uk0 at scsibus0 target 15 lun 0: <COMPAQ, PROLIANT 4L6I, 1.78> SCSI2 3/processor fixed
uk0: async, 8-bit transfers
scsibus1: waiting 2 seconds for devices to settle...
IPsec: Initialized Security Association Processing.
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
stray interrupt 7
stray interrupt 7
stray interrupt 7
stray interrupt 7
stray interrupt 7; stopped logging
IP Filter: v3.4.29 initialized. Default = pass all, Logging = enabled
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
Accounting started
I can provide the kernel config or any other information on request.
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: