Subject: port-sparc64/19196: filesystem deadlock (lfs over nfs)
To: None <gnats-bugs@gnats.netbsd.org>
From: Lubomir Sedlacik <salo@Xtrmntr.org>
List: netbsd-bugs
Date: 11/28/2002 23:33:11
>Number: 19196
>Category: port-sparc64
>Synopsis: filesystem deadlock (lfs over nfs)
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: port-sparc64-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Nov 28 14:34:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: Lubomir Sedlacik
>Release: NetBSD 1.6K 20021127
>Organization:
>Environment:
>Description:
the machine hung while updating a cvs tree over nfs. the underlying filesystem
is lfs. nfs is unresponsive and the console is hung, but i can still ping the
machine (no other services are running).
the system was cross-built on i386, in case that is relevant. the machine is
still in ddb and i can leave it there for a few hours for further investigation
if someone responds quickly. otherwise i'll build the system natively on
sparc64 and try again.
db> ps /n
PID PPID PGRP UID S FLAGS COMMAND WAIT
355 1 355 0 3 0x4006 getty getnewb
328 327 327 0 3 0x4 lfs_cleanerd getnewb
327 1 327 0 3 0x84 lfs_cleanerd wait
310 1 310 0 3 0x84 mountd select
147 140 140 0 3 0x4 nfsd vnlock
146 140 140 0 3 0x4 nfsd vnlock
145 140 140 0 3 0x4 nfsd vnlock
144 140 140 0 3 0x4 nfsd getnewb
140 1 140 0 3 0x84 nfsd select
108 1 108 0 3 0x84 rpcbind select
95 1 95 0 3 0x4 syslogd getnewb
6 0 0 0 3 0x20204 aiodoned aiodone
5 0 0 0 3 0x20204 ioflush lfs seg
4 0 0 0 3 0x20204 reaper reaper
3 0 0 0 3 0x20204 pagedaemon pgdaemo
2 0 0 0 3 0x20204 scsibus0 sccomp
1 0 1 0 3 0x4084 init wait
0 -1 0 0 3 0x20204 swapper schedul
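To make the pattern in the process list easier to see, here is a minimal Python sketch that tallies the WAIT column. The rows are copied from the "ps /n" output above; the helper names are made up for illustration. Reading "getnewb" as a truncated buffer-allocation wait and "vnlock" as a vnode-lock wait is an assumption based on the channel names, not something the report itself states.

```python
from collections import Counter

# (pid, command, wait channel), copied from the "ps /n" listing in this report.
PS_ROWS = [
    (355, "getty", "getnewb"),
    (328, "lfs_cleanerd", "getnewb"),
    (327, "lfs_cleanerd", "wait"),
    (310, "mountd", "select"),
    (147, "nfsd", "vnlock"),
    (146, "nfsd", "vnlock"),
    (145, "nfsd", "vnlock"),
    (144, "nfsd", "getnewb"),
    (140, "nfsd", "select"),
    (108, "rpcbind", "select"),
    (95, "syslogd", "getnewb"),
    (6, "aiodoned", "aiodone"),
    (5, "ioflush", "lfs seg"),
    (4, "reaper", "reaper"),
    (3, "pagedaemon", "pgdaemo"),
    (2, "scsibus0", "sccomp"),
    (1, "init", "wait"),
    (0, "swapper", "schedul"),
]


def tally_wait_channels(rows):
    """Count how many processes sleep on each wait channel."""
    return Counter(wchan for _pid, _cmd, wchan in rows)


def sleepers_on(rows, wchan):
    """Return (pid, command) pairs sleeping on the given wait channel."""
    return [(pid, cmd) for pid, cmd, w in rows if w == wchan]


if __name__ == "__main__":
    for wchan, count in tally_wait_channels(PS_ROWS).most_common():
        print(f"{wchan:10s} {count}")
    print("getnewb:", sleepers_on(PS_ROWS, "getnewb"))
    print("vnlock: ", sleepers_on(PS_ROWS, "vnlock"))
```

Four processes (getty, one lfs_cleanerd, one nfsd, syslogd) sit on "getnewb" and three nfsd processes on "vnlock", while ioflush is the only one on "lfs seg".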
db> show uvmexp
Current UVM status:
pagesize=8192 (0x2000), pagemask=0x1fff, pageshift=13
10872 VM pages: 1036 active, 73 inactive, 44 wired, 7809 free
min 10% (25) anon, 10% (25) file, 5% (12) exec
max 80% (204) anon, 50% (128) file, 30% (76) exec
pages 1649 anon, 5 file, 165 exec
freemin=32, free-target=42, inactive-target=3115, wired-max=3624
faults=3206494, traps=1542060, intrs=28890934, ctxswitch=2784039
softint=0, syscalls=5224181, swapins=101, swapouts=101
fault counts:
noram=0, noanon=0, pgwait=0, pgrele=0
ok relocks(total)=276530(276530), anget(retrys)=12144(0), amapcopy=3581
neighbor anon/obj pg=5684/135923, gets(lock/unlock)=304406/276530
cases: anon=8782, anoncow=3362, obj=302007, prcopy=2399, przero=1211476
daemon and swap counts:
woke=4326, revs=4326, scans=801265, obscans=621417, anscans=0
busy=357, freed=0, reactivate=145609, deactivate=629768
pageouts=0, pending=0, nswget=0
nswapdev=1, nanon=75911, nanonneeded=75911 nfreeanon=74972
swpages=65759, swpginuse=0, swpgonly=0 paging=0
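As a quick sanity check on the numbers above, this small Python sketch converts the page counts to KiB and compares free pages with the free-target. The values are copied from the "show uvmexp" output; the function names are hypothetical, and treating "free below free-target" as the sign of memory pressure is a simplifying assumption.

```python
# Values copied from the "show uvmexp" output in this report.
PAGE_SIZE = 8192
UVMEXP = {
    "total": 10872,
    "active": 1036,
    "inactive": 73,
    "wired": 44,
    "free": 7809,
    "freemin": 32,
    "free_target": 42,
}


def pages_to_kib(pages, page_size=PAGE_SIZE):
    """Convert a page count to KiB."""
    return pages * page_size // 1024


def is_short_of_memory(stats):
    """Assumed criterion: free pages have dropped below the free target."""
    return stats["free"] < stats["free_target"]


if __name__ == "__main__":
    print("total KiB:", pages_to_kib(UVMEXP["total"]))
    print("free KiB: ", pages_to_kib(UVMEXP["free"]))
    print("short of memory:", is_short_of_memory(UVMEXP))
```

With 7809 of 10872 pages free and no swap in use, the hang does not look like plain memory exhaustion.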
disklabel:
type: unknown
disk: BSD
label:
flags:
bytes/sector: 512
sectors/track: 133
tracks/cylinder: 27
sectors/cylinder: 3591
cylinders: 4924
total sectors: 17682084
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize cpg/sgs]
a: 2100735 0 4.2BSD 1024 8192 16 # (Cyl. 0 - 584)
b: 1052163 2100735 swap # (Cyl. 585 - 877)
c: 17682084 0 unused 0 0 # (Cyl. 0 - 4923)
d: 1048572 3152898 4.2BSD 1024 8192 16 # (Cyl. 878 - 1169)
e: 4194288 4201470 4.4LFS 1024 8192 7 # (Cyl. 1170 - 2337)
f: 9286326 8395758 4.2BSD 1024 8192 16 # (Cyl. 2338 - 4923)
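To rule out a partition layout problem, the following Python sketch checks that the partitions in the label above are contiguous and end exactly at the total sector count. Sizes and offsets are copied from the disklabel; the "c" whole-disk partition is left out on purpose, and the function name is made up for illustration.

```python
# name: (size, offset) in sectors, copied from the disklabel in this report.
# Partition "c" (the whole disk) is intentionally omitted.
PARTITIONS = {
    "a": (2100735, 0),
    "b": (1052163, 2100735),
    "d": (1048572, 3152898),
    "e": (4194288, 4201470),
    "f": (9286326, 8395758),
}
TOTAL_SECTORS = 17682084


def find_layout_problems(parts, total):
    """Return a list of gaps or overlaps between consecutive partitions."""
    problems = []
    expected = 0
    # Walk the partitions in order of their starting offset.
    for name, (size, offset) in sorted(parts.items(), key=lambda kv: kv[1][1]):
        if offset < expected:
            problems.append(f"{name}: overlaps previous partition by {expected - offset} sectors")
        elif offset > expected:
            problems.append(f"{name}: gap of {offset - expected} sectors before it")
        expected = offset + size
    if expected != total:
        problems.append(f"last partition ends at {expected}, label says {total}")
    return problems


if __name__ == "__main__":
    print(find_layout_problems(PARTITIONS, TOTAL_SECTORS) or "layout is contiguous")
```

For this label the check finds no gaps or overlaps, so the lfs partition (e) does not share sectors with its neighbours.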
/etc/fstab:
/dev/sd0a / ffs rw,softdep 1 2
/dev/sd0b none swap sw 0 0
/dev/sd0d /var ffs rw,softdep 1 1
/dev/sd0e /cvs lfs rw 1 1
/dev/sd0f /pub ffs rw,softdep 1 1
dmesg:
Boot device: disk0 File and args:
NetBSD IEEE 1275 Bootblock
..>> NetBSD/sparc64 OpenFirmware Boot, Revision 1.6
>How-To-Repeat:
stress an lfs filesystem exported over nfs from a sparc64 machine, for example
by updating a cvs tree on it from an nfs client (not yet confirmed).
>Fix:
none provided.
>Release-Note:
>Audit-Trail:
>Unformatted:
>> (salo@otaku, Wed Nov 27 17:32:01 CET 2002)
loadfile: reading header
elf64_exec: Booting /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a/netbsd
4454312@0x1000000+140472@0x1800000+4053832@0x18224b8
symbols @ 0xfef7e340 90+342696+182268 start=0x1000000
chain: calling OF_chain(800000, e4b0, 1000000, fffb5a80, 18)
[ using 525896 bytes of netbsd ELF symbol table ]
console is /sbus@1f,0/zs@f,1100000:a
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 1.6K (GENERIC) #0: Wed Nov 27 19:14:56 CET 2002
salo@otaku:/opt/src/obj/sys/arch/sparc64/compile/GENERIC
total memory = 98304 KB
avail memory = 80840 KB
using 627 buffers containing 5016 KB of memory
bootpath: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0
mainbus0 (root): SUNW,Ultra-1
cpu0 at mainbus0: SUNW,UltraSPARC @ 166.989 MHz, version 0 FPU
cpu0: 32K instruction (32 b/l), 16K data (32 b/l), 512K external (64 b/l)
timer0 at mainbus0 addr 0xfffc7c00 irq vectors 7f0 and 7f1
sbus0 at mainbus0 addr 0xfffcc000: clock = 25 MHz
DVMA map: ff800000 to ffffe000
IOTSB: 776000 to 778000
audiocs0 at sbus0 slot 13 offset 0xc000000 vector 24 ipl 13: CS4231A
audio0 at audiocs0: full duplex
auxio0 at sbus0 slot 15 offset 0x1900000
flashprom at sbus0 slot 15 offset 0x0 not configured
SUNW,fdtwo at sbus0 slot 15 offset 0x1400000 vector 29 ipl 11 not configured
clock0 at sbus0 slot 15 offset 0x1200000: mk48t59: hostid 8086f6f0
zs0 at sbus0 slot 15 offset 0x1100000 vector 28 ipl 12 softpri 6
zstty0 at zs0 channel 0 (console i/o)
zstty1 at zs0 channel 1
zs1 at sbus0 slot 15 offset 0x1000000 vector 28 ipl 12 softpri 6
zstty2 at zs1 channel 0
kbd0 at zstty2
zstty3 at zs1 channel 1
ms0 at zstty3
sc at sbus0 slot 15 offset 0x1300000 not configured
SUNW,pll at sbus0 slot 15 offset 0x1304000 not configured
dma0 at sbus0 slot 14 offset 0x8400000: dma rev 2
esp0 at dma0 slot 14 offset 0x8800000 vector 20 ipl 3: ESP200, 40MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
ledma0 at sbus0 slot 14 offset 0x8400010: dma rev 2
le0 at ledma0 slot 14 offset 0x8c00000 vector 21 ipl 6: address 08:00:20:86:f6:f0
le0: 8 receive buffers, 2 transmit buffers
bpp0 at sbus0 slot 14 offset 0xc800000 vector 22 ipl 2: dma rev 2
cgsix0 at sbus0 slot 2 offset 0x0 vector 5 ipl 5: SUNW,501-2325, 1152 x 900, rev 11
cgsix0: attached to /dev/fb
pcons at mainbus0 not configured
Kernelized RAIDframe activated
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <IBM, DDRS39130SUN9.0G, S98E> disk fixed
sd0: 8637 MB, 4926 cyl, 27 head, 133 sec, 512 bytes/sect x 17689267 sectors
sd0: sync (100.0ns offset 15), 8-bit (10.000MB/s) transfers
sd1 at scsibus0 target 1 lun 0: <IBM, DCAS32160SUN2.1G, S65A> disk fixed
sd1: 2063 MB, 8188 cyl, 3 head, 172 sec, 512 bytes/sect x 4226725 sectors
sd1: sync (100.0ns offset 15), 8-bit (10.000MB/s) transfers
cd0 at scsibus0 target 6 lun 0: <TOSHIBA, XM5701TASUN12XCD, 0997> cdrom removable
cd0: sync (100.0ns offset 8), 8-bit (10.000MB/s) transfers
root on sd0a dumps on sd0b
root file system type: ffs