Subject: 3.0_BETA I/O hang
To: None <tech-kern@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 04/14/2005 12:20:40
Hi,
I have a 3.0_BETA system which got into a strange state: every access to
one partition (read or write) hangs, with the process stuck in disk wait.
I have this setup:
mooney:/#mount
/dev/raid0a on / type ffs (local)
/dev/raid1e on /usr type ffs (local)
/dev/raid1f on /graveur type ffs (local)
/dev/raid2e on /domains type ffs (local)
mfs:828 on /tmp type mfs (synchronous, local)
/dev/wd1e on /distrib type ffs (soft dependencies, NFS exported, local)
kernfs on /kern type kernfs (local)
pid103@mooney:/auto on /auto type nfs (hidden)
hera-ip6:/home/hera1 on /amd/hera-ip6/home/hera1 type nfs
tibre:/home/tibre1 on /amd/tibre/home/tibre1 type nfs
hera-ip6:/comptes on /amd/hera-ip6/comptes type nfs
The mount point causing the problem is /domains. It contains only 2 large
files (one 6GB, one 16GB). I was writing to the 16GB one (create, not
overwrite) when this happened. The process creating the file is waiting on
uvn_fp2:
mooney:/#ps axl |grep 17063
0 17063 12620 0 -18 0 80 4 uvn_fp2 DW+ ttyp2 3:34.89 /tmp/mkfile
Others are stuck on vnlock:
mooney:/#ps axl | grep vnlock
0 21601 13819 0 -2 0 76 4 vnlock DW+ ttyp3 0:00.01 ls -l
0 16715 15220 0 -2 0 936 4 vnlock DW+ ttyp5 0:00.10 -csh (tcsh)
0 21354 18277 0 -2 0 24 4 vnlock DW ttyp9 0:00.04 umount -f /do
0 21987 18277 0 -2 0 56 4 vnlock DW ttyp9 0:00.01 df -k
I can read from /dev/raid2d without problems,
so it's not the underlying device which is stuck. The box keeps running
fine, except for accesses to /domains.
Any idea what could cause this? Has anyone already tried to create a file
of 16GB or more? This filesystem uses 32k blocks/4k fragments.
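
For anyone who wants to try to reproduce this, here is a rough sketch of the
kind of sequential file creation involved. It is not the /tmp/mkfile program
from the ps output above (its source is not shown here); the function name
mkbigfile and the target path in the usage comment are placeholders, and the
32k write size is only chosen to match the filesystem block size.

```sh
# Hypothetical helper: create a new file by writing zero-filled 32k blocks
# sequentially. This is an assumed stand-in for the original mkfile program.
mkbigfile() {
    # $1 = path of the file to create
    # $2 = number of 32k blocks to write
    dd if=/dev/zero of="$1" bs=32k count="$2" 2>/dev/null
}

# Example usage (placeholder path); 524288 blocks of 32k is 16GB:
#   mkbigfile /domains/testfile 524288
```

Whether a plain sequential write like this is enough to trigger the hang is
an open question; it only mirrors the create-not-overwrite case described above.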
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference