Subject: Re: 3.0_BETA I/O hang
To: Bill Studenmund <wrstuden@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 04/15/2005 15:02:24
On Thu, Apr 14, 2005 at 09:36:25AM -0700, Bill Studenmund wrote:
> On Thu, Apr 14, 2005 at 12:20:40PM +0200, Manuel Bouyer wrote:
> >
> > The mount point causing the problem is /domains. It contains only 2 large
> > files (one 6GB, one 16GB). I was writing to the 16GB one (create, not
> > overwrite) when this happened. The process creating the file is waiting on
> > uvn_fp2:
>
> I think that's the main culprit. I think the other nodes are all piled up
> on it in one way or another.
Yes, most probably.
>
> > mooney:/#ps axl |grep 17063
> > 0 17063 12620 0 -18 0 80 4 uvn_fp2 DW+ ttyp2 3:34.89 /tmp/mkfile
> > Others are stuck on vnlock:
> > mooney:/#ps axl | grep vnlock
> > 0 21601 13819 0 -2 0 76 4 vnlock DW+ ttyp3 0:00.01 ls -l
> > 0 16715 15220 0 -2 0 936 4 vnlock DW+ ttyp5 0:00.10 -csh (tcsh)
> > 0 21354 18277 0 -2 0 24 4 vnlock DW ttyp9 0:00.04 umount -f /do
> > 0 21987 18277 0 -2 0 56 4 vnlock DW ttyp9 0:00.01 df -k
> >
> > I can read from /dev/raid2d without problems,
> > so it's not the underlying device which is stuck. The box keeps running
> > fine, except for accesses to /domains.
> >
> > Any idea what could cause this? Has anyone tried to create a file larger
> > than 16GB already? This filesystem uses 32k blocks/4k fragments.
>
> No, but I've seen this on occasion. For me, it happens when I'm getting a
> crash dump of a multi-threaded app I'm working on. Sometimes crash dumps
> tickle it, sometimes they don't. I have had days where every core works,
> and days where every core does this.
>
> The problem is that a page has been marked busy, and genfs_getpages is
> waiting for it to unbusy. I expect that either we lost an unbusy, or we
> have somehow already busied the pages ourselves and thus are deadlocked
> on ourselves.
My (very simple) application uses O_WRONLY | O_SYNC, if that matters. Does
core dumping also use synchronous I/O?
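
For anyone who wants to try to reproduce this, here is a minimal sketch of
that kind of writer. It is not the actual /tmp/mkfile source: the function
name write_sync_file, the zero-filled buffer, and the O_CREAT | O_TRUNC flags
and 0644 mode are assumptions added for illustration. Only O_WRONLY | O_SYNC
comes from the description above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real mkfile): create `path` and write
 * `nchunks` chunks of `chunk` zero bytes each, using synchronous writes.
 * Returns the total number of bytes written, or -1 on any error.
 */
off_t
write_sync_file(const char *path, size_t chunk, size_t nchunks)
{
	char *buf;
	off_t total = 0;
	int fd;

	assert(chunk > 0);

	buf = malloc(chunk);
	if (buf == NULL)
		return -1;
	memset(buf, 0, chunk);

	/* O_CREAT | O_TRUNC and the mode are assumptions; O_SYNC is the point. */
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
	if (fd == -1) {
		free(buf);
		return -1;
	}

	for (size_t i = 0; i < nchunks; i++) {
		size_t done = 0;

		/* write(2) may be short; loop until the whole chunk is out. */
		while (done < chunk) {
			ssize_t n = write(fd, buf + done, chunk - done);
			if (n <= 0) {
				close(fd);
				free(buf);
				return -1;
			}
			done += (size_t)n;
		}
		total += (off_t)chunk;
	}

	free(buf);
	if (close(fd) == -1)
		return -1;
	return total;
}
```

Calling it with a 32k chunk and enough chunks to go past 16GB on the affected
filesystem would approximate the workload described here. Whether that is
enough to trigger the hang is untested.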
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference