Subject: Re: Some more 'D' evidence.
To: None <thorpej@nas.nasa.gov>
From: Anders Magnusson <ragge@ludd.luth.se>
List: port-sparc
Date: 04/08/1996 18:28:45
>
> On Sun, 7 Apr 1996 10:45:34 -0400 (EDT)
> David Gilbert <dgilbert@jaywon.pci.on.ca> wrote:
>
> > I have found that if I get a process stuck in 'D', then I can
> > produce more such processes by attempting to read the directory in
> > which that process is attempting to read files. I have been able to
> > duplicate this where I had compiles running which I could definitely
> > track down.
>
> Smells like a vnode locking problem, maybe? That could cause the hangs
> (not really hangs, but enough like one to call it that) people have
> observed at reboot (after the "syncing disks...").
>
> > Interestingly, nfsd processes will also get stuck in this
> > manner. I don't know if this is interesting, or not.
>
> Hmm, though, I could swear this came up before NFSv3 went in, but I could
> be wrong...
>
We have had a couple of very similar problems. On a rather heavily loaded
mail server (> 8000 mails/day) here running NetBSD/sparc, sendmail
sometimes hangs waiting on lockf and is unable to send some
mail. (This is the outgoing mail queue.) It is on an NFS-exported
mail partition, and the problem appeared when NFSv3 was imported into
the tree. Our solution was to compile a sendmail that uses dot-locking
instead of lockf.
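
For readers unfamiliar with the term, dot-locking means creating a
"<file>.lock" file next to the file being protected, rather than asking
the kernel (and, over NFS, the lock daemon) for a lock with lockf. The
following is only a minimal sketch of the general technique, not
sendmail's actual code: the function names dotlock_acquire and
dotlock_release, the ".lock" suffix and the temporary file naming are
all assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try once to take "<path>.lock".
 * Returns 0 if we now hold the lock, -1 if it is already held or on error.
 */
int dotlock_acquire(const char *path)
{
    char lock[1024], tmp[1024];
    struct stat st;
    int fd, n, ok;

    n = snprintf(lock, sizeof lock, "%s.lock", path);
    if (n < 0 || n >= (int)sizeof lock)
        return -1;

    /* Assumed naming scheme. With several NFS clients the hostname
     * should be part of the name too, since pids are only unique per host. */
    n = snprintf(tmp, sizeof tmp, "%s.%ld.tmp", path, (long)getpid());
    if (n < 0 || n >= (int)sizeof tmp)
        return -1;

    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* link() fails if the lock file already exists. Its return value can
     * be misleading over NFS (a lost reply), so check the link count. */
    (void)link(tmp, lock);
    ok = (stat(tmp, &st) == 0 && st.st_nlink == 2);

    unlink(tmp);            /* the temporary name is no longer needed */
    return ok ? 0 : -1;
}

/* Hypothetical helper: drop the lock by removing "<path>.lock". */
int dotlock_release(const char *path)
{
    char lock[1024];
    int n = snprintf(lock, sizeof lock, "%s.lock", path);

    if (n < 0 || n >= (int)sizeof lock)
        return -1;
    return unlink(lock);
}
```

A caller would typically retry dotlock_acquire with a short sleep between
attempts and give up after a timeout. The sketch does not deal with stale
lock files left behind by a crashed process.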
-- Ragge