Subject: Re: lock bug in getnewvnode, or uvm_km_kmemalloc/uvm_map ?
To: Andrew Brown <atatat@atatdot.net>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 11/21/2002 11:50:47
On Thu, 21 Nov 2002, Andrew Brown wrote:
> >I'm confused. How does that relate to not wanting to sleep with raidframe
> >locks held?
>
> this all started like this:
>
> >trying to backport the raidrame lock fixes to 1.6, I ran into a panic
> >at raid autoconfig time, because pool_get(PR_WAITOK) is called with a
> >simple_lock held.
> >
> >The call trace is:
> >getnewvnode() (the simple_lock is grabbed in getnewvnode)
> >pool_get(PR_NOWAIT)
> >pool_allocator_alloc()
> >pool_page_alloc_nointr()
> >uvm_km_alloc_poolpage1()
> >uvm_km_kmemalloc()
> >uvm_map()
> >pool_get(PR_WAITOK)
> >
> >The bug is, although pool_get() was initially called PR_NOWAIT, the pool
> >memory allocator will call pool_get(PR_WAITOK). This is triggered by
> >raidframe and getnewvnode(), but I think it's not specific to it, and
> >I think it's still present in current.
> >
> >uvm_km_kmemalloc() is properly called with UVM_KMF_NOWAIT.
> >But uvm_map() is called with flags stripped to UVM_KMF_TRYLOCK, so
> >UVM_KMF_NOWAIT is lost here.
> >Second problem, uvm_mapent_alloc() called from uvm_map() will call
> >pool_get(PR_WAITOK) unconditionally. It seems to me that uvm_map() was not
> >designed to be called from a context where we can sleep, but I may be
> >wrong.
>
> i can't say for sure whether uvm_map() was designed to be called from
> a context where we can sleep, but these two things struck me when i
> saw manuel's patch.
>
> (1) amap_extend() is only called in process context, meaning that we
> *can* sleep.
Uhm, no. pool_get() was called with PR_NOWAIT set. Doesn't matter if we
have process around, we can't sleep. :-)
> (2) raid autoconfig takes place (correct me if i'm wrong here) at the
> end of the boot process, but before init is started, so it's *not* a
> process context.
We do have processes at that point, though. There is some magic so that
riadframe can spawn processes, yet we get init to still be process 1.
Jason did it.
> i was trying to say that the two things are mutually exclusive, and
> that manuel should not expend the effort to "fix" amap_extend() for a
> context in which it will never be used.
The problem though is that there are times, even when we're in full
multi-user, when we make calls that can't sleep.
Take care,
Bill