Subject: Re: proposal for PR 27023
To: Chuck Silvers <chuq@chuq.com>
From: Andrey Petrov <petrov@netbsd.org>
List: port-sparc64
Date: 01/25/2005 13:51:33
On Tue, Jan 25, 2005 at 09:38:01AM -0800, Chuck Silvers wrote:
> hi,
> 
> here are my thoughts on PR 27023.  the problem is that SA context switches
> on both sparc and sparc64 need to write the contents of the register windows
> out to the user stack.  on a normal, non-SA context switch, if the stack
> isn't valid (perhaps because it was paged out), we'd just write the
> register windows to the PCB and go on.  this isn't good enough for SAs
> though, since the PCB is tied to a particular kernel LWP and the user
> thread may be run on a different LWP next.  so we really do need to get
> the register window contents back to the real user stack.
> 

I thought that while an LWP is blocked, the corresponding user-level pthread
is bound to it, and the user-level scheduler can only wait until it becomes
unblocked. From this point user-level scheduling is irrelevant, as it is
handled differently, I'd say.

> the trouble is that by the time we try to write to the user stack, we're
> already in the middle of going to sleep, and if we take a page-fault in
> this context we can end up recursing into the context-switching code,
> which is obviously non-sensical.
> 
> there are already several cases where the SA context-switching code can't
> do everything it needs to do to achieve a real SA context switch, so it
> just gives up and does a non-SA context switch instead.  I think we should
> have this new sparc case do the same thing.  the best way I've thought of
> to achieve this is to have the page-fault handler (mem_access_fault() on
> sparc, data_access_fault() on sparc64) recognize that the fault is one
> where we can't sleep, have it just fail instead of calling uvm_fault().
> "fail" in this case would be to return to the pcb_onfault address, and
> we would require that pcb_onfault be set in this case.  this would result
> in the copyout() in rwindow_save() returning some distinctive error code
> and returning that to cpu_getmcontext().  cpu_getmcontext() would
> recognize that and return it back up through a couple more functions
> to sa_switch(), which would take the error path from its call to
> sa_upcall0() and do a non-SA mi_switch().
> 

Seems right; all we lose is an upcall, and I'd argue that in such conditions
a postponed upcall is worse than none. I might be wrong, as I don't remember
whether the user-level scheduler depends on it.
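
As a rough illustration of the error path proposed above, a minimal
user-space model in C might look like the following. Every name in it
(struct fake_lwp, model_fault(), MODEL_ENOSLEEP and so on) is a hypothetical
stand-in and not actual kernel code; the real error code and the functions
between cpu_getmcontext() and sa_switch() are not specified in this thread,
so the call chain is collapsed to one function per step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "distinctive error code"; the real value
 * is not given in the proposal. */
#define MODEL_ENOSLEEP 1000

enum switch_kind { SWITCH_SA, SWITCH_PLAIN };

/* Hypothetical model of the state the fault handler would look at. */
struct fake_lwp {
	bool stack_resident;	/* user stack page currently mapped? */
	bool cannot_sleep;	/* true while we are in the middle of switching */
	bool onfault_set;	/* models pcb_onfault being non-NULL */
};

/* Models the page-fault handler reached from the copyout of the windows. */
static int
model_fault(struct fake_lwp *l)
{
	if (l->stack_resident)
		return 0;		/* no fault at all */
	if (l->cannot_sleep) {
		/* The proposal requires pcb_onfault to be set here. */
		assert(l->onfault_set);
		return MODEL_ENOSLEEP;	/* fail instead of paging in */
	}
	/* Normal case: the pager may sleep and bring the page in. */
	l->stack_resident = true;
	return 0;
}

/* Each layer just hands the error back up, as described in the proposal. */
static int model_rwindow_save(struct fake_lwp *l) { return model_fault(l); }
static int model_getmcontext(struct fake_lwp *l) { return model_rwindow_save(l); }
static int model_upcall(struct fake_lwp *l) { return model_getmcontext(l); }

/* Models the switch: on any error, fall back to a non-SA switch. */
static enum switch_kind
model_sa_switch(struct fake_lwp *l)
{
	int error;

	l->cannot_sleep = true;
	error = model_upcall(l);
	l->cannot_sleep = false;
	return error != 0 ? SWITCH_PLAIN : SWITCH_SA;
}
```

In this model the only visible cost of the failure case is that the switch
degrades to a plain one, i.e. the upcall is skipped.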

I can think of saving/restoring the register windows in the mcontext and
supporting that in cpu_set(get)mcontext, but that is somewhat heavyweight.

> 
> comments?  if there are no objections I'll go ahead and implement the above.

No objections.

	Andrey