Port-alpha archive
Re: can't reboot after running a 5.0 kernel
On Sun, Jan 30, 2011 at 12:23:25PM +0100, Martin Husemann wrote:
> I'll have to dig through old PRs and (unfortunately not very clear) commit
> messages.
>
> Can you please file a PR? We should make up our mind if cpu_setfunc is
> supposed to call lwp_startup() or not, make sure all ports do it consistently,
> and find out if the SA compat code needs fixing (e.g. arrange for a call to
> lwp_startup by other means). Or backout all the changes to various ports
> that introduced the separate trampoline.
Hi-
I filed a PR for this, entitled:
4.0 sa threaded apps hard hang netbsd-5 and HEAD kernels on some ports
[cpu_setfunc() related]
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44500
I got some additional information in the process...
First, the hard hang occurs in mi_switch() in the following loop (I
added the debug printf):
        /*
         * We may need to spin-wait for if 'newl' is still
         * context switching on another CPU.
         */
        if (newl->l_ctxswtch != 0) {
                u_int count;
                count = SPINLOCK_BACKOFF_MIN;
                while (newl->l_ctxswtch) {
                        SPINLOCK_BACKOFF(count);
                        printf("POINTA\n"); /*XXXCDC*/
                }
        }
It just prints "POINTA" endlessly and never exits that loop. Note that
my system has only one CPU, so the case the comment describes does not
apply. Because interrupts are disabled, it is not possible to break
to DDB while you are stuck in that while() loop; your system is hung
(that's why you have to power cycle).
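To illustrate why a missing lwp_startup() call could produce exactly this
kind of hang, here is a toy userland model. It is only a sketch, and it rests
on an assumption that is not verified here: that lwp_startup() is what clears
l_ctxswtch on the LWP being switched away from when the new LWP starts through
a trampoline. All toy_* names are hypothetical, and the spin is bounded so
the demo terminates instead of hanging.

```c
#include <assert.h>

/* Toy stand-in for struct lwp: only the flag discussed above. */
struct toy_lwp {
	volatile int l_ctxswtch;	/* nonzero while "still context switching" */
};

/* ASSUMPTION: models lwp_startup() clearing the previous LWP's flag. */
static void
toy_lwp_startup(struct toy_lwp *prev)
{
	prev->l_ctxswtch = 0;
}

/* Trampoline in the style of the "yes" ports: calls the startup hook. */
static void
toy_lwp_trampoline(struct toy_lwp *prev)
{
	toy_lwp_startup(prev);
}

/* Trampoline in the style of the "no" ports: skips the startup hook. */
static void
toy_setfunc_trampoline(struct toy_lwp *prev)
{
	(void)prev;		/* flag is left untouched */
}

/*
 * Bounded version of the spin-wait in mi_switch().
 * Returns the number of spins, or -1 if 'limit' was reached
 * (which stands in for the real, unbounded hang).
 */
static int
toy_spin_wait(const struct toy_lwp *newl, int limit)
{
	int spins = 0;

	assert(limit > 0);
	while (newl->l_ctxswtch) {
		if (++spins >= limit)
			return -1;
	}
	return spins;
}

/*
 * LWP 'a' is switched out with its flag set, a new LWP starts via
 * 'trampoline', and the scheduler later tries to switch back to 'a'.
 */
static int
toy_switch_back(void (*trampoline)(struct toy_lwp *), int limit)
{
	struct toy_lwp a = { .l_ctxswtch = 1 };

	trampoline(&a);
	return toy_spin_wait(&a, limit);
}
```

In this model, toy_switch_back(toy_lwp_trampoline, 1000) returns 0 (no
spinning), while toy_switch_back(toy_setfunc_trampoline, 1000) returns -1:
on a single CPU nothing else can ever clear the flag, so the real loop would
spin forever.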
I also did a survey of some of the ports in the tree, and it looks
like some ports' cpu_setfunc() still calls lwp_startup() while other
ports (like the alpha) have been modified to not call it:
arch     cpu_setfunc calls       does it call lwp_startup? (when changed)
-------  ----------------------  ----------------------------------------
acorn26  lwp_trampoline          yes
alpha    setfunc_trampoline      no (vm_machdep.c 1.100, 2009/06/01)
arm32    lwp_trampoline          yes
hppa     setfunc_trampoline      no (vm_machdep.c 1.36, 2009/06/03)
m68k     setfunc_trampoline      no (vm_machdep.c 1.28, 2009/05/30)
mips     setfunc_trampoline      no (vm_machdep.c 1.123, 2009/05/30)
powerpc  setfunc_trampoline      no (vm_machdep.c 1.77, 2009/06/07)
sh3      lwp_setfunc_trampoline  no (never called lwp_startup?)
sparc    lwp_setfunc_trampoline  no (vm_machdep.c 1.100, 2009/05/29)
sparc64  lwp_setfunc_trampoline  no (vm_machdep.c 1.89, 2009/05/30)
x86      lwp_trampoline          yes
The "no" ports are likely to have problems with compat_sa binaries,
I think.
The most interesting ones are the sh3 (because it didn't get the change
in 2009) and the commit comment from mrg on the sparc (because it
is the earliest instance of this change, 2009/05/29):
----------------------------
revision 1.100
date: 2009/05/29 22:06:56; author: mrg; state: Exp; lines: +11 -5
fix up cpu_setfunc() as noted by uwe:
- don't call lwp_startup for cpu_setfunc() users
- introduce lwp_setfunc_trampoline instead
- no need to set the "new" lwp for setfunc
----------------------------
But I couldn't find where mrg said that uwe@netbsd noted it.
chuck