Port-sparc64 archive
Re: Ultrasparc III+ kernel panic
On Wed, 25 Feb 2015, BERTRAND Joël wrote:
> BERTRAND Joël wrote:
> > Eduardo Horvath wrote:
> > > On Tue, 24 Feb 2015, BERTRAND Joël wrote:
> > >
> > > > > matthew green wrote:
> > > > > > Hm. From what I remember, f000xxxx is inside OBP.
> > > > >
> > > > > that's correct :-)
> > > > >
> > > > > > > Instead of randomly swapping out hardware you really should try to
> > > > > > > diagnose the problem. I'd turn on ddb and traptrace in the kernel and
> > > > > > > examine the contents of the traptrace buffer after the panic. That
> > > > > > > should tell us the sequence of traps that caused the panic.
> > > > >
> > > > > FWIW, traptrace never was updated for SMP.
> > > > >
> > > >
> > > > > Is there any hope of a quick fix so that traptrace output can be
> > > > > obtained in syslog? I'm trying to reproduce this bug on a Blade 2000
> > > > > I have at home, without any success.
> > >
> > > > Putting traptrace back in is not trivial. It basically involves taking
> > > > all of the traptrace code that was removed in locore.s version 1.214,
> > > > enhancing it for SMP, and reinserting it into locore.s. How good are your
> > > > SPARC assembly language skills?
> >
> > I haven't written sparc assembly for a very long time (and only on
> > sparc32...) :-(
> >
> > > I can try to do something, but I'm not sure I have the required
> > > knowledge to do that without help.
> >
> > Best regards,
> >
> > JKB
>
> Another one:
>
> Feb 25 13:03:33 legendre /netbsd: trap type 0x34: cpu 0, pc=f0008380text_access_fault: pc=5ac99cd8 va=5ac98000
> Feb 25 13:03:33 legendre /netbsd: npc=f0008384 pstate=0xffffffff88820006<PRIV,IE>
> Feb 25 13:03:33 legendre /netbsd: Skipping crash dump on recursive panic
> Feb 25 13:03:33 legendre /netbsd: panic: kernel fault
> Feb 25 13:03:33 legendre /netbsd: cpu1: Begin traceback...
> Feb 25 13:03:33 legendre /netbsd: cpu1: End traceback...
> Feb 25 13:03:33 legendre /netbsd: cpu0: shutting down
> Feb 25 13:03:33 legendre /netbsd: cpu1: rebooting
> Feb 25 13:03:33 legendre /netbsd:
>
> If I remember correctly, trap 0x34 is triggered when the kernel tries to
> access unaligned memory. I found some messages in the mailing list archive
> about trap 0x34 in ipfilter, and the system that panics often does use
> ipfilter:
Yes:
#define T_ALIGN 0x034 /* (10) address not properly aligned */
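For reference, "properly aligned" just means the address is a multiple of the access size. A minimal sketch of that check, assuming power-of-two sizes; is_aligned() is a made-up helper name for illustration, not something from the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): true if addr is a
 * multiple of size.  Assumes size is a power of two (1, 2, 4, 8, ...),
 * so the low bits of an aligned address are all zero.
 */
static bool
is_aligned(uint64_t addr, uint64_t size)
{
	return (addr & (size - 1)) == 0;
}
```

By that check, pc=f0008380 and npc=f0008384 from the log are both 4-byte aligned.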
Here's the code in trap.c that's faulting:
    {
        char sb[sizeof(PSTATE_BITS) + 64];

        printf("trap type 0x%x: cpu %d, pc=%lx",
            type, cpu_number(), pc);
        snprintb(sb, sizeof(sb), PSTATE_BITS,
            pstate);
        printf(" npc=%lx pstate=%s\n",
            (long)tf->tf_npc, sb);
        DEBUGGER(type, tf);
        panic("%s", type < N_TRAP_TYPES ?
            trap_type[type] : T);
    }
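For anyone not familiar with snprintb(): it renders a value followed by the names of the bits that are set, which is where the "<PRIV,IE>" in the log comes from. Below is a rough userland stand-in to show the idea. It is not the NetBSD implementation; decode_bits(), struct bitname and pstate_names are hypothetical names, and the table lists only the two flags visible in the log (the real PSTATE_BITS covers more).

```c
#include <stdio.h>

/* Hypothetical name table entry: one mask and its printable name. */
struct bitname {
	unsigned long long mask;
	const char *name;
};

/* Assumption: only the two flags seen in the log output above. */
static const struct bitname pstate_names[] = {
	{ 0x004, "PRIV" },
	{ 0x002, "IE" },
};

/*
 * Write "0x<hex>" followed by "<NAME,NAME>" for each table entry whose
 * mask is set in val.  Stops appending if the buffer would overflow.
 */
static void
decode_bits(char *buf, size_t len, unsigned long long val,
    const struct bitname *tab, size_t ntab)
{
	int n = snprintf(buf, len, "0x%llx", val);
	const char *sep = "<";
	int any = 0;
	size_t i;

	for (i = 0; i < ntab && n >= 0 && (size_t)n < len; i++) {
		if ((val & tab[i].mask) == 0)
			continue;
		n += snprintf(buf + n, len - n, "%s%s", sep, tab[i].name);
		sep = ",";
		any = 1;
	}
	if (any && n >= 0 && (size_t)n < len)
		snprintf(buf + n, len - n, ">");
}
```

Feeding it the pstate value from the log with a 64-byte buffer should reproduce the string seen in the second log line.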
The first printf() succeeds, but the snprintb() apparently does not, and
somehow it generates a text access fault... unless the text fault is on
another CPU. You might want to add the CPU number to the text_access_fault
message so the two cases can be told apart.
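Something along these lines is what I mean. This is only a userland sketch of the message format: format_text_fault() is a hypothetical name, and in the kernel the CPU number would presumably come from cpu_number() as in the trap.c snippet above, with the message going straight to printf().

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the text_access_fault message, with the
 * CPU number put in front so interleaved output from two CPUs can be
 * attributed.  Returns whatever snprintf() returns.
 */
static int
format_text_fault(char *buf, size_t len, int cpu,
    unsigned long pc, unsigned long va)
{
	return snprintf(buf, len,
	    "cpu %d: text_access_fault: pc=%lx va=%lx\n", cpu, pc, va);
}
```

With the values from the log and, say, cpu 1, that would give "cpu 1: text_access_fault: pc=5ac99cd8 va=5ac98000".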
Eduardo