On Tue, 24 Feb 2015, BERTRAND Joël wrote:
Eduardo Horvath wrote:
On Tue, 24 Feb 2015, BERTRAND Joël wrote:
matthew green wrote:
Hm. From what I remember, f000xxxx is inside OBP.
that's correct :-)
Instead of randomly swapping out hardware you really should try to
diagnose the problem. I'd turn on ddb and traptrace in the kernel and
examine the contents of the traptrace buffer after the panic. That should
tell us the sequence of traps that caused the panic.
FWIW, traptrace never was updated for SMP.
Is there any hope of quickly having a fix to get traptrace output in syslog?
I'm trying to reproduce this bug on a Blade 2000 I have at home, without any
success.
Putting traptrace back in is not trivial. It basically involves taking
all of the traptrace code that was removed in locore.s version 1.214,
enhancing it for SMP, and reinserting it into locore.s. How good are your
SPARC assembly language skills?
I haven't written sparc assembly for a very long time (and only on
sparc32...) :-(
I can try to do something, but I'm not sure I have the required knowledge
to do that without help.
I can give you some advice, but I don't have the time or easy access to
the hardware to re-implement traptrace.
Take a look at the diffs between locore.s versions 1.213 and 1.214. Some
of that code needs to be added back. The first thing to do is rewrite
this TRACEIT macro:
-#define TRACEIT(tt,r3,r4,r2,r6,r7) \
- set trap_trace, r2; \
- lduw [r2+TRACEDIS], r4; \
- brnz,pn r4, 1f; \
- lduw [r2+TRACEPTR], r3; \
- rdpr %tl, r4; \
- cmp r4, 1; \
- sllx r4, 13, r4; \
- rdpr %pil, r6; \
- or r4, %g5, r4; \
- mov %g0, %g5; \
- andncc r3, (TRACESIZ-1), %g0; /* At end of buffer? */ \
- sllx r6, 9, r6; \
- or r6, r4, r4; \
- movnz %icc, %g0, r3; /* Wrap buffer if needed */ \
- rdpr %tstate, r6; \
- rdpr %tpc, r7; \
- sth r4, [r2+r3]; \
- inc 2, r3; \
- sth %g5, [r2+r3]; \
- inc 2, r3; \
- stw r6, [r2+r3]; \
- inc 4, r3; \
- stw %sp, [r2+r3]; \
- inc 4, r3; \
- stw r7, [r2+r3]; \
- inc 4, r3; \
- mov TLB_TAG_ACCESS, r7; \
- ldxa [r7] ASI_DMMU, r7; \
- stw r7, [r2+r3]; \
- inc 4, r3; \
- stw r3, [r2+TRACEPTR]; \
-1:
What the code does is check the contents of TRACEDIS. If it's zero, it
loads TRACEPTR, writes a bunch of stuff to the buffer, and updates
TRACEPTR.
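To make the entry layout concrete, here is a rough C view of what those stores write, read off the removed macro. The struct and function names are hypothetical and not from locore.s, and I am assuming %g5 carries the trap type on entry, which the macro itself does not say. The second halfword is stored after "mov %g0, %g5", so it is always zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of one traptrace entry; the field order follows
 * the sth/stw sequence in the removed TRACEIT macro. */
struct trace_entry {
    uint16_t tag;       /* (%tl << 13) | (%pil << 9) | %g5 */
    uint16_t zero;      /* %g5 after "mov %g0, %g5", so always 0 */
    uint32_t tstate;    /* low 32 bits of %tstate */
    uint32_t sp;        /* low 32 bits of %sp */
    uint32_t tpc;       /* low 32 bits of %tpc */
    uint32_t tagaccess; /* low 32 bits of the D-MMU tag access register */
};

/* 2 + 2 + 4 + 4 + 4 + 4 bytes: the pointer advances by 20 per entry. */
static_assert(sizeof(struct trace_entry) == 20, "entry must be 20 bytes");

/* Build the first halfword the way the sllx/or sequence does. */
static uint16_t trace_tag(unsigned tl, unsigned pil, unsigned g5)
{
    return (uint16_t)((tl << 13) | (pil << 9) | g5);
}
```

With this layout the tag access word sits at offset 16, which is the kind of constant the rewritten stores would use.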
To simplify adding fields to the traptrace structure I wrote the code as a
series of stores and pointer increments. Instead of that, it needs to be
written as a single pointer increment followed by the store operations.
Then get rid of the last instruction that updates TRACEPTR, instead
creating a spinloop at the beginning that looks something like this:
- lduw [r2+TRACEPTR], r3; \
+0:
+ add r2, TRACEPTR, r4;
+ lduw [r4], r3; /* Load the offset of the next slot */
+ add r3, ENTRY_SIZE /* <- Needs to be calculated */, r6; /* Allocate */
+ cas [r4], r3, r6; /* If [r4] still equals r3, store r6; r6 gets the old [r4] */
+ cmp r3, r6;
+ bne,pn %icc, 0b; /* Oops.. spin */
+ add r2, r3, r3 /* r3 now points to the entry. */
All the register+register stores ([r2+r3]) need to be rewritten as r3+constant.
After that, traceit and traceitwin should be able to use the TRACEIT
macro.
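For anyone more comfortable in C than in SPARC assembly, the reserve-then-fill scheme can be modelled roughly as below. This is only a sketch using C11 atomics in place of cas. TRACESIZ, ENTRY_SIZE, the struct, and the function names are assumptions for illustration, not the kernel's definitions. The wrap test is carried over from the old macro and is not part of the spinloop fragment above, and the slack at the end of the buffer is my own addition so that an entry starting just below TRACESIZ cannot overrun.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define TRACESIZ   0x1000u /* assumed buffer size in bytes (power of two) */
#define ENTRY_SIZE 20u     /* assumed: 2+2+4+4+4+4 bytes per entry */

struct trap_trace_model {
    _Atomic uint32_t traceptr;          /* plays the role of TRACEPTR */
    uint8_t buf[TRACESIZ + ENTRY_SIZE]; /* slack so the last entry fits */
};

/* Reserve one entry and return its byte offset into buf. */
static uint32_t trace_reserve(struct trap_trace_model *t)
{
    uint32_t old = atomic_load(&t->traceptr);
    uint32_t start;

    do {
        /* Wrap to 0 once the offset has run past the buffer. */
        start = (old & ~(TRACESIZ - 1)) ? 0 : old;
        /* On failure the CAS reloads 'old' and we retry (the spin). */
    } while (!atomic_compare_exchange_weak(&t->traceptr, &old,
                                           start + ENTRY_SIZE));
    return start;
}

/* Fill a reserved entry with stores at fixed offsets from its base. */
static void trace_record(struct trap_trace_model *t, uint16_t tag,
                         uint32_t tstate, uint32_t sp,
                         uint32_t tpc, uint32_t tagaccess)
{
    uint32_t off = trace_reserve(t);
    uint8_t *e = t->buf + off;
    uint16_t zero = 0;

    assert(off + ENTRY_SIZE <= sizeof t->buf);
    memcpy(e + 0, &tag, 2);
    memcpy(e + 2, &zero, 2);
    memcpy(e + 4, &tstate, 4);
    memcpy(e + 8, &sp, 4);
    memcpy(e + 12, &tpc, 4);
    memcpy(e + 16, &tagaccess, 4);
}
```

Because the slot is claimed before anything is written, two CPUs can never be handed the same offset. That is why the final store to TRACEPTR goes away.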
Hm. There may be some reason why I implemented traceit and traceitwin
with inline code rather than the TRACEIT macro, but I don't recall right
now.