Subject: Netscape - the plot thickens
To: None <port-sparc@NetBSD.ORG>
From: Greg Earle <earle@isolar.Tujunga.CA.US>
List: port-sparc
Date: 11/15/1995 00:05:58
Well, I've gotten a little bit further. Thanks to a timely hint from Theo,
I instrumented the remaining SIGILLs in trap.c. I also did the same in
machdep.c (and even in fpu.c, to cover all the bases), to see if I could
catch everything.
The weird thing is that somehow, most of the SIGILLs are still getting
through, undetected. I have no idea how this is happening.
The good news, though, is that a few times, I was able to catch it at work:
netscape-2.0beta2[pid 212]: T_RWRET read_rw failed: pc=3a44fc npc=3a4500
psr=90001081<EF,S>
This comes from instrumenting this snippet of trap():
...
#define read_rw(src, dst) \
        copyin((caddr_t)(src), (caddr_t)(dst), sizeof(struct rwindow))

        case T_RWRET:
                /*
                 * T_RWRET is a window load needed in order to rett.
                 * It simply needs the window to which tf->tf_out[6]
                 * (%sp) points. There are no user or saved windows now.
                 * Copy the one from %sp into pcb->pcb_rw[0] and set
                 * nsaved to -1. If we decide to deliver a signal on
                 * our way out, we will clear nsaved.
                 */
                if (pcb->pcb_uw || pcb->pcb_nsaved)
                        panic("trap T_RWRET 1");
                if (rwindow_debug)
                        printf("%s[%d]: rwindow: pcb<-stack: %x\n",
                            p->p_comm, p->p_pid, tf->tf_out[6]);
                if (read_rw(tf->tf_out[6], &pcb->pcb_rw[0]))
                        sigexit(p, SIGILL);
The above debug message was inserted right before the sigexit(p, SIGILL).
I unstripped a copy of the SunOS binary so I could get some symbols, and
took a look:
...
_PR_Start+0xa0: add %o2, 0x1, %o1
_PR_Start+0xa4: mov %o1, %o2
_PR_Start+0xa8: st %o2, [%o0 + 0x60]
_PR_Start+0xac: ld [%fp + 0x44], %o0
_PR_Start+0xb0: add %o0, 0x58, %o1
_PR_Start+0xb4: mov %o1, %o0
_PR_Start+0xb8: call _setjmp
_PR_Start+0xbc: nop
_PR_Start+0xc0: tst %o0 ! == PC
_PR_Start+0xc4: be _PR_Start + 0xd4 ! == NPC
_PR_Start+0xc8: nop
_PR_Start+0xcc: call _HopToadNoArgs
_PR_Start+0xd0: nop
This smells like a longjmp() returning, with the restoration of the register
windows getting botched somehow. Is there any more debugging info I should
be trying to print out?
Thanks to Theo and Charles for their suggestions.
- Greg