re: Is pmax alive?

To: matthew green <mrg%eterna.com.au@localhost>
Subject: re: Is pmax alive?
From: "Maciej W. Rozycki" <macro%linux-mips.org@localhost>
Date: Sat, 7 May 2016 23:23:10 +0100 (BST)

On Sun, 8 May 2016, matthew green wrote:

> can you diagnose the addresses in the trap message?  eg, the above
> post has these lines:
> 
> > pid 0(system): trap: cpu0, address error (load or I-fetch) in kernel mode
> > status=0x80010, cause=0x30000010, epc=0x8000001c, vaddr=0xdeadbeef
> > tf=0x8056cce8 ksp=0x8056cd88 ra=0x800b0120 ppl=0
> 
> can you open the kernel in gdb and try to find out where 0x8000001c
> is in the kernel object?  also 0x800b0120.

 It won't be in the kernel image, I suppose, at least not directly -- 
0x8000001c is within the TLB Refill exception handler, which I'd expect 
to be installed only at run time.  Even if you track the assembly 
fragment down, it won't tell you anything, because you've got a nested 
exception here -- you need to track down what caused the TLB Refill 
exception in the first place, not the Address Error exception within 
it.  Unfortunately, on the R3000 processor a nested exception overwrites 
the EPC register, so the original value has been lost.
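
 As a cross-check on the trap message, here is a minimal Python sketch 
that decodes the Cause value and tests whether EPC lies inside the R3000 
UTLB miss vector.  The helper names are made up for illustration and are 
not NetBSD code; the sketch assumes the usual R3000 layout (ExcCode 
starting at Cause bit 2, BD in bit 31, UTLB miss vector at 0x80000000, 
general exception vector at 0x80000080).

```python
# Hypothetical helpers for reading an R3000 trap message; not NetBSD code.

EXC_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU", 12: "Ov",
}

UTLB_VECTOR = 0x80000000      # UTLB miss (TLB Refill) handler starts here
GENERAL_VECTOR = 0x80000080   # all other exceptions enter here


def decode_cause(cause):
    """Return (ExcCode mnemonic, branch-delay flag) for a Cause value."""
    exc_code = (cause >> 2) & 0x1F   # assumed mask; upper bit is zero here
    in_delay_slot = bool((cause >> 31) & 1)
    return EXC_NAMES.get(exc_code, "code %d" % exc_code), in_delay_slot


def in_utlb_vector(epc):
    """True if epc lies between the UTLB and general exception vectors."""
    return UTLB_VECTOR <= epc < GENERAL_VECTOR


print(decode_cause(0x30000010))    # ('AdEL', False)
print(in_utlb_vector(0x8000001C))  # True
print(in_utlb_vector(0x800B0120))  # False
```

 So the reported exception is an Address Error on load (AdEL), taken 
while executing inside the UTLB miss vector, whereas $ra points into 
ordinary kernel text.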

 So I think tracing the cause from $ra (0x800b0120) will be more 
productive -- there are three possibilities:

1. This is in code called from $ra-8 -- you can check what the JAL or BAL 
   instruction there calls.

2. This is in code after a return to $ra -- you can check what follows.

3. This is in code reached via a sibling call aka tail jump -- tough!

Fortunately, sibling calls are not that common, and you can rebuild the 
code asking the compiler not to produce them (-fno-optimize-sibling-calls).
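
 For the first possibility, the instruction word at $ra-8 can be decoded 
by hand.  A minimal Python sketch follows; the function name and the 
sample instruction words are hypothetical, and the real word has to be 
read from the kernel itself (e.g. with gdb's x/i command at 0x800b0118).

```python
# Hypothetical helper: classify the instruction word found at $ra-8.

def call_target(word, pc):
    """Return (mnemonic, target) for the instruction `word` located at `pc`.

    target is None when it depends on a register value (JALR) or when the
    word is not one of the call instructions handled here.
    """
    opcode = word >> 26
    delay_slot = (pc + 4) & 0xFFFFFFFF
    if opcode == 0x03:                                   # JAL
        index = word & 0x03FFFFFF
        return "jal", (delay_slot & 0xF0000000) | (index << 2)
    if opcode == 0x01 and (word >> 16) & 0x1F == 0x11:   # BGEZAL
        offset = word & 0xFFFF
        if offset & 0x8000:                              # sign-extend
            offset -= 0x10000
        # BAL is the assembler idiom for BGEZAL with rs = $0
        name = "bal" if (word >> 21) & 0x1F == 0 else "bgezal"
        return name, (delay_slot + (offset << 2)) & 0xFFFFFFFF
    if opcode == 0x00 and word & 0x3F == 0x09:           # JALR
        return "jalr", None
    return "not a call", None


ra = 0x800B0120
# Made-up instruction words, for illustration only:
print(call_target(0x0C02C800, ra - 8))   # ('jal', 2148212736), i.e. 0x800b2000
print(call_target(0x04110010, ra - 8))   # ('bal', 2148204892), i.e. 0x800b015c
print(call_target(0x0320F809, ra - 8))   # ('jalr', None)
```

 If the word turns out to be a JALR, the target is only known at run 
time, and if it is not a call at all, $ra was most likely set by an 
earlier call and you are in case 2 or 3.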

> also, the vaddr=0xdeadbeef can only come from a handful of places
> in the pmax kernel.  none of them are in MD code, these are the
> only ones i can see it could be:
> 
> sys/uvm/uvm_page.c:     pg->uobject = (void *)0xdeadbeef;
> sys/uvm/uvm_page.c:     pg->uanon = (void *)0xdeadbeef;
> sys/uvm/uvm_pglist.c:           pg->uobject = (void *)0xdeadbeef;
> sys/uvm/uvm_pglist.c:           pg->uanon = (void *)0xdeadbeef;
> 
> could you try adjusting each of these to see if they are it?

 That would explain it a bit -- the TLB Refill exception was triggered 
for an address mapping to a page directory which has not been set up.  
This is early on, so it's no surprise that virtual mappings don't work.  
But you need to track down the place in the kernel which triggered this 
exception to find out more.
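
 The faulting address itself can also be classified mechanically.  A 
small Python sketch, assuming the standard 32-bit MIPS segment layout; 
the helper names are made up for illustration:

```python
# Hypothetical helpers: classify a 32-bit MIPS virtual address.

SEGMENTS = [
    (0x00000000, 0x7FFFFFFF, "kuseg (mapped)"),
    (0x80000000, 0x9FFFFFFF, "kseg0 (unmapped, cached)"),
    (0xA0000000, 0xBFFFFFFF, "kseg1 (unmapped, uncached)"),
    (0xC0000000, 0xFFFFFFFF, "kseg2 (mapped)"),
]


def segment_of(vaddr):
    """Return the name of the segment containing vaddr."""
    for low, high, name in SEGMENTS:
        if low <= vaddr <= high:
            return name
    raise ValueError("not a 32-bit address: %#x" % vaddr)


def is_aligned(vaddr, size):
    """True if vaddr is naturally aligned for an access of `size` bytes."""
    return vaddr % size == 0


print(segment_of(0xDEADBEEF))     # kseg2 (mapped)
print(is_aligned(0xDEADBEEF, 4))  # False
print(segment_of(0x800B0120))     # kseg0 (unmapped, cached)
```

 A word or halfword load from 0xdeadbeef is misaligned, which on its own 
is enough to raise an Address Error in kernel mode, before any TLB lookup 
for that address comes into play.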

 Actually, as a debug hack I think you could enable the FPU (i.e. set the 
Status.CU1 bit) early on, then, in the TLB Refill exception handler, stash 
the original EPC away in an FPR before further processing, and finally 
dump it in the Address Error exception handler.  That would be a faster 
way to track the origin down than starting from $ra, I think.

 NB I'm not a NetBSD expert -- this is a generic analysis based solely on 
the MIPS architecture and information provided here.  Hope this helps 
anyway.

  Maciej
