Subject: Re: my first sparc64 panic :)
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Eduardo Horvath <eeh@turbolinux.com>
List: port-sparc
Date: 08/24/2000 10:45:44
On Thu, 24 Aug 2000, Manuel Bouyer wrote:
> No problems, I can get one in a few seconds when trying to make the tcsh
> package (in fact, it seems to happen at fork/exec time).
> ===> Extracting for tcsh-6.09.00
> trap type 0x34: pc=f12698ac npc=f1269884 pstate=ffffffff98580006<PRIV,IE>
> kernel trap 34: mem address not aligned
> Stopped in sh at pmap_enter_pv+0x19c: ldx [%o3 + 0x8], %o0
>
> db> tr
> pmap_enter(f19e8120, 27e000, 11d56000, 2, f149fcf0, 0) at pmap_enter+0x358
> uvm_fault(f1933880, 4, f149a8f0, f64382b0, 3, 278000) at uvm_fault+0xf78
> data_access_fault(6c, 27e13a, 10a688, f6445ed0, 0, 27e13a) at data_access_fault+0x488
> Ldatafault_internal(2868d0, 286310, 0, 0, 0, 0) at Ldatafault_internal+0xe0
> db>
>
>
> >
> > 1) `mach tf' to get the trapframe of the fault.
>
> db> mach tf
> Trapframe 0xf146e9c0: tstate: 0x9858000603 pc: 0xf12698ac npc: 0xf1269884
> y: 0 pil: 7 oldpil: 7 fault: 0x9858000603 kstack: 0x0 tt: 34 Globals:
> 0000000000000000 0000000011d4c000 0000000000000000 00000000f149af08
> 0000000000000000 0000000000000000 0000000000000021 0000000000000000
> outs:
> 0000000300000006 ffffffffffffe000 0000000000008eab 0000000400000005
Hm. %o3 = 0x400000005 is clearly junk, and it is the base register of the
ldx that trapped.
> fffffffffffffff1 00000000f1000000 00000000f6445151 00000000f1267708
> locals:
> 00000000f142dd68 00000000f1a5a008 00000000f142dc00 0000000000000000
> 00000000f141e2c4 00000000f12dba20 00000000f149a770 000000000027e000
> ins:
> 00000000f19e8120 000000000027e000 0000000011d56000 0000000000000000
> 0000000011d509e0 00000000000009f8 00000000f6445211 00000000f1267288
> db>
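
The alignment trap can be checked by hand from the dump above. A small
Python sketch, taking %o3 as the fourth value in the first `outs:' row and
assuming ldx needs 8-byte alignment (it loads a 64-bit doubleword); the
helper name is only illustrative:

```python
# %o3 from the `mach tf' outs row, and the offset from the faulting
# instruction:  ldx [%o3 + 0x8], %o0
O3 = 0x0000000400000005
OFFSET = 0x8
LDX_SIZE = 8  # assumption: a 64-bit load must be 8-byte aligned


def is_aligned(addr, size):
    """True if addr is a multiple of size."""
    return addr % size == 0


ea = O3 + OFFSET  # effective address the load tried to use
print(hex(ea), is_aligned(ea, LDX_SIZE))
```

The effective address ends in 0xd, so it cannot be a valid pointer to a
64-bit field, which fits the "mem address not aligned" trap.
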
>
> >
> > 2) check curproc's p_vmstate to make sure it has a correct pmap pointer.
>
> I assume you mean p_vmspace. Is this what we get from VMSPACE with
> "show all proc /a" ?
>
> db> show all proc /a
> PID COMMAND STRUCT PROC * UAREA * VMSPACE/VM_MAP
> 321 sh 0xf642e290 0xf6454000 0xf5e469b0
> 320 sh 0xf642e510 0xf644c000 0xf5e47930
> 319 sh 0xf642e010 0xf6446000 0xf5e47170
> >318 sh 0xf642e790 0xf6442000 0xf5e47740
> 317 sh 0xf642ec90 0xf643e000 0xf5e46f80
> 310 make 0xf642ea10 0xf643a000 0xf5e47b20
> 309 sh 0xf5e3db80 0xf6434000 0xf5e47360
> 292 make 0xf5e3cf00 0xf6430000 0xf5e46d90
> 291 sh 0xf5e3d900 0xf6428000 0xf5e47550
> 194 make 0xf5e3cc80 0xf6420000 0xf5e463e0
> 178 csh 0xf5e3ca00 0xf5e62000 0xf5e461f0
> 176 cron 0xf5e3d680 0xf6416000 0xf5e46ba0
> 173 inetd 0xf5e3d400 0xf6412000 0xf5e465d0
> 98 syslogd 0xf5e3d180 0xf63fc000 0xf5e467c0
> 4 ioflush 0xf5e3c780 0xf5e54000 0xf1472648
> 3 reaper 0xf5e3c500 0xf5e50000 0xf1472648
> 2 pagedaemon 0xf5e3c280 0xf5e4c000 0xf1472648
> 1 init 0xf5e3c000 0xf5e38000 0xf5e46000
> 0 swapper 0xf1472848 0xf1802000 0xf1472648
>
> I assume curproc is PID 318
>
> db> show map /f 0xf5e47740
> MAP 0xf5e47740: [0x0->0xf1000000]
> #ent=7, sz=269565952, ref=1, version=5, flags=0x1
> pmap=0xf19e8120(resident=10)
The pmap pointer seems valid in this case.
> - 0xf641ebb0: 0x100000->0x180000: obj=0xf5e49740/0x0, amap=0x0/0
> submap=F, cow=T, nc=T, prot(max)=5/7, inh=1, wc=0, adv=0
> - 0xf641efd0: 0x200000->0x280000: obj=0xf5e49740/0x0, amap=0xf64382b0/0
> submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
> - 0xf641f9f0: 0x280000->0x288000: obj=0x0/0x0, amap=0xf5e5f9d0/0
> submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
> - 0xf641f510: 0x288000->0x292000: obj=0x0/0x0, amap=0xf5e5fc00/0
> submap=F, cow=T, nc=F, prot(max)=7/7, inh=1, wc=0, adv=0
> - 0xf641ec10: 0x10200000->0x10202000: obj=0x0/0x0, amap=0xf64388d0/0
> submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
> - 0xf641fbd0: 0xe1000000->0xf0f80000: obj=0x0/0x0, amap=0x0/0
> submap=F, cow=T, nc=T, prot(max)=0/7, inh=1, wc=0, adv=0
> - 0xf641e8b0: 0xf0f80000->0xf1000000: obj=0x0/0x0, amap=0xf5e5fce0/0
> submap=F, cow=T, nc=F, prot(max)=7/7, inh=1, wc=0, adv=0
> db>
>
> >
> > 3) if you can figure out the address of the original fault (possibly from
> > `mach tf /u') you can use `mach pv <page>' to dump the pv_list for that
> > page.
>
> db> mach tf /u
> Trapframe 0xf6445ed0: tstate: 0x800008206 pc: 0x10a688 npc: 0x10a68c
> y: 0 pil: 0 oldpil: 0 fault: 0x27e13a kstack: 0x0 tt: 6c Globals:
>
> 0000000000000000 0000000000000002 0000000000000010 000000000017d86f
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> outs:
> 00000000002868d0 0000000000286310 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 00000000f0ffeaa1 0000000000000000
> locals:
> 000000000027e978 000000000028e000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> ins:
> 0000000000286800 00000000002868d0 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 00000000f0ffeb61 000000000010a76c
>
> Would the address be pc or npc?
> I don't know sparc64 well enough for this.
This is a little complicated. The faulting address (or what locore.s
thinks is the faulting address) is in the `fault:' field, in this case
0x27e13a. This is probably a userland VA. Hm. It was a protection
fault (tt 0x6c). I think the only way to get the page is to take the VA
(0x27e13a), dump the submap that contains it, and see if there's a page
for it.
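
As a rough sketch of that lookup (in Python, not ddb): the entry ranges
below are copied from the `show map' output above. The 8 KB page size is an
assumption, though it agrees with the 27e000 argument in the pmap_enter()
line of the trace. The names are hypothetical.

```python
PAGE_SIZE = 0x2000  # assumed 8 KB base page size

# (entry address, start, end) copied from `show map /f 0xf5e47740'
ENTRIES = [
    (0xf641ebb0, 0x00100000, 0x00180000),
    (0xf641efd0, 0x00200000, 0x00280000),
    (0xf641f9f0, 0x00280000, 0x00288000),
    (0xf641f510, 0x00288000, 0x00292000),
    (0xf641ec10, 0x10200000, 0x10202000),
    (0xf641fbd0, 0xe1000000, 0xf0f80000),
    (0xf641e8b0, 0xf0f80000, 0xf1000000),
]


def find_entry(va):
    """Return the (entry, start, end) tuple whose [start, end) holds va."""
    for entry, start, end in ENTRIES:
        if start <= va < end:
            return entry, start, end
    return None


fault_va = 0x27e13a                     # from the `fault:' field
page_va = fault_va & ~(PAGE_SIZE - 1)   # round down to the page base
found = find_entry(fault_va)
print(hex(page_va), [hex(x) for x in found])
```

With these numbers the VA falls in the 0x200000->0x280000 entry
(0xf641efd0). Its amap, 0xf64382b0, is the same value that appears in the
uvm_fault() line of the backtrace.
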
Eduardo Horvath