Port-sparc64 archive
[RESEND] User-level window trap when booting NetBSD kernel under QEMU SPARC64
Hi all,
I'm one of the QEMU SPARC/OpenBIOS maintainers and I've been spending my
time over the past few weeks (and possibly longer!) working on patches
so that NetBSD kernels will boot under QEMU SPARC64.
I've made some good progress recently, however I'm still experiencing
a user trap during boot which I don't understand. I've had some previous
correspondence with Martin on this, but it requires a deep
understanding of how the SPARC64 memory management code works, so I
was hoping that you'd be able to provide some help with this.
So far I have a set of patches for OpenBIOS which get my 6.1.2 ISO image
to boot to this point:
build@kentang:~/rel-qemu-git/bin$ ./qemu-system-sparc64 -cdrom
/home/build/src/qemu/image/sparc64/NetBSD-6.1.2-sparc64.iso -bios
/home/build/src/openbios/openbios-git/openbios-devel/obj-sparc64/openbios-builtin.elf.nostrip
-boot d -nographic
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on May 12 2014 21:33
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 7478 bytes
entry point is 0x4000
NetBSD IEEE 1275 Multi-FS Bootblock
Version $NetBSD: bootblk.fth,v 1.13 2010/06/24 00:54:12 eeh Exp $
..
Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe8aa09
>> NetBSD/sparc64 OpenFirmware Boot, Revision 1.16
=0x8870a0
Loading netbsd: 8071888+553056+339856 [601032+393301]=0x9cd528
Unimplemented service set-symbol-lookup ([2] -- [0])
Unexpected client interface exception: -1
1 tt=30 tstate=4482000605 tpc=0x14984f4 tnpc=0x14984f8
2 tt=30 tstate=4411001503 tpc=0x1001804 tnpc=0x1001808
3 tt=c0 tstate=4482001604 tpc=0x10094f4 tnpc=0x135fbc8
Stopped in pid 0.1 (system) at 1008528: nop
db{0}>
The problem is that I'm getting a data_access_exception on the first
window fill trap executed after the kernel takes over the trap table
with SUNW,set-trap-table.
Here is the gdb session showing the openfirmware() function after the
NetBSD kernel has called SUNW,set-trap-table:
(gdb) disas 0x1009478, 0x10094f8
Dump of assembler code from 0x1009478 to 0x10094f8:
0x0000000001009478: sethi %hi(0x1800000), %o4
0x000000000100947c: btst 1, %sp
0x0000000001009480: be %icc, 0x10094f8
0x0000000001009484: ldx [ %o4 ], %o4
0x0000000001009488: save %sp, -176, %sp
0x000000000100948c: rdpr %pil, %i2
0x0000000001009490: mov 0xf, %i3
0x0000000001009494: cmp %i3, %i2
0x0000000001009498: movle %icc, %i2, %i3
0x000000000100949c: wrpr %g0, %i3, %pil
0x00000000010094a0: mov %i0, %o0
0x00000000010094a4: mov %g1, %l1
0x00000000010094a8: mov %g2, %l2
0x00000000010094ac: mov %g3, %l3
0x00000000010094b0: mov %g4, %l4
0x00000000010094b4: mov %g5, %l5
0x00000000010094b8: mov %g6, %l6
0x00000000010094bc: mov %g7, %l7
0x00000000010094c0: rdpr %pstate, %l0
0x00000000010094c4: call %i4
0x00000000010094c8: wrpr 6, %pstate
=> 0x00000000010094cc: wrpr %l0, %pstate
0x00000000010094d0: mov %l1, %g1
0x00000000010094d4: mov %l2, %g2
0x00000000010094d8: mov %l3, %g3
0x00000000010094dc: mov %l4, %g4
0x00000000010094e0: mov %l5, %g5
0x00000000010094e4: mov %l6, %g6
0x00000000010094e8: mov %l7, %g7
0x00000000010094ec: wrpr %i2, 0, %pil
0x00000000010094f0: ret
0x00000000010094f4: restore %o0, %g0, %o0
End of assembler dump.
(gdb) info regi
g0 0x0 0
g1 0x1 1
g2 0x7e50000 132448256
g3 0x18d1c00 26024960
g4 0x1ae8000 28213248
g5 0x1000 4096
g6 0x0 0
g7 0x0 0
o0 0x0 0
o1 0x1 1
o2 0xfffffffffffffff8 -8
o3 0xffffffff00000000 -4294967296
o4 0x1c14230 29442608
o5 0x1000000 16777216
sp 0x1c054a1 0x1c054a1
o7 0x10094c4 16815300
l0 0x16 22
l1 0x1 1
l2 0x7e50000 132448256
l3 0x18d1c00 26024960
l4 0x1ae8000 28213248
l5 0x1000 4096
l6 0x0 0
l7 0x0 0
i0 0x1c05e00 29384192
i1 0x7e50000 132448256
i2 0xd 13
i3 0xf 15
i4 0xffd0fe60 4291886688
i5 0x18d1800 26023936
fp 0x1c05551 0x1c05551
i7 0x135fbc0 20315072
pc 0x10094cc 0x10094cc
npc 0x10094d0 0x10094d0
state 0x4482000604 294238815748
fsr 0x0 [ ]
fprs 0x4 [ FEF ]
y 0x0 0
cwp 0x4 4
pstate 0x6 [ IE PRIV ]
asi 0x82 130
ccr 0x44 68
(gdb)
The MMU TLB entries look like this:
QEMU 2.0.50 monitor - type 'help' for more information
(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
[00] VA: ffe00000, PA: 7f00000, 512k, priv, RW, locked, ctx 0 local
[01] VA: ffe80000, PA: 7f80000, 512k, priv, RW, locked, ctx 0 local
[02] VA: ffd00000, PA: 1fff0000000, 512k, priv, RO, locked, ctx 0 local
[03] VA: ffd80000, PA: 1fff0080000, 512k, priv, RO, locked, ctx 0 local
[04] VA: ffc80000, PA: 7e80000, 512k, priv, RW, locked, ctx 0 local
[05] VA: 4000, PA: 4000, 8k, priv, RW, unlocked, ctx 0 local
[06] VA: 6000, PA: 6000, 8k, priv, RW, unlocked, ctx 0 local
[07] VA: 8000, PA: 8000, 8k, priv, RW, unlocked, ctx 0 local
[08] VA: a000, PA: a000, 8k, priv, RW, unlocked, ctx 0 local
[09] VA: c000, PA: c000, 8k, priv, RW, unlocked, ctx 0 local
[10] VA: e000, PA: e000, 8k, priv, RW, unlocked, ctx 0 local
[11] VA: 10000, PA: 10000, 8k, priv, RW, unlocked, ctx 0 local
[12] VA: 12000, PA: 12000, 8k, priv, RW, unlocked, ctx 0 local
[13] VA: 14000, PA: 14000, 8k, priv, RW, unlocked, ctx 0 local
[14] VA: 16000, PA: 16000, 8k, priv, RW, unlocked, ctx 0 local
[15] VA: 18000, PA: 18000, 8k, priv, RW, unlocked, ctx 0 local
[16] VA: 1a000, PA: 1a000, 8k, priv, RW, unlocked, ctx 0 local
[17] VA: 100000, PA: 100000, 8k, priv, RW, unlocked, ctx 0 local
[18] VA: 102000, PA: 102000, 8k, priv, RW, unlocked, ctx 0 local
[19] VA: 104000, PA: 104000, 8k, priv, RW, unlocked, ctx 0 local
[20] VA: 106000, PA: 106000, 8k, priv, RW, unlocked, ctx 0 local
[21] VA: 108000, PA: 108000, 8k, priv, RW, unlocked, ctx 0 local
[22] VA: 10a000, PA: 10a000, 8k, priv, RW, unlocked, ctx 0 local
[23] VA: 10c000, PA: 10c000, 8k, priv, RW, unlocked, ctx 0 local
[24] VA: 10e000, PA: 10e000, 8k, priv, RW, unlocked, ctx 0 local
[25] VA: 110000, PA: 110000, 8k, priv, RW, unlocked, ctx 0 local
[26] VA: 112000, PA: 112000, 8k, priv, RW, unlocked, ctx 0 local
[27] VA: 114000, PA: 114000, 8k, priv, RW, unlocked, ctx 0 local
[28] VA: ffc7e000, PA: 7e7e000, 8k, priv, RW, unlocked, ctx 0 local
[29] VA: ffc7a000, PA: 7e7a000, 8k, priv, RW, unlocked, ctx 0 local
[30] VA: ffc7c000, PA: 7e7c000, 8k, priv, RW, unlocked, ctx 0 local
[31] VA: ffc78000, PA: 7e78000, 8k, priv, RW, unlocked, ctx 0 local
[32] VA: ffc76000, PA: 7e76000, 8k, priv, RW, unlocked, ctx 0 local
[33] VA: ffc72000, PA: 7e72000, 8k, priv, RW, unlocked, ctx 0 local
[34] VA: ffc70000, PA: 7e70000, 8k, priv, RW, unlocked, ctx 0 local
[35] VA: ffc6e000, PA: 7e6e000, 8k, priv, RW, unlocked, ctx 0 local
[36] VA: ffc64000, PA: 7e64000, 8k, priv, RW, unlocked, ctx 0 local
[37] VA: ffc66000, PA: 7e66000, 8k, priv, RW, unlocked, ctx 0 local
[38] VA: ffc68000, PA: 7e68000, 8k, priv, RW, unlocked, ctx 0 local
[39] VA: ffc6a000, PA: 7e6a000, 8k, priv, RW, unlocked, ctx 0 local
[40] VA: ffc6c000, PA: 7e6c000, 8k, priv, RW, unlocked, ctx 0 local
[41] VA: ffc62000, PA: 7e62000, 8k, priv, RW, unlocked, ctx 0 local
[42] VA: 1000000, PA: 7800000, 4M, priv, RO, locked, ctx 0 local
[43] VA: 1400000, PA: 7400000, 4M, priv, RO, locked, ctx 0 local
[44] VA: 1800000, PA: 7000000, 4M, priv, RW, locked, ctx 0 local
[45] VA: ffc60000, PA: 7e60000, 8k, priv, RW, unlocked, ctx 0 local
[46] VA: 7ffc000, PA: 7e5c000, 8k, priv, RW, unlocked, ctx 0 local
[47] VA: 7ffe000, PA: 7e5e000, 8k, priv, RW, unlocked, ctx 0 local
[48] VA: 7ffa000, PA: 7e5a000, 8k, priv, RW, unlocked, ctx 0 local
[49] VA: 1c0c000, PA: 7e40000, 8k, priv, RW, unlocked, ctx 0 local
[50] VA: 1c0e000, PA: 7e42000, 8k, priv, RW, unlocked, ctx 0 local
[51] VA: 1c10000, PA: 7e44000, 8k, priv, RW, unlocked, ctx 0 local
[52] VA: 1c12000, PA: 7e46000, 8k, priv, RW, unlocked, ctx 0 local
[53] VA: 1c14000, PA: 7e48000, 8k, priv, RW, unlocked, ctx 0 local
[54] VA: 1c16000, PA: 7e4a000, 8k, priv, RW, unlocked, ctx 0 local
[55] VA: 1c18000, PA: 7e4c000, 8k, priv, RW, unlocked, ctx 0 local
[56] VA: 1c1a000, PA: 7e4e000, 8k, priv, RW, unlocked, ctx 0 local
[57] VA: e0010000, PA: 7e40000, 64k, priv, RW, locked, ctx 0 local
[58] VA: 1c04000, PA: 14000, 8k, priv, RW, unlocked, ctx 0 local
IMMU dump
[00] VA: ffd00000, PA: 1fff0000000, 512k, priv, locked, ctx 0 local
[01] VA: ffc80000, PA: 7e80000, 512k, priv, locked, ctx 0 local
[02] VA: 100000, PA: 100000, 8k, priv, unlocked, ctx 0 local
[03] VA: 102000, PA: 102000, 8k, priv, unlocked, ctx 0 local
[04] VA: 10a000, PA: 10a000, 8k, priv, unlocked, ctx 0 local
[05] VA: 10c000, PA: 10c000, 8k, priv, unlocked, ctx 0 local
[06] VA: 110000, PA: 110000, 8k, priv, unlocked, ctx 0 local
[07] VA: 104000, PA: 104000, 8k, priv, unlocked, ctx 0 local
[08] VA: 108000, PA: 108000, 8k, priv, unlocked, ctx 0 local
[09] VA: 10e000, PA: 10e000, 8k, priv, unlocked, ctx 0 local
[10] VA: 106000, PA: 106000, 8k, priv, unlocked, ctx 0 local
[11] VA: 1000000, PA: 7800000, 4M, priv, locked, ctx 0 local
[12] VA: 1400000, PA: 7400000, 4M, priv, locked, ctx 0 local
(qemu)
As soon as I hit the restore at 0x10094f4 in gdb, I get a fill_0_normal
trap which vectors to 0x1001800:
(gdb) disas 0x1001800, 0x100184c
Dump of assembler code from 0x1001800 to 0x100184c:
=> 0x0000000001001800: wr %g0, 0x11, %asi
0x0000000001001804: ldxa [ %sp + 0x7ff ] %asi, %l0
0x0000000001001808: ldxa [ %sp + 0x807 ] %asi, %l1
0x000000000100180c: ldxa [ %sp + 0x80f ] %asi, %l2
0x0000000001001810: ldxa [ %sp + 0x817 ] %asi, %l3
0x0000000001001814: ldxa [ %sp + 0x81f ] %asi, %l4
0x0000000001001818: ldxa [ %sp + 0x827 ] %asi, %l5
0x000000000100181c: ldxa [ %sp + 0x82f ] %asi, %l6
0x0000000001001820: ldxa [ %sp + 0x837 ] %asi, %l7
0x0000000001001824: ldxa [ %sp + 0x83f ] %asi, %i0
0x0000000001001828: ldxa [ %sp + 0x847 ] %asi, %i1
0x000000000100182c: ldxa [ %sp + 0x84f ] %asi, %i2
0x0000000001001830: ldxa [ %sp + 0x857 ] %asi, %i3
0x0000000001001834: ldxa [ %sp + 0x85f ] %asi, %i4
0x0000000001001838: ldxa [ %sp + 0x867 ] %asi, %i5
0x000000000100183c: ldxa [ %sp + 0x86f ] %asi, %fp
0x0000000001001840: ldxa [ %sp + 0x877 ] %asi, %i7
0x0000000001001844: restored
0x0000000001001848: retry
End of assembler dump.
(gdb) info regi
g0 0x0 0
g1 0x1f61ec8c2 8424179906
g2 0x1f60f8682 8423179906
g3 0xffe11df8 4292943352
g4 0x0 0
g5 0x0 0
g6 0x0 0
g7 0x0 0
o0 0x1c05e00 29384192
o1 0x7e50000 132448256
o2 0xd 13
o3 0xf 15
o4 0xffd0fe60 4291886688
o5 0x18d1800 26023936
sp 0x1c05551 0x1c05551
o7 0x135fbc0 20315072
l0 0xffffffffffe30c38 -1897416
l1 0xffe8ac38 4293438520
l2 0x17500f0 24445168
l3 0x1746c78 24407160
l4 0x1816400 25256960
l5 0x18c0800 25954304
l6 0x18c0800 25954304
l7 0x19cd570 27055472
i0 0xa 10
i1 0xffe8b0f0 4293439728
i2 0x20 32
i3 0xffd0fe60 4291886688
i4 0x17502f8 24445688
i5 0x0 0
fp 0xffe85219 0xffe85219
i7 0xffd0a988 4291864968
pc 0x1001800 0x1001800
npc 0x1001804 0x1001804
state 0x4482001503 294238819587
fsr 0x0 [ ]
fprs 0x4 [ FEF ]
y 0x0 0
cwp 0x3 3
pstate 0x15 [ AG PRIV PEF ]
asi 0x82 130
ccr 0x44 68
(gdb)
From here you can see that %sp is 0x1c05551, so the first access is at
%sp + 0x7ff (the stack bias) = 0x1c05d50, which was mapped just before
the call to SUNW,set-trap-table (it falls within DMMU entry [58] above).
But because the access is made using ASI 0x11, which is a user ASI, the
load in fill_0_normal raises a further data_access_exception trap, which
takes roughly the following path:
-> trap 0x30, data_access_exception (0x1004600)
-> winfault: 0x00000000010081cc
   http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1721
   #1737: we did previously take a datafault, so go to winfixfill
-> winfixfill: 0x000000000100822c
   http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1756
   #1770: we are in PRIV mode, so carry on
   #1819: not at trap level 3, so invoke software trap 1 (0x101)
-> trap 0x101 invokes the panic/debugger
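As a sanity check on the two handler addresses in the path above, here is a
small Python sketch of the SPARC V9 trap vector calculation (handler address =
TBA<63:15> | (TL > 0) << 14 | TT << 5). The trap table base of 0x1000000 is my
assumption, inferred from the addresses in the gdb sessions, and the function
and constant names are just illustrative.

```python
# Sketch only: SPARC V9 trap vector address calculation.
# ASSUMED_TBA is an assumption inferred from the handler addresses seen in gdb;
# it is not read from the running kernel.
ASSUMED_TBA = 0x1000000

TT_DATA_ACCESS_EXCEPTION = 0x30   # trap type reported as tt=30 in the log
TT_FILL_0_NORMAL = 0xC0           # trap type reported as tt=c0 in the log


def trap_vector(tba: int, tl_before_trap: int, tt: int) -> int:
    """Return the handler address for trap type tt.

    tl_before_trap is the trap level at the moment the trap is taken.
    Traps taken at TL > 0 use the upper half of the table (bit 14 set).
    """
    base = tba & ~0x7FFF                                # TBA<63:15>
    upper_half = (1 << 14) if tl_before_trap > 0 else 0
    return base | upper_half | (tt << 5)                # 32 bytes per vector


if __name__ == "__main__":
    # The fill trap is taken from TL = 0; the nested fault is taken from TL = 1.
    print(hex(trap_vector(ASSUMED_TBA, 0, TT_FILL_0_NORMAL)))
    print(hex(trap_vector(ASSUMED_TBA, 1, TT_DATA_ACCESS_EXCEPTION)))
```

With that assumed base, the two results are 0x1001800 and 0x1004600, which
match the fill_0_normal and data_access_exception addresses quoted above.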
The path above shows that trap 0x101 is being raised deliberately,
because a kernel mapping is being accessed through a user ASI while the
processor is running with PSTATE.PRIV == 1.
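To make the address arithmetic concrete, the following Python sketch
recomputes the first and last load addresses of the fill handler and looks
them up in two DMMU entries copied from the "info tlb" output. The helper
names are hypothetical, and the fault rule (an as-if-user ASI touching a page
marked "priv" raises data_access_exception) is my understanding of the
UltraSPARC MMU rather than something I have confirmed on real hardware.

```python
# Sketch only: recompute the fill handler's load addresses and check them
# against DMMU entries taken from the QEMU "info tlb" dump.
STACK_BIAS = 0x7FF  # 64-bit SPARC V9 ABI stack bias (2047)

# index -> (virtual base, size in bytes, privileged page?)
DMMU_ENTRIES = {
    44: (0x1800000, 4 * 1024 * 1024, True),
    58: (0x1C04000, 8 * 1024, True),
}

# ASIs that perform the access as if PSTATE.PRIV were 0
# (as-if-user primary/secondary and their little-endian variants).
AS_IF_USER_ASIS = {0x10, 0x11, 0x18, 0x19}


def find_mapping(va):
    """Return the index of the DMMU entry covering va, or None."""
    for index, (base, size, _priv) in DMMU_ENTRIES.items():
        if base <= va < base + size:
            return index
    return None


def access_faults(page_privileged, asi):
    """Assumed rule: an as-if-user access to a privileged page faults."""
    return page_privileged and asi in AS_IF_USER_ASIS


sp = 0x1C05551                   # %sp at entry to fill_0_normal
first_load = sp + STACK_BIAS     # ldxa [ %sp + 0x7ff ] %asi, %l0
last_load = sp + 0x877           # ldxa [ %sp + 0x877 ] %asi, %i7

if __name__ == "__main__":
    entry = find_mapping(first_load)
    print(hex(first_load), hex(last_load), entry)
    print(access_faults(DMMU_ENTRIES[entry][2], 0x11))
```

Both addresses fall inside entry [58], which is marked "priv", so under the
assumed rule the very first ldxa with ASI 0x11 would fault even though the
mapping itself is present.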
AFAICT the basic logic looks correct, so I am wondering if anyone can
comment as to what should happen on real hardware? My current thinking
is that the initial fill_0_normal trap is incorrect and that a
supervisor fill trap should be used instead, but I can't quite understand
how this is supposed to happen.
If anyone has any ideas as to why this is happening and/or what the
intended behaviour is then I would be very interested to try and
understand the memory management algorithms. And of course, when it all
works then you get the warm feeling of being able to add a SPARC64
machine to your buildfarm!
If you've made it this far, then thank you for your time and I look
forward to hearing from you further.
ATB,
Mark.