Port-amd64 archive


Re: illegal instruction in kernel at boot



On Wed, Sep 25, 2024 at 06:20:35PM +0200, Manuel Bouyer wrote:
> Hello
> I tried booting the latest netbsd-10 image on a brand new Poweredge
> R750 server with 2 Intel(R) Xeon(R) Silver 4310 CPU (Model 106 Stepping 6).
> It panics with:
> [   1.0077245] fatal privileged instruction fault in supervisor mode
> [   1.0077245] trap type 0 code 0 rip 0xffffffff8023c25e cs 0x8 rflags 0x10246 cr2 0 ilevel 0x6 rsp 0xffffffff81d4eab8
> [   1.0077245] curlwp 0xffffffff8188ad00 pid 0.0 lowest kstack 0xffffffff81d492c0
> kernel: privileged instruction fault trap, code=0
> Stopped in pid 0.0 (system) at  netbsd:xrstor+0xa:      fxsavel
> xrstor() at netbsd:xrstor+0xa
> aes_selftest() at netbsd:aes_selftest+0x26
> aes_modcmd() at netbsd:aes_modcmd+0xe9
> module_do_builtin() at netbsd:module_do_builtin+0x142
> module_do_builtin() at netbsd:module_do_builtin+0xfa
> module_init_class() at netbsd:module_init_class+0x142
> main() at netbsd:main+0x493
> ds          5510
> es          bd41
> fs          bd41
> gs          ab9f
> rdi         ffffffff81007d80    safe_fpu.1
> rsi         2e7
> rbp         ffffffff81d4eb00
> rbx         ffffffff8130fb48    C.7+0x48
> rdx         0
> rcx         70
> rax         2e7
> r8          70
> r9          ffffffff81d4eb10
> r10         0
> r11         0
> r12         ffffffff818493e0    aes_ni_impl
> r13         20
> r14         3c
> r15         0
> rip         ffffffff8023c25e    xrstor+0xa
> cs          8
> rflags      10246
> rsp         ffffffff81d4eab8
> ss          10
> netbsd:xrstor+0xa:      fxsavel
> 
> It looks like we're using aes_ni_impl:
> [...]

I made some progress.  The fault is actually in aesni_probe(), more
specifically when it calls fpu_area_restore() from fpu_kern_enter().
At fpu_area_restore() entry, cr4 is 0x678 and xsave_feature is 0x2e7,
so CR4_OSFXSR is set.
I tried disabling FPU_SAVE_XSAVEOPT, keeping only FPU_SAVE_XSAVE, but this
didn't help. Disabling XSAVE completely avoids the illegal instruction
(which means that fxrstor works), but a NULL function pointer then
causes a panic at a later point.
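
As a cross-check of the cr4 value reported above, a small decoder can list
which bits 0x678 has set. This is only a sketch: the bit positions follow the
Intel SDM, while cr4_decode() and cr4_has_osxsave() are hypothetical helper
names for illustration, not NetBSD functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* CR4 bit positions as defined by the x86 architecture (Intel SDM vol. 3). */
struct cr4_bit {
	unsigned bit;
	const char *name;
};

static const struct cr4_bit cr4_bits[] = {
	{ 0, "VME" }, { 1, "PVI" }, { 2, "TSD" }, { 3, "DE" },
	{ 4, "PSE" }, { 5, "PAE" }, { 6, "MCE" }, { 7, "PGE" },
	{ 8, "PCE" }, { 9, "OSFXSR" }, { 10, "OSXMMEXCPT" }, { 11, "UMIP" },
	{ 13, "VMXE" }, { 14, "SMXE" }, { 16, "FSGSBASE" }, { 17, "PCIDE" },
	{ 18, "OSXSAVE" }, { 20, "SMEP" }, { 21, "SMAP" }, { 22, "PKE" },
};

/*
 * Hypothetical helper: write the names of the bits set in cr4 into buf,
 * separated by spaces (truncated if buf is too small).  Returns buf.
 */
static char *
cr4_decode(uint64_t cr4, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(cr4_bits) / sizeof(cr4_bits[0]); i++) {
		int n;

		if (((cr4 >> cr4_bits[i].bit) & 1) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used ? " " : "", cr4_bits[i].name);
		if (n < 0)
			continue;
		/* On truncation, park at the end so later names are dropped. */
		used = ((size_t)n < len - used) ? used + (size_t)n : len - 1;
	}
	return buf;
}

/* Hypothetical helper: non-zero if CR4.OSXSAVE (bit 18) is set. */
static int
cr4_has_osxsave(uint64_t cr4)
{
	return (int)((cr4 >> 18) & 1);
}
```

For 0x678, cr4_decode() yields "DE PSE PAE MCE OSFXSR OSXMMEXCPT".
OSXSAVE (bit 18) is not in that list. The XSAVE family of instructions raises
an invalid-opcode fault when CR4.OSXSAVE is clear, so that may be worth
checking, though this is only a reading of the posted value and not a
confirmed diagnosis.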

I suspect that, for this CPU, some more bits need to be enabled in one of the
config registers, or that our fxsavel is called with inappropriate arguments
(x86_xsave_feature would have a wrong value). I'm not familiar with the x86 FPU,
so some help is welcome.
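
To judge whether the 0x2e7 mask itself looks plausible, it can be decoded
into XSAVE state components. Again only a sketch: the component bits follow
the Intel SDM, while xsave_mask() and xsave_decode() are hypothetical helper
names, not part of the NetBSD kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* XSAVE user state-component bits (Intel SDM vol. 1, chapter 13). */
struct xsave_comp {
	unsigned bit;
	const char *name;
};

static const struct xsave_comp xsave_comps[] = {
	{ 0, "x87" }, { 1, "SSE" }, { 2, "AVX" },
	{ 3, "BNDREGS" }, { 4, "BNDCSR" },
	{ 5, "opmask" }, { 6, "ZMM_Hi256" }, { 7, "Hi16_ZMM" },
	{ 9, "PKRU" },
};

/* Hypothetical helper: build the 64-bit component mask from EDX:EAX. */
static uint64_t
xsave_mask(uint32_t eax, uint32_t edx)
{
	return ((uint64_t)edx << 32) | eax;
}

/*
 * Hypothetical helper: write the names of the components selected by mask
 * into buf, separated by spaces (truncated if buf is too small).
 * Returns the bits of mask that are not in the table above.
 */
static uint64_t
xsave_decode(uint64_t mask, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(xsave_comps) / sizeof(xsave_comps[0]); i++) {
		uint64_t b = (uint64_t)1 << xsave_comps[i].bit;
		int n;

		if ((mask & b) == 0)
			continue;
		mask &= ~b;	/* this bit is accounted for */
		n = snprintf(buf + used, len - used, "%s%s",
		    used ? " " : "", xsave_comps[i].name);
		if (n < 0)
			continue;
		/* On truncation, park at the end so later names are dropped. */
		used = ((size_t)n < len - used) ? used + (size_t)n : len - 1;
	}
	return mask;
}
```

For 0x2e7, xsave_decode() yields "x87 SSE AVX opmask ZMM_Hi256 Hi16_ZMM PKRU"
and returns 0, i.e. no bits outside the table. The same value appears in rax
(with rdx 0) in the register dump, which is consistent with XRSTOR taking its
component mask in EDX:EAX.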

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference