NetBSD-Bugs archive
re: kern/57737: netbsd-10 panics on current Epyc CPU
The following reply was made to PR kern/57737; it has been noted by GNATS.
From: matthew green <mrg%eterna23.net@localhost>
To: gnats-bugs%netbsd.org@localhost, hf%spg.tu-darmstadt.de@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost
Subject: re: kern/57737: netbsd-10 panics on current Epyc CPU
Date: Wed, 13 Dec 2023 19:20:51 +1100
> netbsd-10 panics early on current multi-core Ryzen cpus.
>
> See the boot log for an Epyc 9554P cpu on a Gigabyte R263-Z70
> board at
>
> <ftp://oak.causeuse.org/pub/NetBSD/netbsd-10-GA_R263-Z70_epyc9554p.bootlog.gz>
>
> and the related discussion on current-users, where Martin
> suggested
>
> "That sounds like an fpu xsave size issue Taylor looked at
> recently (but it is not fixed)."
there are multiple issues with this system, ouch.
no CPUs attach in this dmesg. cpu0 remains half-attached. this
is some problem with the MADT parser i guess (i don't know this
very well.)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x0)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x2)
...
[ 1.0000040] bogus MADT X2APIC entry (id = 0x7e)
...
[ 1.0000040] bogus MADT X2APIC entry (id = 0x5e)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x1)
...
[ 1.0000040] bogus MADT X2APIC entry (id = 0x7f)
...
[ 1.0000040] bogus MADT X2APIC entry (id = 0x5f)
ie, 128 cpu threads fail to attach (which matches the specs for
epyc 9554p - 64c/128t.) some devices still attach things to
cpu0 for affinity, even though it's in UP mode:
[ 1.0525126] nvme0: for io queue 1 interrupting at msix0 vec 1 affinity to cpu0
... plus nvme1/2/3.
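to check the 128 count against the full boot log rather than by eye, something like the following could be used. this is only a sketch: the function name and the sample text are made up for illustration, and the sample holds just the lines quoted in this mail, not the real log. it assumes the message format is exactly as quoted above.

```python
import re

# matches lines like: "[ 1.0000040] bogus MADT X2APIC entry (id = 0x7e)"
BOGUS_RE = re.compile(r"bogus MADT X2APIC entry \(id = 0x([0-9a-fA-F]+)\)")


def bogus_x2apic_ids(dmesg_text):
    """Return the sorted, de-duplicated APIC ids reported as bogus."""
    return sorted({int(m.group(1), 16) for m in BOGUS_RE.finditer(dmesg_text)})


# only the lines quoted in this mail (the "..." gaps are not filled in)
SAMPLE = """\
[ 1.0000040] bogus MADT X2APIC entry (id = 0x0)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x2)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x7e)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x5e)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x1)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x7f)
[ 1.0000040] bogus MADT X2APIC entry (id = 0x5f)
"""

ids = bogus_x2apic_ids(SAMPLE)
print(len(ids), [hex(i) for i in ids])
```

run against the decompressed boot log instead of SAMPLE, the count should come out as 128 if every thread of the 64c/128t part was rejected.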
some of the dmesg items seem to have 'nul' chars in them:
[ 1.0000040] ACPI: XSDT 0x00000000A4E13728 000^@0DC (v01 GBT BTUACPI 03042021 AMI 01000013)
[ 1.0525126] AMD 19h/1xh RCEC (Root Complex^@ Event Collectosystem) at pci0 dev 0 function 3 not configured
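a rough way to list every affected line in a saved dmesg, assuming the log is read as raw bytes so the nul chars survive. the helper name is hypothetical, and the sample reuses two lines from this mail with a real nul byte put where ^@ is shown:

```python
def lines_with_nul(raw):
    """Yield (line_number, text) for each line of raw bytes containing a NUL.

    NULs are rendered as ^@, the way they are displayed in this mail.
    """
    for n, line in enumerate(raw.splitlines(), start=1):
        if b"\x00" in line:
            text = line.replace(b"\x00", b"^@").decode("ascii", errors="replace")
            yield n, text


# sample built from lines quoted in this mail, not the full boot log
SAMPLE = (
    b"[ 1.0000040] ACPI: XSDT 0x00000000A4E13728 000\x000DC "
    b"(v01 GBT BTUACPI 03042021 AMI 01000013)\n"
    b"[ 1.0525126] nvme0: for io queue 1 interrupting at msix0 vec 1 "
    b"affinity to cpu0\n"
)

hits = list(lines_with_nul(SAMPLE))
for n, text in hits:
    print(n, text)
```

for the real log, the bytes would come from something like `open(path, "rb").read()` on the decompressed file.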
and then the final crash as reported in this PR:
[ 1.0525126] fatal privileged instruction fault in supervisor mode
[ 1.0525126] trap type 0 code 0 rip 0xffffffff8023c24e cs 0x8 rflags 0x10256 cr2 0 ilevel 0x6 rsp 0xffffffff81d3bab8
[ 1.0525126] curlwp 0xffffffff8188ac00 pid 0.0 lowest kstack 0xffffffff81d362c0
kernel: privileged instruction fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:xrstor+0xa: fxsavel
xrstor() at netbsd:xrstor+0xa
aes_selftest() at netbsd:aes_selftest+0x26
aes_modcmd() at netbsd:aes_modcmd+0xe9
module_do_builtin() at netbsd:module_do_builtin+0x142
module_do_builtin() at netbsd:module_do_builtin+0xfa
module_init_class() at netbsd:module_init_class+0x142
.mrg.