Port-amd64 archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: [PATCH] Handling large XSAVE space (Intel AMX)
> Date: Tue, 15 Apr 2025 07:51:47 +0000
> From: Emmanuel Dreyfus <manu%netbsd.org@localhost>
>
> On Fri, Apr 11, 2025 at 10:53:05PM +0000, Taylor R Campbell wrote:
> > tl;dr -- Handle machines with large XSAVE size (Intel AMX) by
> > allocating separate pages for XSAVE instead of trying to cram them
> > into the struct pcb page. OK to commit? Any testers? Share output
> > of `cpuctl identify 0' if you test?
>
> On i386, I get the expected panic
> "NetBSD/i386 does not support fpu save size 11008"
Can you try this patch on top?
This patch will disable TILECFG/TILEDATA before determining the fpu
save size, and will pick the fpu save size based on what's enabled
rather than what's supported by the hardware.
# HG changeset patch
# User Taylor R Campbell <riastradh%NetBSD.org@localhost>
# Date 1744749568 0
# Tue Apr 15 20:39:28 2025 +0000
# Branch trunk
# Node ID 032106d3050573ee4744fdaddeaccaaf8858fa94
# Parent 6d58f73ef4eb15c33bfad97616841352f3a85adc
# EXP-Topic riastradh-pr57661-x86savefpuoverflow
x86: Reserve space for only the extended CPU state we will use.
CPUID[EAX=0x0d].ECX, i.e., descs[2] after x86_cpuid2(0x0d, 0, descs),
gives the size in bytes of the extended CPU state for all features
supported by the hardware in CPUID[EAX=0x0d].EAX which can be enabled
in XCR0. However, on i386, it is senseless to leave TILECFG and
TILEDATA enabled, because they are state for Intel AMX instructions
which work only in 64-bit mode.
So, instead of querying the hardware's supported features and maximum
_supported_ extended CPU state size:
1. Query the hardware's supported features.
2. Enable only those supported by software as well (XCR0_FPU).
3. Query the hardware's maximum _enabled_ extended CPU state size.
We will also disable TILECFG and TILEDATA on amd64 for now too
because:
(a) This is not a regression, at least for TILEDATA (and I'm not sure
any machines ship with TILECFG but not TILEDATA), because the
size overflowed the PCB page and therefore never worked on amd64
(PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu).
(b) We need a little extra work to properly support reading and
writing a process's TILECFG and TILEDATA in ptrace(2), and that
work hasn't been done yet.
While here, write out x86_cpuid2(0x0d, <ecx>, ...) explicitly, rather
than x86_cpuid(0x0d, ...), to make it clear that ECX must be set --
otherwise we may get garbage. (It is, perhaps, an accident that
x86_cpuid(<eax>, ...) always sets ECX=0, but other CPUID access
paths, like gcc's <cpuid.h> __cpuid(<eax>, ...), do not, so let's
make it clear for the reader.)
XXX When we enable TILECFG and TILEDATA in amd64, we should arrange
to disable them in compat32 processes -- no sense in allocating extra
space for state they can't use anyway, since the Intel AMX
instructions work only in 64-bit mode.
PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
diff -r 6d58f73ef4eb -r 032106d30505 sys/arch/x86/x86/identcpu.c
--- a/sys/arch/x86/x86/identcpu.c Tue Apr 15 20:02:22 2025 +0000
+++ b/sys/arch/x86/x86/identcpu.c Tue Apr 15 20:39:28 2025 +0000
@@ -805,21 +805,44 @@ cpu_probe_fpu(struct cpu_info *ci)
x86_fpu_save = FPU_SAVE_XSAVE;
- x86_cpuid2(0xd, 1, descs);
+ x86_cpuid2(0x0d, 1, descs);
if (descs[0] & CPUID_PES1_XSAVEOPT)
x86_fpu_save = FPU_SAVE_XSAVEOPT;
- /* Get features and maximum size of the save area */
- x86_cpuid(0xd, descs);
- if (descs[2] > sizeof(struct fxsave))
- x86_fpu_save_size = descs[2];
+ /*
+ * Get the hardware-supported features with CPUID.
+ */
+ x86_cpuid2(0x0d, 0, descs);
+ x86_xsave_features = (uint64_t)descs[3] << 32 | descs[0];
- x86_xsave_features = (uint64_t)descs[3] << 32 | descs[0];
+ /*
+ * Turn on XSAVE in CR4 so we can write to XCR0, and write to
+ * XCR0 enable only those features that NetBSD software
+ * supports.
+ *
+ * CR4_OSXSAVE support and and XCR0 access are both allowed
+ * because we tested ci->ci_feat_val[1] & CPUID2_XSAVE above.
+ *
+ * (This is redundant with cpu_init when it runs on the primary
+ * CPU, but it's harmless.)
+ */
+ lcr4(rcr4() | CR4_OSXSAVE);
+ wrxcr(0, x86_xsave_features & XCR0_FPU);
+
+ /*
+ * Get the size of the save area with those features enabled
+ * with the second CPUID.
+ *
+ * (Let's hope the features don't change!)
+ */
+ x86_cpuid2(0x0d, 0, descs);
+ if (descs[1] > x86_fpu_save_size)
+ x86_fpu_save_size = descs[1];
/* Get component offsets and sizes for the save area */
for (i = XSAVE_YMM_Hi128; i < __arraycount(x86_xsave_offsets); i++) {
if (x86_xsave_features & __BIT(i)) {
- x86_cpuid2(0xd, i, descs);
+ x86_cpuid2(0x0d, i, descs);
x86_xsave_offsets[i] = descs[1];
x86_xsave_sizes[i] = descs[0];
}
Home |
Main Index |
Thread Index |
Old Index