NetBSD-Bugs archive

PR/57258 CVS commit: src/sys/arch



The following reply was made to PR kern/57258; it has been noted by GNATS.

From: "Taylor R Campbell" <riastradh%netbsd.org@localhost>
To: gnats-bugs%gnats.NetBSD.org@localhost
Cc: 
Subject: PR/57258 CVS commit: src/sys/arch
Date: Thu, 24 Apr 2025 01:50:40 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Thu Apr 24 01:50:39 UTC 2025
 
 Modified Files:
 	src/sys/arch/amd64/amd64: genassym.cf
 	src/sys/arch/amd64/include: pcb.h
 	src/sys/arch/i386/i386: genassym.cf
 	src/sys/arch/i386/include: pcb.h
 	src/sys/arch/x86/include: cpu.h cpu_extended_state.h
 	src/sys/arch/x86/x86: fpu.c vm_machdep.c
 
 Log Message:
 amd64: Allocate FPU save state outside pcb if it's too large.
 
 We have seen x86_fpu_save_size values (CPUID[EAX=0x0d, ECX=0].ECX) as
 large as 11008 bytes, notably with Intel AMX TILEDATA's 8192-byte
 state.
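
The CPUID query named above can be reproduced from userland. The following is a minimal sketch, assuming a GCC or Clang toolchain that ships <cpuid.h>; the function name sketch_xsave_max_size is hypothetical and is not part of the NetBSD sources. It reads ECX of leaf 0x0d, sub-leaf 0, which reports the save area size needed for all state components the CPU supports.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper header; assumed to be available */
#endif

/*
 * Hypothetical helper: return CPUID[EAX=0x0d, ECX=0].ECX, the maximum
 * XSAVE area size in bytes for every state component the CPU supports.
 * Returns 0 when the leaf is unavailable or the build is not for x86.
 */
static uint32_t
sketch_xsave_max_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count returns 0 if the leaf is not supported. */
	if (__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
		return ecx;
#endif
	return 0;
}
```

On a machine that exposes AMX TILEDATA, a value such as the 11008 bytes mentioned above would be expected here; the exact number depends on the CPU and on what the hypervisor exposes.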
 
 We only do this for user threads, and only on machines where it's
 necessary, to avoid incurring much overhead.  There is still a tiny
 bit of overhead when saving and restoring the FPU state, because
 access to struct pcb::pcb_savefpu now goes through a pointer
 indirection instead of a fixed offset from the pcb, but this is
 probably a drop in the bucket compared to the memory traffic
 incurred by the FPU state save/restore anyway.
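
The inline-versus-external choice described above can be illustrated in plain userland C. This is only a sketch under stated assumptions: struct sketch_pcb, its field names, INLINE_SAVE_SIZE, and the use of aligned_alloc are all hypothetical stand-ins, and the real struct pcb and kernel allocator differ. The idea shown is that the pcb always carries a pointer to the save area, which points either at a small embedded buffer or at a separately allocated one when the required size is too large.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity; the real struct pcb layout differs. */
#define INLINE_SAVE_SIZE 576

struct sketch_pcb {
	/* Always points at the live save area (inline or external). */
	void *savefpu;
	/* Small embedded area, 64-byte aligned as XSAVE requires. */
	_Alignas(64) unsigned char savefpu_small[INLINE_SAVE_SIZE];
};

/*
 * Point pcb->savefpu at the embedded buffer when save_size fits,
 * otherwise at a separate 64-byte-aligned allocation.  The struct is
 * self-referential in the inline case, so it must not be copied.
 */
static int
sketch_pcb_init(struct sketch_pcb *pcb, size_t save_size)
{
	assert(save_size > 0);
	if (save_size <= sizeof(pcb->savefpu_small)) {
		pcb->savefpu = pcb->savefpu_small;
	} else {
		/* aligned_alloc wants a size that is a multiple of 64. */
		size_t rounded = (save_size + 63) & ~(size_t)63;

		pcb->savefpu = aligned_alloc(64, rounded);
		if (pcb->savefpu == NULL)
			return -1;
	}
	memset(pcb->savefpu, 0, save_size);
	return 0;
}

/* Release the external area, if any, and clear the pointer. */
static void
sketch_pcb_fini(struct sketch_pcb *pcb)
{
	if (pcb->savefpu != (void *)pcb->savefpu_small)
		free(pcb->savefpu);
	pcb->savefpu = NULL;
}
```

Every access to the save area in this sketch loads pcb->savefpu first, which is the extra pointer indirection the log message refers to; machines whose state fits inline pay only that load and no extra allocation.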
 
 For now, these paths are mostly disabled on i386.  We could enable
 them, but that would require either rewriting cpu_uarea_alloc/free
 for i386 or adopting a guard page like amd64 does, which might be
 costly and so should be undertaken only with some thought and care.
 And since Intel AMX instructions only work in 64-bit mode, it's not
 likely to be useful on i386.
 
 PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
 KVM/Qemu
 
 These changes, as a side effect, may fix:
 
 PR kern/57258: kthread_fpu_enter/exit problem
 
 by making sure to allocate an FPU save space that is large enough to
 guarantee fpu_kern_enter/leave work safely, instead of just using a
 union savefpu object on the stack (which, at 576 bytes, may be too
 small on some machines, particularly with AVX512 requiring ~2.5K).
 (But we'll have to do some extra work with kthread_fpu_enter/exit_md
 -- if we try doing them again on x86 -- to actually allocate the
 separate pcb on these machines!)
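
The 576-byte figure above matches the minimal XSAVE layout: a 512-byte legacy FXSAVE region followed by a 64-byte header. The sketch below mirrors that layout to show why a fixed-size object cannot hold larger states; struct sketch_xsave_min and sketch_fixed_area_fits are hypothetical names, not the definitions of union savefpu in the NetBSD tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the minimal XSAVE layout: legacy area + header. */
struct sketch_xsave_min {
	uint8_t  legacy_fxsave[512];	/* FXSAVE-format legacy region */
	uint64_t xstate_bv;		/* header: components present */
	uint64_t xcomp_bv;		/* header: compaction format bits */
	uint8_t  reserved[48];		/* remainder of the 64-byte header */
};

_Static_assert(sizeof(struct sketch_xsave_min) == 576,
    "512-byte legacy area plus 64-byte header");

/*
 * Return nonzero if a fixed-size save object of the minimal layout is
 * large enough for the size the CPU reports as required.
 */
static int
sketch_fixed_area_fits(size_t required)
{
	return required <= sizeof(struct sketch_xsave_min);
}
```

Any extended component (AVX, AVX512, AMX) adds bytes beyond the header, so a check like this fails on such machines and a larger, separately sized save area is needed.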
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.98 -r1.99 src/sys/arch/amd64/amd64/genassym.cf
 cvs rdiff -u -r1.32 -r1.33 src/sys/arch/amd64/include/pcb.h
 cvs rdiff -u -r1.136 -r1.137 src/sys/arch/i386/i386/genassym.cf
 cvs rdiff -u -r1.59 -r1.60 src/sys/arch/i386/include/pcb.h
 cvs rdiff -u -r1.139 -r1.140 src/sys/arch/x86/include/cpu.h
 cvs rdiff -u -r1.18 -r1.19 src/sys/arch/x86/include/cpu_extended_state.h
 cvs rdiff -u -r1.89 -r1.90 src/sys/arch/x86/x86/fpu.c
 cvs rdiff -u -r1.46 -r1.47 src/sys/arch/x86/x86/vm_machdep.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
 