tech-kern archive
Re: Support for ramdisks in PVH boot
Hi tech-kern@,
On Mon, 31 Mar 2025 04:40:30 -0000 (UTC), Pierre Pronchery wrote:
> This post is related to iMil's recent work on PVH support for
> NetBSD/amd64.
> I was unable to use his work to boot on ramdisks directly with QEMU's
> -initrd flag, when using -kernel.
>
> Well after a deep dive into it, I think I am almost there:
> https://git.edgebsd.org/gitweb/?p=src.git;a=commitdiff;h=629621f41089af50584214a4d32b50ae8ee414f2
>
> This patch:
> - extends sys/arch/amd64/amd64/genassym.cf for additional knowledge of
> Xen's hvm_start_info (notably nr_modules and modlist_paddr)
> - extends .start_genpvh in locore.S to copy the module entries, and their
> respective command lines and contents
> - teaches x86_machdep.c to load Xen modules when running as a
> VM_GUEST_GENPVH guest
>
> The code is not working yet unfortunately.
Well, now it does; with MICROVM, on an Intel-macOS host:
> $ qemu-system-x86_64 -m 512 -accel hvf -display none -serial stdio \
> -M microvm,rtc=off,acpi=off,pic=off -kernel netbsd-MICROVM -append \
> "console=com rw -v" -initrd ramdisk-cgdroot.fs -action reboot=shutdown \
> -D qemu.log -d cpu_reset,in_asm,guest_errors,unimp \
> -device virtio-blk-device,drive=hd0 \
> -drive file=ld0.img,format=raw,id=hd0
> qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]
> [ 1.0000000] WARNING: system needs entropy for security; see entropy(7)
> [ 1.0000000] [ Kernel symbol table missing! ]
> [ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
> [ 1.0000000] 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
> [ 1.0000000] 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023,
> [ 1.0000000] 2024, 2025
> [ 1.0000000] The NetBSD Foundation, Inc. All rights reserved.
> [ 1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
> [ 1.0000000] The Regents of the University of California. All rights reserved.
>
> [ 1.0000000] NetBSD 10.99.12 (MICROVM) #0: Wed Apr 9 08:52:24 UTC 2025
> [ 1.0000000] mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/MICROVM
> [ 1.0000000] total memory = 511 MB
> [ 1.0000000] avail memory = 480 MB
> [ 1.0000000] KERNBASE=0xffffffff80000000
> [ 1.0000000] modlist_paddr=0xffffffff80a00038 cmdline_paddr=0xffffffff80ee2075 cmdline="console=com rw -v virtio_mmio.device=512@0xfeb00e00:12"
> [ 1.0000000] Xen module info at boot (0xffffffff80a00038, 1)
> [ 1.0000000] timecounter: Timecounters tick every 10.000 msec
> [ 1.0000000] mainbus0 (root)
> [ 1.0000000] mainbus0: Intel MP Specification (Version 1.4) (QBOOT 000000000000)
> [ 1.0000000] cpu0 at mainbus0 apid 0
> [ 1.0000000] cpu0: Use lfence to serialize rdtsc
> [ 1.0000000] cpu0: QEMU Virtual CPU version 2.5+, id 0x60fb1
> [ 1.0000000] cpu0: node 0, package 0, core 0, smt 0
> [ 1.0000000] mpbios: bus 0 is type ISA
> [ 1.0000000] ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 0x20, 24 pins
> [ 1.0000000] isa0 at mainbus0
> [ 1.0000000] com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, 16-byte FIFO
> [ 1.0000000] com0: console
> [ 1.0000000] allocated pic ioapic0 type edge pin 4 level 8 to cpu0 slot 0 idt entry 129
> [ 1.0000000] pv0 at mainbus0
> [ 1.0000000] virtio0 at pv0
> [ 1.0000000] virtio0: kernel parameters: console=com rw -v virtio_mmio.device=512@0xfeb00e00:12
> [ 1.0000000] virtio0: viommio: 512@0xfeb00e00:12
> [ 1.0000000] virtio0: VirtIO-MMIO-v1
> [ 1.0000000] virtio0: block device (id 2, rev. 0x00)
> [ 1.0000000] ld0 at virtio0: features: 0x10002e54<INDIRECT_DESC,DISCARD,CONFIG_WCE,TOPOLOGY,FLUSH,BLK_SIZE,GEOMETRY,SEG_MAX>
> [ 1.0000000] ld0: Unknown SIZE_MAX, assuming 65536
> [ 1.0000000] ld0: max 254 segs of max 65536 bytes
> [ 1.0000000] virtio0: allocated 4227072 byte for virtqueue 0 for I/O request, size 1024
> [ 1.0000000] virtio0: using 4194304 byte (262144 entries) indirect descriptors
> [ 1.0000000] allocated pic ioapic0 type level pin 12 level 6 to cpu0 slot 1 idt entry 96
> [ 1.0000000] virtio0: interrupting on -1
> [ 1.0000000] ld0: 1953 MB, 3968 cyl, 16 head, 63 sec, 512 bytes/sect x 4000000 sectors
> [ 1.0000000] virtio1 at pv0
> [ 1.0000000] timecounter: Timecounter "lapic" frequency 1046204000 Hz quality -100
> [ 1.0000000] timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
> [ 1.0000030] timecounter: Timecounter "TSC" frequency 2410445480 Hz quality -100
> [ 1.0000030] boot device: ld0
> [ 1.0000030] md0: internal 5000 KB image area
> [ 1.0000030] root on md0a dumps on md0b
> [ 1.0000030] root file system type: ffs
> [ 1.0000030] kern.module.path=/stand/amd64/10.99.12/modules
> [ 1.0100030] WARNING: no TOD clock present
> [ 1.0100030] WARNING: using filesystem time
> [ 1.0100030] WARNING: CHECK AND RESET THE DATE!
> [ 1.0100030] warning: no /dev/console
> Created tmpfs /dev (1835008 byte, 3552 inodes)
> Could not mount the boot partition
> erase ^?, werase ^W, kill ^U, intr ^C
> This image contains utilities which may be needed
> to get you out of a pinch.
> #
Your help in reviewing this work before committing will be very welcome!
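For reviewers who would rather not read the assembly first, here is a rough C rendering of what the new copy loop in locore.S is meant to do. It is only a sketch under assumptions: physical memory is modelled as a flat byte array, the `sketch_*` names and the trimmed-down structures are made up for illustration (the real layouts are Xen's hvm_start_info and hvm_modlist_entry), and memmove() stands in for the `rep movs` copies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for Xen's structures; only the fields used here. */
struct sketch_modlist_entry {
	uint64_t paddr;		/* where the module content lives */
	uint64_t size;		/* module size in bytes */
	uint64_t cmdline_paddr;	/* 0 if the module has no command line */
	uint64_t reserved;
};

struct sketch_start_info {
	uint32_t nr_modules;
	uint64_t modlist_paddr;
	uint64_t cmdline_paddr;
};

/* Copy a NUL-terminated string to mem + dst; return the offset after it. */
static uint64_t
sketch_copy_str(uint8_t *mem, uint64_t src, uint64_t dst)
{
	size_t len = strlen((const char *)(mem + src)) + 1;

	memmove(mem + dst, mem + src, len);
	return dst + len;
}

/*
 * "mem" models physical memory, so physical addresses are offsets into it.
 * "dst" is the first free offset after the already-copied start info.
 * Returns the first free offset after everything that was relocated.
 */
static uint64_t
sketch_relocate(uint8_t *mem, struct sketch_start_info *si, uint64_t dst)
{
	struct sketch_modlist_entry e;
	uint64_t len, slot;
	uint32_t i;

	assert(mem != NULL && si != NULL);

	/* 1. Copy the whole hvm_modlist_entry[] array. */
	len = (uint64_t)si->nr_modules * sizeof(e);
	memmove(mem + dst, mem + si->modlist_paddr, len);
	si->modlist_paddr = dst;
	dst += len;

	/* 2. For each entry, copy its command line, then its content. */
	for (i = 0; i < si->nr_modules; i++) {
		slot = si->modlist_paddr + (uint64_t)i * sizeof(e);
		memcpy(&e, mem + slot, sizeof(e));

		if (e.cmdline_paddr != 0) {
			uint64_t next = sketch_copy_str(mem, e.cmdline_paddr, dst);
			e.cmdline_paddr = dst;
			dst = next;
		}
		memmove(mem + dst, mem + e.paddr, e.size);
		e.paddr = dst;
		dst += e.size;

		/* Write the updated addresses back into the copied entry. */
		memcpy(mem + slot, &e, sizeof(e));
	}

	/* 3. Finally copy the kernel command line after the modules. */
	len = dst;
	dst = sketch_copy_str(mem, si->cmdline_paddr, dst);
	si->cmdline_paddr = len;

	return dst;
}
```

The returned offset minus the starting address corresponds to the value the patch keeps in %edx to know how much space was consumed after __kernel_end.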
The patch:
From caa038822350a7f30a7975dc29386c052dca32de Mon Sep 17 00:00:00 2001
From: Pierre Pronchery <khorben%EdgeBSD.org@localhost>
Date: Mon, 31 Mar 2025 04:36:00 +0200
Subject: [PATCH] amd64: add support for -initrd with VM_GUEST_GENPVH
Tested on NetBSD/amd64
---
sys/arch/amd64/amd64/genassym.cf | 6 +++
sys/arch/amd64/amd64/locore.S | 65 +++++++++++++++++++++++++++++---
sys/arch/amd64/conf/MICROVM | 4 ++
sys/arch/x86/x86/x86_machdep.c | 32 ++++++++++++++++
4 files changed, 102 insertions(+), 5 deletions(-)
diff --git a/sys/arch/amd64/amd64/genassym.cf b/sys/arch/amd64/amd64/genassym.cf
index d8f31cd51a22..c93c79ffb32c 100644
--- a/sys/arch/amd64/amd64/genassym.cf
+++ b/sys/arch/amd64/amd64/genassym.cf
@@ -384,6 +384,12 @@ define SIR_XENIPL_HIGH SIR_XENIPL_HIGH
define EVTCHN_UPCALL_MASK offsetof(struct vcpu_info, evtchn_upcall_mask)
define HVM_START_INFO_SIZE sizeof(struct hvm_start_info)
define START_INFO_VERSION offsetof(struct hvm_start_info, version)
+define START_INFO_MODLIST_PADDR offsetof(struct hvm_start_info, modlist_paddr)
+define START_INFO_NR_MODULES offsetof(struct hvm_start_info, nr_modules)
+define HVM_MODLIST_ENTRY_SIZE sizeof(struct hvm_modlist_entry)
+define MODLIST_ENTRY_CMDLINE offsetof(struct hvm_modlist_entry, cmdline_paddr)
+define MODLIST_ENTRY_PADDR offsetof(struct hvm_modlist_entry, paddr)
+define MODLIST_ENTRY_SIZE offsetof(struct hvm_modlist_entry, size)
define MMAP_PADDR offsetof(struct hvm_start_info, memmap_paddr)
define MMAP_ENTRIES offsetof(struct hvm_start_info, memmap_entries)
define MMAP_ENTRY_SIZE sizeof(struct hvm_memmap_table_entry)
diff --git a/sys/arch/amd64/amd64/locore.S b/sys/arch/amd64/amd64/locore.S
index 6711b572324f..f3db58189b45 100644
--- a/sys/arch/amd64/amd64/locore.S
+++ b/sys/arch/amd64/amd64/locore.S
@@ -1106,10 +1106,60 @@ ENTRY(start_pvh)
shrl $2, %ecx
rep movsl
- /* Copy cmdline_paddr after hvm_start_info */
+ /* Copy hvm_modlist_entry[] after hvm_start_info */
+ movl $RELOC(__kernel_end), %ebx
+ movl START_INFO_MODLIST_PADDR(%ebx), %esi
+ movl %edi, START_INFO_MODLIST_PADDR(%ebx) /* Set new modlist_paddr in hvm_start_info */
+ movl START_INFO_NR_MODULES(%ebx), %eax /* Get nr_modules */
+ movl $HVM_MODLIST_ENTRY_SIZE, %ecx /* ecx = sizeof(hvm_modlist_entry) */
+ mull %ecx /* eax * ecx => edx:eax */
+ movl %eax, %ecx
+ shrl $2, %ecx
+ rep movsl
+
+ /* Copy the modules after the hvm_modlist_entry[] */
+ xorl %ecx, %ecx /* ecx = i = 0 */
+ .modlist_copy:
+ movl $RELOC(__kernel_end), %ebx /* ebx = &hvm_start_info */
+ movl START_INFO_NR_MODULES(%ebx), %eax /* eax = nr_modules */
+ cmpl %eax, %ecx /* if (ecx == nr_modules) */
+ je .modlist_copy_done /* goto modlist_copy_done */
+ push %ecx
+ /* Copy the module */
+ movl START_INFO_MODLIST_PADDR(%ebx), %ebx /* ebx = &hvm_modlist_entry[0] */
+ movl $HVM_MODLIST_ENTRY_SIZE, %eax /* eax = sizeof(hvm_modlist_entry) */
+ mull %ecx /* eax *= ecx */
+ addl %eax, %ebx /* ebx = &hvm_modlist_entry[i] */
+ /* Copy the module's cmdline */
+ movl MODLIST_ENTRY_CMDLINE(%ebx), %esi
+ xorl %eax, %eax
+ movl %eax, MODLIST_ENTRY_CMDLINE(%ebx)
+ cmpl %eax, %esi
+ je .modlist_cmdline_copy_done
+
+ movl %edi, MODLIST_ENTRY_CMDLINE(%ebx) /* Set new cmdline_paddr in hvm_modlist_entry */
+ .modlist_cmdline_copy:
+ movb (%esi), %al
+ movsb
+ cmp $0, %al
+ jne .modlist_cmdline_copy
+ .modlist_cmdline_copy_done:
+
+ /* Copy the module's content */
+ movl MODLIST_ENTRY_PADDR(%ebx), %esi /* esi = hvm_modlist_entry[i].paddr */
+ movl %edi, MODLIST_ENTRY_PADDR(%ebx) /* Set new paddr in hvm_modlist_entry */
+ movl MODLIST_ENTRY_SIZE(%ebx), %ecx /* ecx = hvm_modlist_entry[i].size */
+ rep movsb
+
+ pop %ecx /* i++ */
+ inc %ecx
+ jmp .modlist_copy
+ .modlist_copy_done:
+
+ /* Copy cmdline_paddr after the modules */
+ movl $RELOC(__kernel_end), %ebx
movl CMDLINE_PADDR(%ebx), %esi
- movl $RELOC(__kernel_end), %ecx
- movl %edi, CMDLINE_PADDR(%ecx) /* Set new cmdline_paddr in hvm_start_info */
+ movl %edi, CMDLINE_PADDR(%ebx) /* Set new cmdline_paddr in hvm_start_info */
.cmdline_copy:
movb (%esi), %al
movsb
@@ -1136,11 +1186,17 @@ ENTRY(start_pvh)
/* announce ourself */
movl $VM_GUEST_GENPVH, RELOC(vm_guest)
+ /* determine the amount of data needed */
+ movl %edi, %edx
+ subl $RELOC(__kernel_end), %edx
+
jmp .save_hvm_start_paddr
.start_xen32:
pop %ebx
movl $VM_GUEST_XENPVH, RELOC(vm_guest)
+ /* XXX assume hvm_start_info+dependant structure fits in a single page */
+ movl $PAGE_SIZE, %edx
.save_hvm_start_paddr:
/*
@@ -1166,9 +1222,8 @@ ENTRY(start_pvh)
movl $RELOC(HYPERVISOR_shared_info_pa),%ebp
movl %ebx,(%ebp)
movl $0,4(%ebp)
- /* XXX assume hvm_start_info+dependant structure fits in a single page */
.add_hvm_start_info_page:
- addl $PAGE_SIZE, %ebx
+ addl %edx, %ebx
addl $PGOFSET,%ebx
andl $~PGOFSET,%ebx
addl $KERNBASE_LO,%ebx
diff --git a/sys/arch/amd64/conf/MICROVM b/sys/arch/amd64/conf/MICROVM
index 65982d42b4a9..864002a5eb25 100644
--- a/sys/arch/amd64/conf/MICROVM
+++ b/sys/arch/amd64/conf/MICROVM
@@ -23,3 +23,7 @@ machine amd64 x86 xen
include "arch/x86/conf/MICROVM.common"
options EXEC_ELF64 # exec ELF binaries
+options MODULAR # new style module(7) framework
+
+options MEMORY_DISK_HOOKS # enable md specific hooks
+options MEMORY_DISK_DYNAMIC # enable dynamic resizing
diff --git a/sys/arch/x86/x86/x86_machdep.c b/sys/arch/x86/x86/x86_machdep.c
index ab5ffaf35410..7f3d2308ba46 100644
--- a/sys/arch/x86/x86/x86_machdep.c
+++ b/sys/arch/x86/x86/x86_machdep.c
@@ -215,6 +215,32 @@ mm_md_physacc(paddr_t pa, vm_prot_t prot)
}
#ifdef MODULAR
+#ifdef XEN
+void x86_add_xen_modules(void);
+void x86_add_xen_modules(void)
+{
+ uint32_t i;
+#if defined(MEMORY_DISK_HOOKS) && defined(MEMORY_DISK_DYNAMIC)
+ struct hvm_modlist_entry *modlist;
+#endif
+
+ if (hvm_start_info->nr_modules == 0) {
+ aprint_verbose("No Xen module info at boot\n");
+ return;
+ }
+#if defined(MEMORY_DISK_HOOKS) && defined(MEMORY_DISK_DYNAMIC)
+ modlist = (void *)((uintptr_t)hvm_start_info->modlist_paddr + KERNBASE);
+#endif
+ for (i = 0; i < hvm_start_info->nr_modules; i++) {
+ /* XXX can be a filesystem image or ELF module or splashscreen */
+#if defined(MEMORY_DISK_HOOKS) && defined(MEMORY_DISK_DYNAMIC)
+ md_root_setconf(
+ (void *)((uintptr_t)modlist[i].paddr + KERNBASE),
+ modlist[i].size);
+#endif
+ }
+}
+#endif
/*
* Push any modules loaded by the boot loader.
*/
@@ -224,6 +250,12 @@ module_init_md(void)
struct btinfo_modulelist *biml;
struct bi_modulelist_entry *bi, *bimax;
+#ifdef XEN
+ if (vm_guest_is_pvh()) {
+ x86_add_xen_modules();
+ }
+#endif /* XEN */
+
biml = lookup_bootinfo(BTINFO_MODULELIST);
if (biml == NULL) {
aprint_debug("No module info at boot\n");
--
2.48.1
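
A note on the .add_hvm_start_info_page change: with a ramdisk copied after __kernel_end, the data no longer fits the old one-page assumption, so the patch advances by the number of bytes actually used (%edx) and then rounds up to a page boundary. The arithmetic, as I understand it, looks like the sketch below. The `SKETCH_*` names and the helper are hypothetical, and the 4096-byte page size is the usual amd64 value, hardcoded here for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u			/* amd64 PAGE_SIZE */
#define SKETCH_PGOFSET   (SKETCH_PAGE_SIZE - 1)	/* mask of in-page offset bits */

/*
 * Given the start of the relocated boot data and the number of bytes
 * consumed, return the first page-aligned address after it. This mirrors
 * "addl %edx,%ebx; addl $PGOFSET,%ebx; andl $~PGOFSET,%ebx".
 */
static uint32_t
sketch_first_free(uint32_t start, uint32_t used)
{
	/* Guard against 32-bit wraparound in the sketch. */
	assert(used <= UINT32_MAX - start - SKETCH_PGOFSET);

	return (start + used + SKETCH_PGOFSET) & ~SKETCH_PGOFSET;
}
```

For the Xen PVH path the patch loads PAGE_SIZE into %edx, so a page-aligned start still advances by exactly one page there, as before.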
Cheers & HTH,
--
khorben