I think I've found why Xen domUs can't mount some file-backed disk
images!

This realization must have come to my unconscious as I was sleeping,
since just as I awoke I realised what must be happening.

The clue I needed was from back in the early March discussion with
Michael van Elst about "problems with GPT (and maybe dkctl wedges) on
LVM volumes" where he says "The LVM volume is not a disk", and then my
realization that a vnd(4) interprets the file AS A DISK, and so this
relies on the "whole drive" partition really being the whole drive, and
maybe it's not.

I don't know if this is just a "new" problem or not -- but it is
certainly a real problem.  So far I've only tested on machines running a
relatively recent -current kernel (from approx 2021-03-10 sources).

For example, when I try to export the FreeBSD mini-memstick.img file to
a domU (with the following "disk" spec) that boots a recent FreeBSD PVH
kernel I get:

    type = "pvh"
    name = "fbsd-test"
    memory = 2000
    maxmem = 8000
    vcpus = 4
    vif = [ 'bridge=bridge0' ]
    kernel = "/images/freebsd-12.2-kernel"
    cmdline = 'vfs.root.mountfrom=ufs:ufs/FreeBSD_Install,vfs.root.mountfrom.options=ro,boot_verbose=1'
    disk = [
        'format=raw, vdev=hda, access=ro, target=/images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img',
    ]

    xbd0: 386MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
    xbd0: attaching as ada0
    xbd0: features: flush
    xbd0: synchronize cache commands enabled.
    GEOM: new disk ada0
    xn0: backend features: feature-sg
    Trying to mount root from ufs:ufs/FreeBSD_Install [ro]...
    GEOM_PART: partition 2 has end offset beyond last LBA: 791120 > 790527
    GEOM_PART: integrity check failed (ada0, MBR)
    mountroot: waiting for device ufs/FreeBSD_Install...
    Mounting from ufs:ufs/FreeBSD_Install failed with error 19.

    Loader variables:
      vfs.root.mountfrom=ufs:ufs/FreeBSD_Install
      vfs.root.mountfrom.options=ro

    Manual root filesystem specification:
      <fstype>:<device> [options]
          Mount <device> using filesystem <fstype>
          and with the specified (optional) option list.

        eg. ufs:/dev/da0s1a
            zfs:zroot/ROOT/default
            cd9660:/dev/cd0 ro
              (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

      ?               List valid disk boot devices
      .               Yield 1 second (for background tasks)
      <empty line>    Abort manual input

    mountroot> random: unblocking device.
    arc4random: no preloaded entropy cache
    ?

    List of GEOM managed disk devices:
      ada0

    mountroot>

However if I copy that exact same mini-memstick.img file to an LVM
volume and then export that (here as "sda" in the following), I get
success:

    type = "pvh"
    name = "fbsd-test"
    memory = 2000
    maxmem = 8000
    vcpus = 4
    vif = [ 'bridge=bridge0' ]
    kernel = "/images/freebsd-12.2-kernel"
    cmdline = 'vfs.root.mountfrom=ufs:ufs/FreeBSD_Install,vfs.root.mountfrom.options=ro,boot_verbose=1'
    disk = [
        # vg0-fbsd--test.1 has mini-memstick.img copied to it
        'format=raw, vdev=sda, access=ro, target=/dev/mapper/vg1-fbsd--test.0',
        # this is a blank LVM LV
        'format=raw, vdev=sdb, access=rw, target=/dev/mapper/vg0-fbsd--test.1',
        # this is a 4gb file of zeros:
        'format=raw, vdev=sdc, access=rw, target=/images/fbsd-test.2',
    ]

    xbd0: 40960MB <Virtual Block Device> at device/vbd/2048 on xenbusb_front0
    xbd0: attaching as da0
    xbd0: features: flush
    arc4random: no preloaded entropy cache
    xn0: bpf attached
    xn0: Ethernet address: 00:16:3e:2d:b0:d2
    xbd0: synchronize cache commands enabled.
    GEOM: new disk da0
    xenbusb_back0: <Xen Backend Devices> on xenstore0
    xenballoon0: <Xen Balloon Device> on xenstore0
    xbd1: 30720MB <Virtual Block Device> at device/vbd/2064 on xenbusb_front0
    xbd1: attaching as da1
    xbd1: features: flush
    xbd1: synchronize cache commands enabled.
    xbd2: 4096MB <Virtual Block Device> at device/vbd/2080 on xenbusb_front0
    xbd2: attaching as da2
    xbd2: features: flush
    xbd2: synchronize cache commands enabled.
    xn0: backend features: feature-sg
    Trying to mount root from ufs:ufs/FreeBSD_Install [ro]...
    GEOM: new disk da1
    GEOM: new disk da2
    xen_et0: providing initial system time
    start_init: trying /sbin/init
    arc4random: no preloaded entropy cache
    Starting file system checks:
    /dev/ufs/FreeBSD_Install: FILE SYSTEM CLEAN; SKIPPING CHECKS
    /dev/ufs/FreeBSD_Install: clean, 5744 free (128 frags, 702 blocks, 0.1% fragmentation)

So, the problem appears to be that the /dev/vndXd partition isn't making
the whole file visible fully transparently.

The way Xen(tools) makes a file available to the domU is limited by the
fact that xbdback(4) can only interface with block devices, and as such
Xen(tools) uses a script to interpose a vnd(4) device over a file and
make it look like a block device.

Now this script tells xbdback(4) to open the "d" partition, which in
theory should present the whole file as a raw block device.

However it is not doing so, critically for the first few blocks.
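
The "is the device a transparent view of the file?" question can be
checked mechanically.  The following is only a minimal sketch of such a
check, not anything Xen(tools) or vnd(4) provides: the function names
first_difference() and leading_zero_bytes() are made up for this
illustration, and it assumes the device path given really is the vnd(4)
unit configured on the image being compared.  Reads are done in
multiples of 512 bytes so that a raw (character) device can be used.

```python
def first_difference(path_a, path_b, limit=1 << 20, chunk=65536):
    """Return the offset of the first differing byte, or None.

    Only the first `limit` bytes are compared.  If one input ends
    early, the comparison stops at the shorter length (much like the
    "EOF on ..." case of cmp(1)) and None is returned.
    `chunk` and `limit` are multiples of 512 so raw devices can be read.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while offset < limit:
            want = min(chunk, limit - offset)
            da = a.read(want)
            db = b.read(want)
            n = min(len(da), len(db))
            for i in range(n):
                if da[i] != db[i]:
                    return offset + i
            if n < want:
                # one side hit EOF with no difference found so far
                return None
            offset += n
    return None


def leading_zero_bytes(path, limit=1 << 20, chunk=65536):
    """Count how many bytes at the start of `path` are zero (up to `limit`)."""
    count = 0
    with open(path, "rb") as f:
        while count < limit:
            data = f.read(min(chunk, limit - count))
            if not data:
                break
            rest = data.lstrip(b"\0")
            count += len(data) - len(rest)
            if rest:
                # found the first non-zero byte
                break
    return count
```

With this, first_difference(image, "/dev/rvnd0d") returning 0 and
leading_zero_bytes("/dev/rvnd0d") returning 8192 would correspond to a
device that starts with 8192 zero bytes where the image has boot code.
A result of None from first_difference() is what a transparent mapping
should give.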

/dev/vnd0d is all zeros for the first 8192 bytes, but the original image
is not:

    # fdisk -F /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img
    Disk: /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img
    NetBSD disklabel disk geometry:
    cylinders: 49, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
    total sectors: 791121, bytes/sector: 512

    BIOS disk geometry:
    cylinders: 49, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
    total sectors: 791121

    Partitions aligned to 16065 sector boundaries, offset 63

    Partition table:
    0: EFI system partition (sysid 239)
        start 1, size 1600 (1 MB, Cyls 0/0/2-0/25/26)
    1: FreeBSD or 386BSD or old NetBSD (sysid 165)
        start 1601, size 789520 (386 MB, Cyls 0/25/27-49/62/30), Active
    2: <UNUSED>
    3: <UNUSED>
    First active partition: 1
    Drive serial number: 2425393296 (0x90909090)

    # fdisk vnd0
    fdisk: primary partition table invalid, no magic in sector 0
    fdisk: Cannot determine the number of heads
    Disk: /dev/rvnd0d
    NetBSD disklabel disk geometry:
    cylinders: 4096, heads: 64, sectors/track: 32 (2048 sectors/cylinder)
    total sectors: 8388608, bytes/sector: 512

    BIOS disk geometry:
    cylinders: 522, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
    total sectors: 8388608

    Partitions aligned to 16065 sector boundaries, offset 63

    Partition table:
    0: <UNUSED>
    1: <UNUSED>
    2: <UNUSED>
    3: <UNUSED>
    Bootselector disabled.
    No active partition.
    Drive serial number: 0 (0x00000000)

    # disklabel vnd0
    # /dev/rvnd0d:
    type: vnd
    disk: vnd
    label: fictitious
    flags:
    bytes/sector: 512
    sectors/track: 32
    tracks/cylinder: 64
    sectors/cylinder: 2048
    cylinders: 4096
    total sectors: 8388608
    rpm: 3600
    interleave: 1
    trackskew: 0
    cylinderskew: 0
    headswitch: 0           # microseconds
    track-to-track seek: 0  # microseconds
    drivedata: 0

    4 partitions:
    #        size    offset     fstype [fsize bsize cpg/sgs]
     a:   8388608         0     4.2BSD      0     0     0  # (Cyl.      0 -   4095)
     d:   8388608         0     unused      0     0        # (Cyl.      0 -   4095)
    disklabel: boot block size 0
    disklabel: super block size 0

    # cmp /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img /dev/rvnd0d
    /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img /dev/rvnd0d differ: char 1, line 1

    # cmp /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img /dev/mapper/rvg1-fbsd--test.0
    cmp: EOF on /images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img

(which just means the LVM LV is bigger than the IMG, but they both were
the same through the whole length of the IMG)

    # dd if=/images/FreeBSD-12.2-RELEASE-amd64-mini-memstick.img count=2 msgfmt=quiet | od -c
    0000000 374 1 300 216 300 216 330 216 320 274 \0 | 276 032 | 277
    0000020 032 006 271 346 001 363 244 351 \0 212 1 366 273 276 007 261
    0000040 004 8 / t \b 177 u 205 366 u q 211 336 200 303 020
    0000060 342 357 205 366 u 002 315 030 200 372 200 r 013 212 6 u
    0000100 004 200 306 200 8 362 r 002 212 024 211 347 212 t 001 213
    0000120 L 002 273 \0 | 366 006 275 007 200 t - Q S 273 252
    0000140 U 264 A 315 023 r 201 373 U 252 u 032 366 301 001
    0000160 t 025 [ f j \0 f 377 t \b 006 S j 001 j 020
    0000200 211 346 270 \0 B 353 005 [ Y 270 001 002 315 023 211 374
    0000220 r 017 201 277 376 001 U 252 u \f 377 343 276 271 006 353
    0000240 021 276 321 006 353 \f 276 360 006 353 007 273 007 \0 264 016
    0000260 315 020 254 204 300 u 364 353 376 I n v a l i d
    0000300 p a r t i t i o n t a b l e
    0000320 \0 E r r o r l o a d i n g o
    0000340 p e r a t i n g s y s t e m \0
    0000360 M i s s i n g o p e r a t i n
    0000400 g s y s t e m \0 220 220 220 220 220 220 220
    0000420 220 220 220 220 220 220 220 220 220 220 220 220 220 220 220 220
    *
    0000660 220 220 220 220 220 220 220 220 220 220 220 220 220 200 \0 377
    0000700 377 377 357 377 377 377 001 \0 \0 \0 @ 006 \0 \0 200 377
    0000720 377 377 245 377 377 377 A 006 \0 \0 020 \f \f \0 \0 \0
    0000740 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    0000760 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 U 252
    0001000 353 < 220 B S D 4 . 4 \0 002 001 001 \0
    0001020 002 \0 002 @ 006 360 005 \0 ? \0 001 \0 \0 \0 \0 \0
    0001040 \0 \0 \0 \0 \0 \0 ) 356 021 A 275 E F I S Y
    0001060 S F A T 1 2 372 1
    0001100 300 216 320 274 \0 | 373 216 330 350 \0 \0 ^ 203 306 031
    0001120 273 007 \0 374 254 204 300 t 006 264 016 315 020 353 365 0
    0001140 344 315 026 315 031 \r \n N o n - s y s t e
    0001160 m d i s k \r \n P r e s s a n
    0001200 y k e y t o r e b o o t \r
    0001220 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    0001240 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    *
    0001760 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 U 252
    0002000

    # dd if=/dev/rvnd0d count=17 msgfmt=quiet| od -c
    0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    *
    0020000 \0 \0 \0 \0 \0 \0 \0 \0 \b \0 \0 \0 020 \0 \0 \0
    0020020 030 \0 \0 \0 230 005 \0 \0 \0 \0 \0 \0 377 377 377 377
    0020040 367 360 p ` \0 \0 \0 007 200 037 \0 027 \0 \0 \0
    0020060 \0 @ \0 \0 \0 \b \0 \0 \b \0 \0 \0 005 \0 \0 \0
    0020100 \0 \0 \0 \0 < \0 \0 \0 \0 300 377 377 \0 370 377 377
    0020120 016 \0 \0 \0 013 \0 \0 \0 004 \0 \0 \0 \0 020 \0 \0
    0020140 003 \0 \0 \0 002 \0 \0 \0 \0 \b \0 \0 \0 \0 \0 \0
    0020160 \0 \0 \0 \0 \0 020 \0 \0 200 \0 \0 \0 004 \0 \0 \0
    0020200 \0 \0 \0 \0 300 220 005 \0 001 \0 \0 \0 \0 \0 \0 \0
    0020220 367 360 p ` _ ` A q 230 005 \0 \0 \0 \b \0 \0
    0020240 \0 @ \0 \0 \0 \0 \0 \0 300 220 005 \0 300 220 005 \0
    0020260 027 \0 \0 \0 001 \0 \0 \0 \0 X \0 \0 0 d 001 \0
    0020300 001 \0 \0 \0 377 357 003 \0 375 347 007 \0 016 \0 \0 \0
    0020320 \0 001 \0 200 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    0020340 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
    *
    0021000

In fact the vnd0d device seems to give garbage forever -- it seems to
have been completely confused by trying to access a real disk image!

As a side note unfortunately even though access to this LVM-backed
mini-memstick.img file now seems OK enough to get the install booted and
a shell running, access to other FreeBSD xbd(4) devices is still not
working from FreeBSD (i.e. a fresh newfs'ed FS appears corrupt to an
immediate fsck, without mounting, and even fsck of the mounted root in
this IMG fails enormously).

    # df
    Filesystem               512-blocks   Used  Avail Capacity  Mounted on
    /dev/ufs/FreeBSD_Install     782968 737016 -16680   102%    /
    devfs                             2      2      0   100%    /dev
    tmpfs                         65536    232  65304     0%    /var
    tmpfs                         40960      8  40952     0%    /tmp
    # fsck /dev/ufs/FreeBSD_Install
    ** /dev/ufs/FreeBSD_Install

    SAVE DATA TO FIND ALTERNATE SUPERBLOCKS? [yn] n

    ADD CYLINDER GROUP CHECK-HASH PROTECTION? [yn] n

    ** Last Mounted on
    ** Root file system
    ** Phase 1 - Check Blocks and Sizes
    PARTIALLY TRUNCATED INODE I=28
    SALVAGE? [yn] n

    PARTIALLY TRUNCATED INODE I=112
    SALVAGE? [yn] ^Cda0: disk error cmd=write 8145-8152 status: fffffffe
    # ***** FILE SYSTEM MARKED DIRTY *****
    #

-- 
                                        Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>