I had the occasion to reboot one of my shiny new Xen servers today for
the first time in a month and I found that it failed to boot because of
the appearance since the previous successful boot of a new dk(4)
attachment created for a GPT partition on another drive.

	boot device: dk0
	root on dk0
	Supported file systems: union umap tmpfs smbfs puffs ptyfs procfs overlay null ntfs nfs msdos mfs lfs kern cd9660
	no file system for dk0 (dev 0xa800)
	cannot mount root, error = 79
	root device (default dk0):

The problem here is that the system boots from sd0 and root is on sd0a!!!

Worse yet, dk0 is not even on sd0, it's a wedge on sd1:

	sd1 at scsibus1 target 1 lun 0: <DELL, PERC 6/i, 1.11> disk fixed
	sd1: fabricating a geometry
	sd1: 1861 GB, 1905664 cyl, 64 head, 32 sec, 512 bytes/sect x 3902799872 sectors
	sd1: fabricating a geometry
	sd1: GPT GUID: e171fce5-0937-49de-ab2a-399ac308a695
	dk0 at sd1: percraid0
	dk0: 3902795776 blocks at 2048, type:

The server is running a recent-ish NetBSD 7.99.5 XEN3_DOM0 kernel (from
Feb. 20), under Xen-4.5.

I used the following commands to put a GPT label on sd1 and make a wedge
there for the dk0 device that I then use for LVM:

	dd if=/dev/zero of=/dev/rsd1d bs=8k count=1
	gpt create sd1
	gpt add -a 512k -l percraid0 sd1
	dkctl sd1 makewedges

As far as I know this should not make the wedge appear bootable, and I
would not expect the kernel to treat this wedge as special in any way --
i.e. especially not to override the boot device specified by the loader.

	# dkctl sd1 listwedges
	/dev/rsd1d: 1 wedge:
	dk0: percraid0, 3902795776 blocks at 2048, type:

Note the wedge "type" is blank. The manual doesn't seem to list a wedge
type that would be valid for LVM use, though maybe ccd or swap or unused
would suffice, but except for this boot problem it works with no type.
I didn't do anything special to not select a type -- just the
"makewedges".

I'm able to work around this with a "bootdev=sd0" in /boot.cfg, but that
doesn't seem like the right way, and I don't think it should be
necessary.

Google searches suggest I'm not the only person who has been tripped up
by this issue.

Am I missing something here that I could do to change the wedge
configuration to avoid this issue?

Is it still so difficult to discover which device the boot loader booted
the kernel from on such a semi-modern amd64 machine that the kernel can
make such mistakes as this?

If dk(4) is auto-configuring can it not at least look to see if there's
a valid filesystem on the device before it shoves itself in the front of
the line as the supposed "boot device"?

Should there be a wedge "type" for LVM?

-- 
						Greg A. Woods
						Planix, Inc.

<woods%planix.com@localhost>     +1 250 762-7675        http://www.planix.com/