extending LVM logical volumes for Xen root partitions is NOT so simple!
TL;DR: lvresize on Xen domU root volumes requires re-writing the disk label!
So, during an upgrade of my main build server, which is a Xen domU with
LVM-backed filesystems, I decided to increase the size of its root
filesystem. I had tested this many months ago and all went well,
although the test was with a non-root filesystem on a throw-away domU.
In both the test and the upgrade I had done the fsck and the resize_ffs
from the dom0.
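For the record, the procedure from the dom0 was roughly the following
(approximate, reconstructed with the volume names shown later in this
message):

xentastic# lvm lvresize -L 21G vg0/lv20
xentastic# fsck_ffs -fy /dev/mapper/rvg0-lv20
xentastic# resize_ffs -y /dev/mapper/rvg0-lv20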
However, the first boot quite surprisingly dropped into single-user mode:
Mon Feb 11 01:31:05 PST 2019
Starting root file system check:
CANNOT READ: BLK 41569216
/dev/rxbd0a: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Automatic file system check failed; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
[1] Terminated rc_real_work "${@}" 2>&1 |
Done(1) rc_postprocess
Enter pathname of shell or RETURN for /bin/sh:
The more verbose output from a manual attempt gives more clues:
future# fsck -n /dev/rxbd0a
** /dev/rxbd0a (NO WRITE)
CANNOT READ: BLK 41569216
CONTINUE? yes
THE FOLLOWING DISK SECTORS COULD NOT BE READ: 41569216, 41569217, 41569218, 41569219,
/dev/rxbd0a: CANNOT FIGURE OUT SECTORS PER CYLINDER
Basically I had doubled the size, and now it seems none of the
superblocks fsck wants to read can be read (from the domU). Notably,
the failing sector, 41569216, is beyond the old 20971520-sector size
recorded in the label, but well within the resized volume.
Yesterday I finally figured out that it must be due to one kernel or
another (likely the domU's) believing the original disklabel, which I'm
guessing was written by sysinst during the first install of the system.
My test of expanding an LVM-backed filesystem had been on a
non-sysinst-created filesystem, and I have not been putting any labels
on any of the filesystem devices added after boot. Without a label on
disk, nothing restricts access to the whole logical volume, and all is
well in the domU after the filesystem has been resized to match the LV.
What's confusing is that despite the disklabel appearing in the dom0,
and appearing identical to how it appears in the domU, the dom0 system
completely ignores it and just gets on with things. So I'm not sure
whether it is the domU kernel, the dom0 device mapper or backend
device, or something else that is restricting reads to the original
device size; but given that the dom0 has full access to the whole
resized logical volume, it is most likely the domU driver that has read
the on-disk label and used it.
And thus I'm hoping that it is the on-disk disklabel that is setting
this limit (since I can't find any other source of the old size still in
the dom0). (I haven't tried looking at the related kernel code -- the
last time I read that code I had too many urges to "fix" it! :-))
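One way to compare notes is to read the label both ways in the domU --
plain disklabel shows the kernel's in-core copy, while -r reads the raw
on-disk one:

future# disklabel xbd0       # in-core label, as the xbd driver sees it
future# disklabel -r xbd0    # label as actually written on the disk

If the in-core copy shows the old total-sector count then it was almost
certainly taken from the on-disk label.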
The disklabel (on disk, as seen from the dom0) is as follows:
xentastic# disklabel -r /dev/mapper/rvg0-lv20
# /dev/mapper/rvg0-lv20:
type: unknown
disk: future root00
label:
flags:
bytes/sector: 512
sectors/track: 2048
tracks/cylinder: 1
sectors/cylinder: 2048
cylinders: 10240
total sectors: 20971520
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
16 partitions:
# size offset fstype [fsize bsize cpg/sgs]
a: 20971520 0 4.2BSD 2048 16384 0 # (Cyl. 0 - 10239)
c: 20971520 0 unused 0 0 # (Cyl. 0 - 10239)
d: 20971520 0 unused 0 0 # (Cyl. 0 - 10239)
As you can see, the label still says 10 GiB, while the new logical
volume is at 21 GiB:
xentastic# lvm lvdisplay -v vg0/lv20
Using logical volume(s) on command line
--- Logical volume ---
LV Name /dev/vg0/lv20
VG Name vg0
LV UUID xguHwv-fP4f-2cO0-SLzy-pNye-IAD6-H01gsB
LV Write Access read/write
LV Status available
# open 0
LV Size 21.00 GiB
Current LE 5376
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 0
Block device 169:8
xentastic# dmsetup -v status /dev/mapper/vg0-lv20
Name: vg0-lv20
State: ACTIVE
Read Ahead: 0
Tables present: LIVE
Open count: 0
Event number: 0
Major, minor: 169, 8
Number of targets: 2
0 20971520 linear
20971520 23068672 linear
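(Checking the arithmetic: the label's 20971520 sectors times 512 bytes
is exactly 10 GiB, while the two device-mapper segments above total
20971520 + 23068672 = 44040192 sectors, i.e. exactly 21 GiB, matching
the LV size.)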
My swap partition has a label saved to disk too, though I'm not sure how
that happened, and it appears to be a copy of the fictitious label:
future# disklabel -r xbd1
# /dev/rxbd1d:
type: ESDI
disk: Xen Virtual ESDI
label: fictitious
flags:
bytes/sector: 512
sectors/track: 2048
tracks/cylinder: 1
sectors/cylinder: 2048
cylinders: 16384
total sectors: 33554432
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
4 partitions:
# size offset fstype [fsize bsize cpg/sgs]
a: 33554432 0 swap # (Cyl. 0 - 16383)
d: 33554432 0 unused 0 0 # (Cyl. 0 - 16383)
Given the current and normally low rate of change on my system's root
filesystem (it has a separate /var, etc.), and the fact that it fscks
fine from the dom0, I've gone ahead and brought the domU system up
multi-user without any problem. However, I want to do another upgrade
on it, and that'll churn the filesystem, so I want to fix this issue
first.
My instinct is to just replace the label with a block of zeros (after
making a backup of it in the dom0, of course). However, I'm unsure
whether any tools might be assuming a root disklabel exists (e.g. does
sysinst make use of it for an upgrade?).
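If I do go that route, something like the following from the dom0
should do it -- just a sketch, assuming the x86 convention that the
label lives in sector 1 of the device (check LABELSECTOR for your
port), and with a hypothetical backup file name:

xentastic# dd if=/dev/mapper/rvg0-lv20 of=/root/lv20-head.bak bs=512 count=2
xentastic# dd if=/dev/zero of=/dev/mapper/rvg0-lv20 bs=512 seek=1 count=1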
Perhaps, though, the best option is to re-write the label with a newly
minted copy of what would be the fictitious label. (I think I could do
that by blanking the label, then using disklabel to read the fictitious
label from the driver, and then writing it back to the disk.)
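That would be something like the following in the domU, probably after
a reboot so the driver forgets the old in-core label (a sketch only;
disklabel's output is accepted back as a protofile by -R, and -r makes
it write the on-disk label too):

future# disklabel xbd0 > /tmp/xbd0.proto
future# disklabel -R -r xbd0 /tmp/xbd0.proto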
I'd try one of those right now, but I'm typing this message on that
system.
Ideally I would like to see the OS handle all these issues
automatically somehow, though it wouldn't be the end of the world if
another step were required to resize a root filesystem.
However, if another step is always going to be needed, then it should
not require the admin to calculate, or even copy, any value to
facilitate it. I.e. one should not have to copy the value for the new
size into the disklabel manually -- a new option to disklabel(8), or
some new tool, should do that automatically (i.e. if there's only one
non-whole-disk partition on the disk, and it is the same size as the
whole-disk partition, then the size of both partitions should be
adjusted to match the new size of the logical volume). Perhaps there
could be a new option to resize_ffs to have it call disklabel to fix
the label, thus reducing the number of actions required of the admin
and making it very easy to set up boot-time scripts for cloud hosting
that would automatically resize (or at least grow) all filesystems
based on the current size of their backing stores, as in the sketch
below.
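Such a boot-time script might look something like this rough, untested
sketch. It assumes the kernel's idea of the total disk size already
reflects the resized backing store, that the label has the simple
offset-0 layout shown above, and that it runs before the root
filesystem is mounted read-write; the names are all hypothetical:

#!/bin/sh
# Grow every offset-0 partition in the label to the full device size,
# then grow the FFS in partition 'a' to fill it.
disk=xbd0
proto=/tmp/${disk}.proto

disklabel ${disk} > ${proto}
total=$(awk '/^total sectors:/ {print $3}' ${proto})

# Crude edit of the size column for offset-0 partitions; a real tool
# would parse and validate the label properly.
sed -E "s/^( *[a-p]: +)[0-9]+( +0 )/\1${total}\2/" ${proto} > ${proto}.new
disklabel -R -r ${disk} ${proto}.new

resize_ffs -y /dev/r${disk}a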
--
Greg A. Woods <gwoods%acm.org@localhost>
+1 250 762-7675 RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost> Avoncote Farms <woods%avoncote.ca@localhost>