
musing about performance of various methods of copying (local) filesystems....



So, for a long time now I've guessed that the most performant way to
copy a whole filesystem from one device to another, using basic system
tools, would be:

	cd /migrate/target
	dump -0 -a -b 64 -r 512 -f - /dev/rxbd5a | restore -r -f -

  (the 64 is based on BIG_PIPE_SIZE from <sys/pipe.h>,
  i.e. 64 == BIG_PIPE_SIZE / TP_BSIZE, with TP_BSIZE from
  <protocols/dumprestore.h>)
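
To spell out the arithmetic (the values here are from my reading of
the NetBSD headers -- check your own tree):

	# BIG_PIPE_SIZE is 64 KB in <sys/pipe.h>, and TP_BSIZE is the
	# 1 KB dump record size in <protocols/dumprestore.h>, so:
	echo $(( (64 * 1024) / 1024 ))	# prints 64, the value for -b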

However, I've found this may not be the fastest method, though
arguably it is the safest way to make a complete copy with all
permissions and other metadata intact.

While recently copying a filesystem full of source trees and similar
collections of small files, I found it didn't go much faster than
about 9 MB/s overall, yet while watching "systat vm" as it ran I saw
bursts of well over 100 MB/s to and from both devices involved.

The target filesystem was mounted with '-o log'.

I'm guessing the bottleneck here is the large number of file creations
done by the "restore" process -- restore creates files one at a time,
and each creation is a metadata operation that must pass through the
WAPBL journal on a filesystem mounted with '-o log'.

Would mounting the target filesystem with '-o async' have helped?
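
Something like this is what I'd try, though only as a sketch: remount
the target async just for the duration of the copy, then put the log
back.  Note that async means essentially no consistency guarantees --
a crash mid-copy can leave the target filesystem unsalvageable, which
is tolerable only because the whole copy could simply be redone.  (I
haven't verified that a single "mount -u" can switch directly between
log and async.)

	# before starting the dump | restore pipe:
	mount -u -o async /migrate/target
	# ... run the copy ...
	# afterwards, turn the WAPBL journal back on:
	mount -u -o log /migrate/target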

I wonder too if running multiple processes (e.g. several concurrent
"pax -rw"s), especially on the writing side, might help -- see the
sketch below.  I see more consistent and longer periods of high write
rates when doing parallel builds, e.g. when compiling large libraries
with many small source files using more jobs than there are CPUs.
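
The following is roughly what I have in mind.  The paths are made up
for the example; it assumes everything of interest lives in top-level
subdirectories of the source, and that it runs as root so pax can
preserve ownership.  It won't copy plain files in the top level or the
root directory's own attributes, and the jobs will be badly unbalanced
if one subtree is much bigger than the others:

	cd /migrate/source
	for d in */; do
		# -rw is pax's copy mode; -pe preserves everything
		# (ownership, modes, times), which requires root
		pax -rw -pe "$d" /migrate/target &
	done
	wait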

Note also that these copies were done within a domU on a Xen server,
with the underlying storage managed by LVM in the dom0, and that
storage itself being a RAID array on Dell PERC hardware.  A couple of
the filesystems I recently moved were relatively large, and the target
device for them is RAID-10, so it should have had decently fast write
speed.

Note also that I would normally have done the copies in the dom0, for
direct access to the hardware and to skip the xbd(4) layer, but in
this case I'm also experimenting with the pros and cons of using GPT
labels inside the LVM devices so that I can use wedges in the domUs
and the "NAME=foo" style of identifying the devices.  On the one hand
this avoids the problem of changes in the order or number of entries
in Xen's 'disk' specification screwing up the xbd(4) numbering in the
domUs (the "vdev=" parameter is basically useless for pinning those
numbers).  On the other hand it makes the filesystem in the LV
inaccessible from the dom0, because the device mapper does not fully
emulate a disk device: it does not see disk labels within the LV, so
wedges cannot be made for it in the dom0.  I'm leaning away from GPT
for LVs, BTW, especially if I can fix "vdev=" to explicitly specify
the xbd(4) device number.
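
For reference, the "NAME=foo" scheme I'm describing looks roughly like
this from inside the domU (the label "build-src" is just a made-up
example, and these are the gpt(8)/dkctl(8) invocations as I understand
them -- note the kernel discovers wedges when a disk attaches, so a
detach/re-attach of the xbd may be needed after labelling):

	# label the first GPT partition on the xbd backing the LV
	gpt label -i 1 -l build-src xbd5
	# confirm the kernel has created the corresponding wedge
	dkctl xbd5 listwedges
	# mount by wedge name, independent of the xbd numbering
	mount NAME=build-src /mnt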

Unfortunately I don't currently have enough spare hardware to really do
some proper benchmarking and testing.  I need a new big honkin' server
to free up at least one of these older ones for such playing around.

--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>



