Subject: Re: Low AAC performance but only when tested through the file system
To: Olaf Seibert <rhialto@polderland.nl>
From: Greg 'groggy' Lehey <grog@NetBSD.org>
List: port-i386
Date: 12/03/2003 13:43:30
On Thursday, 27 November 2003 at 18:02:52 +0100, Olaf Seibert wrote:
> I am still struggling with the performance of an Adaptec 2120S RAID
> controller (aac0). When testing with bonnie++ (from pkgsrc), write
> performance is only around 4M/sec. The number of xfers/second, as shown
> by `systat vmstat', is only around 60 while it is writing and only
> around 100 when reading. This is surprisingly low.
As a disk performance test, bonnie is surprisingly misleading.
bonnie++ is probably better, but it still requires a lot of
interpretation.
> However, when I test with a benchmark called RawIO (see
> http://www.acnc.com/benchmarks.html) on the (unused) raw swap partition,
> I get much better results. Sequential writing is around 35M/sec, and the
> number of xfers/sec is over 1000, peaking at some 1900.
This is surprisingly high. You can assume that you're not getting
valid results.
On Saturday, 29 November 2003 at 15:45:00 +0100, Olaf Seibert wrote:
>
> Output from rawio:
>
> bash-2.05b# ./rawio -s 1g 1g -a /dev/rld0b
>              Random read     Sequential read    Random write    Sequential write
> ID          K/sec   /sec      K/sec   /sec       K/sec   /sec      K/sec   /sec
> ld0b      15587.8    950    32273.5   1970     15127.1    933    31824.6   1942
Look at the -s parameter there. You're only accessing 1 GB of the
total disk space. IIRC these things have a cache memory of 128 MB, so
you're probably hitting the cache most of the time.
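A quick sanity check (assuming -s scales the same way as in the
invocation above, and that the partition is big enough) would be to
re-run the test over a region much larger than the controller's
cache, say:

  ./rawio -s 16g 16g -a /dev/rld0b

If the transfer rates drop back towards what the spindles can
physically deliver, the cache was what you were measuring.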
rawio is in fact a pretty pessimistic benchmark. Even the sequential
I/O tests tend to give poor results, because several processes access
different parts of the disk in parallel, so you still pay the seek
latency. That's what drags the results down. Here are some results
from Vinum on some rather old disks with varying stripe sizes. They
don't look very good, but in fact they're about as much as the
hardware can deliver. Note particularly that the sequential I/O rate
is not significantly higher than the random rate. These tests were
done with other stuff going on on the machine, which probably
explains some of the anomalies in the values (the sequential read for
a 512 kB stripe, for example).
             Random read     Sequential read    Random write    Sequential write
ID          K/sec   /sec      K/sec   /sec       K/sec   /sec      K/sec   /sec
s.1k       1476.6     90     1341.3     82      1469.6     90     1310.7     80
s.2k       1543.3     92     1598.3     98      1498.7     92     1303.3     80
s.4k       2728.2    162     2103.5    128      2717.5    165     2477.6    151
s.8k       3859.4    227     3792.5    231      3835.8    227     3900.2    238
s.16k      4631.8    280     5454.9    333      4527.0    283     5202.0    318
s.32k      5196.8    314     6515.4    398      5270.0    317     6491.2    396
s.64k      5730.2    347     5833.6    356      5644.9    347     7685.2    469
s.128k     5961.6    365     8233.7    503      6012.6    358     8772.9    535
s.256k     5879.5    352     6767.5    413      6018.0    355     6701.5    409
s.512k     5799.6    347     3195.0    195      5973.6    358     7956.1    486
s.1024k    5984.8    368     5031.5    307      6191.2    372     4376.1    267
On Tuesday, 2 December 2003 at 16:45:30 +0100, Olaf Seibert wrote:
> On Mon 01 Dec 2003 at 17:57:30 -0800, Bill Studenmund wrote:
>> On Mon, Dec 01, 2003 at 02:50:52AM +0100, Olaf Seibert wrote:
>>> On Sun 30 Nov 2003 at 15:07:33 -0800, Bill Studenmund wrote:
>>>> What's your stripe depth? For optimal performance, you want it to be 16k.
>>>> The file system will do 64k i/o's. With 4 data drives & a 16k stripe
>>>> depth, a 64k i/o (on a 64k boundary) will hit all 4 drives at once.
>>>
>>> It's the default 64k. In the previous hardware I tried 16k also but it
>>> made no difference, so I never tried it on this one.
>>
>> With a stripe depth of 64k (stripe width of 256k) and 64k i/os you will
>> get poor performance. If you are performing random i/o, each of those 64k
>> writes means READING 3 * 64k = 192k then writing 128k.
>>
>> Assuming the stripe depth really is 64k, each write by the OS will
>> turn into a read/modify/write operation in the RAID card. You'll be
>> getting worse performance than if you just used one drive.
>
> Isn't that simply always the case?
Pretty much.
> I'm not sure how write size and stripe size influence this unless
> the RAID controller is exceedingly stupid and re-creates the parity
> for the whole stripe if only a single sector of it changes.
I'd be interested to know if that's the case. I think it might be
possible.
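To put rough numbers on the 4-data-drive, 64k-depth case Bill
describes (this is just his arithmetic spelled out, with the usual
read-modify-write alternative alongside it):

  read-modify-write:  read old data + old parity   = 2 * 64k = 128k
                      write new data + new parity  = 2 * 64k = 128k
  reconstruct-write:  read the other 3 data chunks = 3 * 64k = 192k
                      write new data + new parity  = 2 * 64k = 128k

Either way, a single 64k random write moves at least 256k across the
drives, which is why a lone disk can come out ahead.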
I did a lot of investigation of UFS I/O behaviour a few years back
when I was writing Vinum. That's one of the things I considered when
writing rawio: unless you're doing things like sequential file copies,
your I/O is surprisingly seldom an exact block. It could be any
number of sectors with any alignment. The only way a RAID controller
can optimize that is by coalescing requests, as you suggest. I found
it so unlikely to make any difference that I didn't implement it in
Vinum.
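To pick an arbitrary illustration (the numbers are made up): a 36 kB
write starting 8 kB before a 64 kB stripe unit boundary touches two
stripe units on two different drives. Neither piece fills its unit,
so the parity for both units has to be updated, and you pay the
read-modify-write penalty twice for one smallish write.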
Greg
--
See complete headers for address and phone numbers.