Subject: Re: RAIDFrame and RAID-5
To: NetBSD-current Discussion List <current-users@NetBSD.org>
From: Frederick Bruckman <fredb@immanent.net>
List: current-users
Date: 10/28/2003 20:24:19
On Tue, 28 Oct 2003, Greg A. Woods wrote:
> [ On Monday, October 27, 2003 at 15:31:21 (-0600), Frederick Bruckman wrote: ]
> > Subject: Re: RAIDFrame and RAID-5
> >
> > That should give a maximum of 64MB of kvm (16K pages of 4K each),
> > which sounds like plenty, but I suppose it could have become too
> > fragmented.
>
> It seems I have:
>
> vm.nkmempages = 20455
> vm.anonmin = 10
> vm.execmin = 5
> vm.filemin = 10
> vm.maxslp = 20
> vm.uspace = 8192
> vm.anonmax = 80
> vm.execmax = 30
> vm.filemax = 50
>
> Isn't that almost 80MB, i.e. bigger than the "maximum"?
Yah, 80MB seems like plenty, but then your usage pattern evidently
called for some really big allocations in the RAID and anon pools.
Those "min" and "max" values are percentages for managing use of the
unified buffer cache, not the kernel memory allocator.
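As a sanity check on the numbers: 20455 pages at 4K each is indeed
about 80MB, so the auto-sizing gave you more than the 64MB I quoted.
If you want to pin the arena size in the config file instead, the knob
is the NKMEMPAGES family from options(4) -- just a sketch, with a value
picked purely for illustration:

	options NKMEMPAGES=24576	# 24576 * 4K pages = 96MB malloc arena
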
> (I don't have any KMEMPAGES* options in my kernel)
>
> > Hey, you're not swapping to RAID-5, are you? That
> > configuration is known to cause problems.
>
> No, but I am swapping to the RAID-1 (i.e. raid2b in my config):
>
> $ /sbin/swapctl -l
> Device 512-blocks Used Avail Capacity Priority
> /dev/raid2b 2097152 530648 1566504 25% 0
>
> (sorry I should have included that info before!)
So you are doing quite a bit of swapping. Does "top" sorted by "size"
show the same applications getting paged out and paged back in
constantly? I have seen a machine with only 16MB take 4 hours to boot;
that was resolved by raising vm.execmin to 10. You can play with those
numbers via "sysctl" while the system is running. That would not make
you run out of mbufs, though.
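For example, something like this (the values are hypothetical, only to
show the mechanics):

	# sysctl -w vm.execmin=10
	# sysctl -w vm.anonmax=70

sysctl(8) applies the change immediately, so you can watch the effect
in "top" without rebooting.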
> > Memory resource pool statistics
> > Name        Size Requests  Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
> > mbpl         256    19224 10295    19087   203   186    17    44     1   inf    0
> >                           ^^^^^
> >                             |||
> >
> > Greg ran out of MBUFS, all right.
>
> Hmm.... interesting. I somehow missed seeing that entry in the fail column.
>
> Given that the fail count has not changed since (though I've not really
> stressed the system much since either), I would indeed guess that those
> failures coincided with the freeze.
>
> Name        Size Requests  Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
> mbpl         256    35588 10295    35400   219   198    21    44     1   inf    3
>
> > What I would try, is to increase NKMEMPAGES until the problem isn't
> > reproducible anymore.
>
> I will try to do so, though I'm losing access to the second RAID-5
> (/save) very soon -- the disks must be moved back to the system they
> belong with to be put into production for a mail spool.
>
>
> BTW, what do we need to do to be able to report mbuf peak and max use as
> FreeBSD does?
>
> $ netstat -m
> 387/1056/26624 mbufs in use (current/peak/max):
>         386 mbufs allocated to data
>         1 mbufs allocated to packet headers
> 384/502/6656 mbuf clusters in use (current/peak/max)
> 1268 Kbytes allocated to network (6% of mb_map in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
"options MBUFTRACE". See options(4).
Frederick