Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: Garance A Drosihn <drosih@rpi.edu>
From: Matthew Dillon <dillon@apollo.backplane.com>
List: tech-userlevel
Date: 07/14/1999 18:29:50
:For the moment I'll pretend that you honestly think that is an
:answer, and I'll note that the very same machine may have well
:over 100 processes each of which takes 1-2 meg of memory. If
:the machine hits a really-out-of-memory error, I would be much
:much happier to see all 100+ of those processes killed, at once,
:than the one 40-meg process.
:
:Now tell me how I fix my swap under those circumstances. If
:the answer is "buy infinite memory (ram or disk)", then we don't
:need any overcommit policy in the first place. Note that the
:problem might be that these 100 processes start taking up 5 or
:10 meg rather than the 2 meg I'm used to.
Everything scales. If the load on your machine is such
that you have hundreds of processes taking 1-2MB of memory,
then let's assume that such a machine has a reasonable
memory configuration of, say, 256MB of RAM, and a reasonable
swap configuration of, say, 1GB. Under normal operating
conditions perhaps 100MB might be swapped out, giving you
900MB of margin. The actual VM footprint on such a machine
might run on the order of 10GB (rough guess), of which 350MB
or so has actually been allocated.
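
The margin arithmetic can be sketched in a few lines. This is only an illustration: the function names are hypothetical, 1GB is rounded to 1000MB as in the figures above, and RAM is ignored entirely.

```python
def swap_margin_mb(swap_total_mb, swapped_out_mb):
    """Return the swap still free to absorb unexpected growth."""
    return swap_total_mb - swapped_out_mb


def growth_absorbed_mb(margin_mb, process_count):
    """Return how far each process could grow, on average, before swap runs out."""
    return margin_mb / process_count


# Figures from the message: 1GB of swap (taken as 1000MB), 100MB swapped out,
# and the 100 processes from the quoted scenario.
margin = swap_margin_mb(1000, 100)
per_process = growth_absorbed_mb(margin, 100)
print(margin, per_process)
```

Under these assumptions each of the 100 processes could grow by about 9MB, so the jump from 2 meg to 10 meg described in the quoted text (8MB apiece) would still fit inside the margin.
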
With 900MB of margin (which, I might add, is only about $30 worth
of disk space) and reasonable process limits, it seems highly
unlikely that the machine will ever run out of swap, even
if a user makes an honest mistake. I also rather seriously
doubt that a hostile user would have any more or less success
blowing away your process under the non-overcommit model than
under the overcommit one.
If 1G isn't enough, spend another $30 and throw 2G of swap
online. Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.
The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability. Plus,
you have to realize that under either model, if you are talking
about saving your work, the same code that does the save-and-exit
in the non-overcommit model can just as easily do a checkpoint
once an hour in the standard overcommit model. Code that
can't save/checkpoint would not survive either model.
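
A periodic checkpoint of the kind described might look roughly like the sketch below. Everything here is an assumption made for illustration: the names checkpoint and run_with_checkpoints, the JSON state format, and the injectable clock are not from any particular program. The save goes to a temporary file that is then renamed into place, so a process killed mid-write leaves the previous checkpoint intact.

```python
import json
import os
import tempfile
import time


def checkpoint(state, path):
    """Write state to path atomically via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)  # discard the partial temporary file
        raise


def run_with_checkpoints(steps, state, path,
                         interval_seconds=3600.0, clock=time.monotonic):
    """Apply each step to state, saving once interval_seconds have elapsed."""
    last_save = clock()
    saves = 0
    for step in steps:
        step(state)
        if clock() - last_save >= interval_seconds:
            checkpoint(state, path)
            last_save = clock()
            saves += 1
    return saves
```

The default interval of 3600 seconds matches the once-an-hour figure above; a process killed for lack of swap then loses at most about an hour of work.
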
Disk is cheap. Memory isn't (though it's getting better).
Everything scales.
:I didn't mean to be casting aspersions on the general idea of
:overcommitting, or whatever it is that has your shorts all tied
:up in a knot.
:
:---
:Garance Alistair Drosehn = gad@eclipse.acs.rpi.edu
:Senior Systems Programmer or drosih@rpi.edu
-Matt
Matthew Dillon
<dillon@backplane.com>