Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: Charles M. Hannum <root@ihack.net>
From: Matthew Dillon <dillon@apollo.backplane.com>
List: tech-userlevel
Date: 07/13/1999 19:28:49
:> swap. How much swap is on this system, by the way?
:
:I could just as rightfully argue that you're blaming a failure of the
:OS on the sysadmin. Fiddling with limits is all fine and dandy, but
:it's not even close to flexible enough. Consider, for example, the
:specific case of testing a new multi-threaded program. A simple
:mistake caused it to chew up a rather considerable amount of memory --
:the per-process limit for each of 32 processes. You could claim
:several things here:
:
:* I should have tested it on another system. That's great, but at $Nm
: per system, that's often infeasible.
:
: Not only that, but it's insulting. Why should I have to buy two
: computers, just because the OS can't be bothered to properly protect
: my important programs?
Yes, it is certainly possible that this could happen. Not likely,
but possible. But a no-overcommit policy is not necessarily going to
solve this problem. Your processes could very well obtain all the
resources, and then some other poor user running his 1000-hour
simulation tries to allocate something, fails, and goes poof.
All sorts of problems crop up. For example, if programs cannot handle
a malloc failure, one might think that all they need to do is block
in the allocation, waiting for memory to become available. The result
of that could be a system-wide deadlock! If a program has the capability
to checkpoint itself, that is all well and fine, but then it is something
the program could be doing anyway at regular intervals (say, once every
2 hours). The amount of lost work would be minimal in either case.
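To illustrate the deadlock point, here is a hypothetical sketch (not
code from any real allocator) of the "block until memory appears"
approach. If every large process spins in a loop like this at the
same time, nobody frees anything and nobody makes progress:

    #include <stdlib.h>
    #include <unistd.h>

    /*
     * Hypothetical sketch: "handle" allocation failure by waiting.
     * If every memory-hungry process blocks here at the same time,
     * none of them frees anything and none of them ever returns --
     * the system-wide deadlock described above.
     */
    void *
    blocking_malloc(size_t size)
    {
        void *p;

        while ((p = malloc(size)) == NULL)
            sleep(5);    /* hope somebody else frees memory */
        return p;
    }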
Limits do work. Quite well, in fact. What you do is simply set limits
that prevent the more common accidents. For example, it is far more
likely that only a few processes will run away and try to eat their
entire address space, so it might be reasonable to set a soft limit of
32 processes and a 64MB address space per process. If a user needs more,
he ups his limits manually, thus taking on more responsibility.
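For instance, a minimal sketch of a wrapper that imposes such a limit
before running a test program (the exact rlimit names vary somewhat
between systems, and the 64MB figure is just the example above):

    #include <sys/types.h>
    #include <sys/resource.h>
    #include <stdio.h>

    /*
     * Minimal sketch: cap the data segment at a 64MB soft limit so a
     * runaway allocation loop hits the limit instead of eating swap.
     * A user who really needs more raises the soft limit himself,
     * up to the hard limit.
     */
    int
    main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_DATA, &rl) == -1) {
            perror("getrlimit");
            return 1;
        }
        rl.rlim_cur = 64 * 1024 * 1024;    /* 64MB soft limit */
        if (setrlimit(RLIMIT_DATA, &rl) == -1) {
            perror("setrlimit");
            return 1;
        }
        /* ... fork/exec the program under test here ... */
        return 0;
    }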
If you have several users doing memory-heavy work, you need a lot of
swap no matter what resource model you use. It might be appropriate to
run 4 or 8 GB of swap in that case, though personally I doubt you'd ever
need that much. It depends on how much main memory you have, of course.
A machine with 4GB of RAM might want 8GB or even 16GB of swap. A machine
with 128MB of RAM would be hard pressed to utilize even 1GB of swap
with a dozen runaway programs running.
If you are truly paranoid, it costs about $150 for a 6GB IDE hard drive.
That's a lot of swap space!
:* I should have allocated enough swap space to cover this situation.
: That's great, but if I did, using a no-overcommit policy would have
: worked just as well!
Not necessarily. The system could have had room for your processes
but not room for someone else's, causing the other person's script
to fail for no good reason.
No-overcommit policies tend to need a much greater amount of swap
to yield the same performance and reliability, because every writable
page must be backed by a reservation even if it is never touched (for
example, when a 100MB process forks, the child's entire writable
address space must be reserved immediately, even if the child execs
a moment later). If you had that much extra swap available to begin
with, it would probably have been better to stick with the standard
overcommit policy and simply add swap.
The arbitrary nature of failures under a no-overcommit policy is no
better than the almost-arbitrary nature of the existing policy, except
that what we currently have specifically targets the largest processes
rather than 'any' process.
:The point is, the OS should have provided *some* mechanism to ensure
:that the long-running process wasn't affected. It didn't. That's a
:clear failure of the OS to provide a reasonable environment for this
:type of computing.
:
:Whether this should be solved by switching to a no-overcommit policy,
:fiddling with the overcommit policy in some way, or whatever, is a
:different issue. But you have not yet proposed any mechanism that
:would have prevented this problem while still permitting me to get
:work done.
The OS needs to provide no such thing. The OS kills processes
as an absolute last resort. It does it when it believes it has
no other choice. If you don't want to get to that point there
are plenty of ways to avoid it... but the policy is something
that you have to implement; the OS can't do it for you.
The most common way of doing this is through watcher scripts.
The watcher script looks at the memory situation and finds things
to kill if the situation gets critical. It is not that difficult to
write a watcher script, but most people don't bother because most
people don't have swap problems to begin with. It would be fairly easy
for a watcher script to catch a system heading towards swap exhaustion,
because it generally takes a while to get into that state.
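A skeleton of such a watcher might look like the following. This is
only a sketch: swap_percent_used() and find_biggest_victim() are
hypothetical stubs to be filled in with whatever probe your system
provides (parsing pstat -s or swapinfo output, a sysctl, etc).

    #include <sys/types.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Hypothetical stub: return swap utilization, 0-100. */
    static int
    swap_percent_used(void)
    {
        return 0;    /* placeholder -- wire up a real probe */
    }

    /* Hypothetical stub: pick the largest expendable process. */
    static pid_t
    find_biggest_victim(void)
    {
        return -1;    /* placeholder -- no victim chosen */
    }

    int
    main(void)
    {
        for (;;) {
            if (swap_percent_used() > 90) {
                pid_t pid = find_biggest_victim();
                if (pid > 1) {    /* never signal init or "none" */
                    fprintf(stderr, "swap critical, killing %ld\n",
                        (long)pid);
                    kill(pid, SIGTERM);
                }
            }
            sleep(60);    /* exhaustion builds slowly */
        }
        /* NOTREACHED */
    }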
-Matt
Matthew Dillon
<dillon@backplane.com>