Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: John Nemeth <jnemeth@victoria.tc.ca>
From: Matthew Dillon <dillon@apollo.backplane.com>
List: tech-userlevel
Date: 07/14/1999 21:41:25

:
:On Jul 15, 12:20am, "Daniel C. Sobral" wrote:
:} "Charles M. Hannum" wrote:
:} > 
:} > That's also objectively false.  Most such environments I've had
:} > experience with are, in fact, multi-user systems.  As you've pointed
:} > out yourself, there is no combination of resource limits and whatnot
:} > that is guaranteed to prevent `crashing' a multi-user system due to
:} > overcommit.  My simulation should not be axed because of a bug in
:} > someone else's program.  (This is also not hypothetical.  There was a
:} > bug in one version of bash that caused it to consume all the memory it
:} > could and then fall over.)
:} 
:} In which case the program that consumed all memory will be killed.
:} The program killed is +NOT+ the one demanding memory, it's the one
:} with most of it.
:
:     On one system I administrate, the largest process is typically
:rpc.nisd (the NIS+ server daemon).  Killing that process would be a
:bad thing (TM).  You're talking about killing random processes.  This
:is no way to run a system.  It is not possible for any arbitrary
:decision to always hit the correct process.  That is a decision that
:must be made by a competent admin.  This is the biggest argument
:against overcommit:  there is no way to gracefully recover from an
:out of memory situation, and that makes for an unreliable system.
:
:}-- End of excerpt from "Daniel C. Sobral"

    ... and the chance of that system running out of swap space
    is?  

    The machine has hit the wall and the admin can't log in.  What
    is the kernel to do?

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>