Subject: Re: Replacement for grep(1) (part 2)
To: Matthew Dillon <dillon@apollo.backplane.com>
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-userlevel
Date: 07/14/1999 10:35:29
    Date:        Tue, 13 Jul 1999 14:14:52 -0700 (PDT)
    From:        Matthew Dillon <dillon@apollo.backplane.com>
    Message-ID:  <199907132114.OAA80781@apollo.backplane.com>

  |     If you don't have the disk necessary for a standard overcommit model to
  |     work, you definitely do not have the disk necessary for a non-overcommit 
  |     model to work.

This is based upon your somewhat strange definition of "work".   I assure
you that I have run many systems which don't use overcommit, which quite
frequently run into "out of VM" conditions, and which, I can assure you,
work just fine.   By the time they're close to running out of VM, the system
is approaching paging death, which is as you'd expect (they're overloaded).
That is, adding more VM (more swap space) would be counterproductive.

When this stage is reached, the absolute prime requirement of "working"
is met though - applications that request memory get that request refused,
but absolutely no processes get ungracefully killed.
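
To make the "request refused" case concrete, here is a minimal sketch in C
of a process that treats a refused request as an ordinary error.   The names
workbuf and workbuf_grow are made up for illustration, they don't come from
any real program.   It assumes the system actually reports the refusal
through a NULL return from realloc(); on an overcommitting system the call
may succeed and the trouble only show up later, when the pages are touched.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical work buffer; the names are illustrative only. */
struct workbuf {
    char  *data;
    size_t cap;
};

/*
 * Try to grow the buffer to new_cap bytes.
 * Returns 0 on success, -1 if the system refuses the request.
 * On failure the old buffer is left untouched, so the caller can
 * carry on with what it has, flush partial results, or retry later.
 */
int workbuf_grow(struct workbuf *wb, size_t new_cap)
{
    char *p;

    assert(wb != NULL);
    if (new_cap <= wb->cap)
        return 0;               /* already big enough */
    p = realloc(wb->data, new_cap);
    if (p == NULL)
        return -1;              /* refused: wb->data is still valid */
    wb->data = p;
    wb->cap = new_cap;
    return 0;
}
```

A caller that gets -1 back still owns its old buffer and can decide for
itself what to do next, which is the whole point: the decision stays with
the process, not with the kernel.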

In a sense, no-one really cares what the page allocation policy is, the
argument here isn't about overcommit, or the very conservative early BSD
version, or any of the intermediate possibilities - all people really care
about is what happens when resources are exhausted.   What happens until
then no-one really cares about (there are some issues of how much space
you need to dedicate to paging - most people would probably prefer to
not use the early BSD method, where you needed at least as much paging space
as RAM, or some of your RAM simply would be left idle).

But one absolute requirement for any system that wants to consider itself
to be a reliable, usable, general purpose system, is that it never simply
randomly kills processes of its own volition.   If you're happy for random
processes to be killed on your workstation, that's fine, I'm not.   I run
processes which are intended to do specific work, they're not intended to
simply go away just because memory is running low (there are other processes,
stupid perl scripts and such, which will quite quickly die when a mem
request is refused, and return resources, so the processes that matter,
which can be very large, can keep on processing).

I have no doubt but that you can dream up scenarios where you pander to
the laziness of programmers, and allow the use of a huge VM space with
little of it actually allocated anywhere (or ever touched), and then you
would indeed need monstrous amounts of paging space, most of which is never
actually used for anything - personally I prefer to have the programmers
think a little more about the memory footprint of their data structures.
Not only does this reduce the VM footprint, it will also usually vastly
improve the paging characteristics.   Most applications which simply
scatter data through a huge VM space simply stop being usable as soon
as their RSS exceeds available physical memory - that is, if they start
paging, they die (become comatose might be a better description).
A little intelligent thought as to how to actually make use of the mem
resources can make a huge difference.
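
As a rough illustration of the footprint point, the sketch below counts how
many pages a set of fixed-size records touches, given how far apart they are
placed.   The function name pages_touched and the figures used with it are
assumptions for the example, and the model is deliberately crude: it ignores
allocator overhead and anything else sharing those pages.

```c
#include <stddef.h>

/*
 * Count the distinct pages touched by nrec records of rec_size bytes,
 * where record i starts at byte offset i * stride.
 * Assumes stride >= rec_size > 0 and page_size > 0.
 */
size_t pages_touched(size_t nrec, size_t rec_size,
                     size_t stride, size_t page_size)
{
    size_t count = 0;
    size_t next_page = 0;       /* lowest page index not yet counted */
    size_t i;

    for (i = 0; i < nrec; i++) {
        size_t first = (i * stride) / page_size;
        size_t last = (i * stride + rec_size - 1) / page_size;

        if (first < next_page)
            first = next_page;  /* skip pages already counted */
        if (last >= first) {
            count += last - first + 1;
            next_page = last + 1;
        }
    }
    return count;
}
```

With an assumed 4096 byte page, 1000 records of 64 bytes packed end to end
(stride 64) touch 16 pages, about 64KB.   The same 1000 records scattered
one per page (stride 4096) touch 1000 pages, about 4MB, for exactly the
same amount of data.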

There was an earlier comment on this thread (which no longer has the slightest
thing to do with the new version of grep...) which mentioned fortran
programs.   People, fortran (and huge fortran programs) have been around
much longer than VM has.   There are lots of techniques for fortran
programmers to use to cope with restricted memory sizes; they've been
managing that for decades.

kre