Subject: Re: NetBSD1.6 UVM problem?
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Oleg Polyanski <Oleg.Polianski@team.telstraclear.co.nz>
List: tech-kern
Date: 12/11/2002 00:19:12
>>>>> "der" == der Mouse writes:
der> You call mlockall(), or if you can, mlock(), to lock all
der> relevant pages into core. Or else you impose resource limits
der> low enough compared to the amount of swap you provide that you
der> _can't_ run out.
der> ...or else you risk, yes, as you say, getting a critical
der> process nuked at an unpredictable time.
`mlock' only gives you a way to find out whether the system still
has physical memory available for allocation. Moreover, not every
VM use case requires locking all of a process's pages in RAM.
`mlock' merely prevents a process, or certain of its pages, from
being paged out. If the process keeps growing naturally, and not
because of a memory leak buried somewhere deep inside, this
approach is rather ineffective, especially when the process
doesn't need all of its pages permanently resident in RAM. Think
of databases, for example: yes, they do lock some pages in; no,
they don't pin down their every page.
>> You end up with a killed process and perhaps lost data only
>> because your kernel had no way to signal memory starvation other
>> than killing the first process it found that happened to be the
>> largest one.
der> Right. But what else is there to do? I can see only about
der> four reasonable things to do when you're out of RAM when
der> servicing a page fault:
der> 1) Deliver a signal to the process.
der> 2) Kill the process.
der> 3) Stall: make the process wait a little and hope the
der> situation eases.
der> 4) "When in danger or in doubt / Run in circles, scream and
der> shout": stall the faulting process and kick someone else,
der> probably a task-manager of some kind (presumably small and
der> locked in core).
der> For (1) to be useful, the process must have arranged to handle
der> the signal without incurring further page faults, which is
der> possible but nontrivial (and even more difficult to do without
der> just locking everything in core, in which case you won't get
der> the page faults anyway). Most processes don't have much they
der> can do to ease a serious RAM crunch to start with; to do it
der> right you really want a malloc-alike that lets you specify
der> different areas to allocate out of for different objects, so
der> noncritical objects and critical objects can live in different
der> pages.
(2) is what we've got now, and it's acceptable; (3) and (4) don't
look feasible to me in most cases, while a sensible combination of
(1) and (2) was described in my last email.
(1): Locking the signal handler code shouldn't be a problem; it's
relatively small when not abused. The real benefit of the AIX
approach is that the SIGDANGER signal is serviced only by software
that really wants to be aware of VM starvation; for everybody else
it is simply invisible and ignored by default.
Oleg