Re: Unexpected out of memory kills when running parallel find instances over millions of files

To: Manuel Bouyer <bouyer%antioche.eu.org@localhost>, Reinoud Zandijk <reinoud%dropje.13thmonkey.org@localhost>
Subject: Re: Unexpected out of memory kills when running parallel find instances over millions of files
From: Johnny Billquist <bqt%softjar.se@localhost>
Date: Sat, 21 Oct 2023 22:19:49 +0200

I think I reported this something like 20 years ago, but noone reallyseemed to care. I noticed it pretty much right away after NetBSDswitched to the unified memory thing, where all free memory usually wasgrabbed as disk cache. It was not fun on VAX, but at the time it seemother platforms didn't suffer enough to consider it a problem. I guessover time it's just gotten worse...


  Johnny

On 2023-10-21 13:01, Manuel Bouyer wrote:

On Fri, Oct 20, 2023 at 10:26:05PM +0200, Reinoud Zandijk wrote:

Hi,

On Thu, Oct 19, 2023 at 11:20:02AM +0200, Mateusz Guzik wrote:

Running 20 find(1) instances, where each has a "private" tree with
million of files runs into trouble with the kernel killing them (and
others):
[   785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap
[   785.194378] UVM: pid 2010.2010 (find), uid 0 killed: out of swap
[   785.224675] UVM: pid 1771.1771 (top), uid 0 killed: out of swap
[   785.285291] UVM: pid 1960.1960 (zsh), uid 0 killed: out of swap
[   785.376172] UVM: pid 2013.2013 (find), uid 0 killed: out of swap
[   785.416572] UVM: pid 1760.1760 (find), uid 0 killed: out of swap
[   785.416572] UVM: pid 1683.1683 (tmux), uid 0 killed: out of swap

This should not be happening -- there is tons of reusable RAM as
virtually all of the vnodes getting here are immediately recyclable.

$elsewhere I got a report of a workload with hundreds of millions of
files which get walked in parallel -- a number high enough that it
does not fit in RAM on boxes which run it. Out of curiosity I figured
I'll check how others are doing on the front, but key is that this is
not a made up problem.


I can second that. I have had UVM killing my X11 when visiting millions of
files; it might have been using rump but I am not sure.

What struck me was that swap was maxed out but systat showed something like
40gb as `File'. I haven't looked at the Meta percentage but it wouldn't
surpise me if that was also high. Just some random snippet:


I've seen it too, although it didn't end up killing processes.
But the nightly jobs (usual daily/security+ backup) ends up pushing to
swap lots of processes, while the file cache grows to more than half the
RAM (I have 16Gb). As a result the machine is really slow and none of the
nightly jobs complete before morning.

Decreasing kern.maxvnodes helps a lot.


--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

References:
- Unexpected out of memory kills when running parallel find instances over millions of files
  - From: Mateusz Guzik
- Re: Unexpected out of memory kills when running parallel find instances over millions of files
  - From: Reinoud Zandijk
- Re: Unexpected out of memory kills when running parallel find instances over millions of files
  - From: Manuel Bouyer

Prev by Date: Re: Unexpected out of memory kills when running parallel find instances over millions of files
Next by Date: [PATCHES] Adding Xorg libdrm rst2man(1) translated man pages
Previous by Thread: Re: Unexpected out of memory kills when running parallel find instances over millions of files
Next by Thread: Re: Unexpected out of memory kills when running parallel find instances over millions of files
Indexes:

Home | Main Index | Thread Index | Old Index