Subject: amd64 low-memory freezes -- possible culprit?
To: None <fvdl@netbsd.org, port-amd64@netbsd.org>
From: Christopher SEKIYA <sekiya@netbsd.org>
List: port-amd64
Date: 05/29/2004 11:03:13
I've been digging into the "amd64 freezes for many seconds under low memory
conditions" problem, and I think I've got a line on it.
The test system has 3/4 gigs of RAM, /tmp on an mfs, and a profiling kernel.
I've filled up the file buffer to the point where top reports:
Memory: 457M Act, 234M Inact, 4380K Wired, 16M Exec, 590M File, 13M Free
(The numbers are slightly off because the above was copied a little while
after the test; the "13M Free" should be around 2048K.)
Let's see what happens when we run it out of memory:
[10:51:17] monkey:/$ kgmon -b; dd if=/dev/zero of=/tmp/s6 bs=1024k count=20; kgmon -h; kgmon -p; kgmon -r
kgmon: kernel profiling is running.
/tmp: write failed, file system is full
dd: /tmp/s6: No space left on device
11+0 records in
10+0 records out
10485760 bytes transferred in 33.624 secs (311853 bytes/sec)
kgmon: kernel profiling is off.
kgmon: kernel profiling is off.
kgmon: kernel profiling is off.
(We can ignore the "file system is full" message; what matters is that the
system paused for about half a minute trying to reorganize itself.)
gprof says:
  %   cumulative     self                 self    total
 time    seconds  seconds       calls  ms/call  ms/call  name
63.95       7.70     7.70  1002719489     0.00     0.00  pmap_pdes_valid
35.38      11.96     4.26         380    11.21    31.47  pmap_do_remove
 0.33      12.00     0.04        4775     0.01     0.01  copyout
 0.17      12.02     0.02        1760     0.01     0.01  copyin
 0.08      12.03     0.01       16689     0.00     0.00  pmap_clear_attrs
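A quick back-of-the-envelope check on those figures, as a rough sketch in Python. The variable names are made up for illustration and the values are copied from the flat profile above. A flat profile does not record who called whom, so dividing one call count by the other only gives an upper bound on how often pmap_pdes_valid runs per pmap_do_remove. That said, the "total ms/call" for pmap_do_remove matches the combined self time of both functions, which is at least consistent with pmap_pdes_valid being called from underneath it.

```python
# Values copied from the gprof flat profile of the stalled run.
pdes_valid_calls = 1_002_719_489   # calls to pmap_pdes_valid
pdes_valid_self_s = 7.70           # self seconds in pmap_pdes_valid
do_remove_calls = 380              # calls to pmap_do_remove
do_remove_self_s = 4.26            # self seconds in pmap_do_remove

# Call-count ratio: an upper bound on pmap_pdes_valid calls per
# pmap_do_remove, since the flat profile has no caller information.
calls_per_remove = pdes_valid_calls / do_remove_calls

# Average cost of a single pmap_pdes_valid call, in nanoseconds.
ns_per_call = pdes_valid_self_s / pdes_valid_calls * 1e9

# If all of pmap_pdes_valid's time is charged to pmap_do_remove, this
# should reproduce the "total ms/call" column for pmap_do_remove.
total_ms_per_remove = (pdes_valid_self_s + do_remove_self_s) / do_remove_calls * 1e3

print(f"{calls_per_remove:,.0f} pmap_pdes_valid calls per pmap_do_remove")
print(f"{ns_per_call:.2f} ns per pmap_pdes_valid call")
print(f"{total_ms_per_remove:.2f} ms total per pmap_do_remove")
```

That works out to roughly 2.6 million validity checks per removal, each one cheap (a few nanoseconds), so the time is going into the sheer number of calls rather than into any single slow one.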
In contrast, a test case that _doesn't_ pause results in the following
gprof output:
  %   cumulative     self                 self    total
 time    seconds  seconds       calls  ms/call  ms/call  name
51.52       0.17     0.17       20575     0.01     0.01  copyout
36.36       0.29     0.12        6860     0.02     0.02  copyin
Since nearly all of the time in the stalled case goes to pmap_pdes_valid and
pmap_do_remove, I'm guessing that the code that implements the four-level
page table is causing the hangs.
Comments? Thoughts?
-- Chris
GPG key FEB9DE7F (91AF 4534 4529 4BCC 31A5 938E 023E EEFB FEB9 DE7F)