Subject: kern/10765: long "freeze" when killing processes that cause heavy paging
To: None <gnats-bugs@gnats.netbsd.org>
From: Simon Burge <simonb@wasabisystems.com>
List: netbsd-bugs
Date: 08/06/2000 00:51:18
>Number: 10765
>Category: kern
>Synopsis: long system freeze when killing processes that cause heavy paging
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Aug 06 00:52:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Simon Burge
>Release: NetBSD-current 20000806 sources
>Organization:
Wasabi Systems
>Environment:
System: NetBSD wincen 1.5D NetBSD 1.5D (WINCEN) #328: Sun Aug 6 13:51:23 EST 2000
simonb@wincen:/usr/obj/sys/arch/i386/compile/WINCEN i386
>Description:
The system appears to freeze for a while (90 seconds in my case)
when killing processes that cause heavy paging. When the
system becomes responsive again, other similar processes don't
continue.
>How-To-Repeat:
On an i386 with 128MB of RAM (116MB available), run three copies
of the following program with 24576 as an argument:
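#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	char **foo;
	int i, size;

	/* Seed per process so each copy touches chunks in a different order. */
	srand((unsigned)(getpid() ^ time(NULL)));
	if (argc > 1)
		size = atoi(argv[1]);
	else
		size = 0;
	if (size > 0) {
		foo = malloc(size * sizeof(char *));
		if (foo == NULL)
			errx(1, "no memory");
		/* Allocate `size' 4096-byte chunks and touch each one once. */
		for (i = 0; i < size; i++) {
			foo[i] = malloc(4096);
			if (foo[i] == NULL)
				errx(1, "no memory");
			foo[i][0] = 0;
		}
		/* Touch random chunks forever to keep the system paging. */
		while (1) {
			i = rand() % size;
			foo[i][0] = 0;
		}
	}
	return 0;
}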
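As a rough size estimate (not part of the original report), the arithmetic below suggests why this workload pages so heavily. It is only a sketch: the helper names are hypothetical, and it assumes each 4096-byte chunk costs about 4096 bytes of memory, ignoring malloc bookkeeping and the pointer array.

```c
/* Hypothetical helpers; they only restate the numbers given in the report. */

/*
 * Bytes one copy touches: `chunks' allocations of `chunk_size' bytes each.
 * Assumption: allocator overhead and the pointer array are ignored.
 */
unsigned long
hog_bytes(unsigned long chunks, unsigned long chunk_size)
{
	return chunks * chunk_size;
}

/* Total demand in whole MB (2^20 bytes) for `copies' identical processes. */
unsigned long
total_demand_mb(unsigned long copies, unsigned long chunks,
    unsigned long chunk_size)
{
	return copies * hog_bytes(chunks, chunk_size) / (1024UL * 1024UL);
}
```

With the values from the report, total_demand_mb(3, 24576, 4096) evaluates to 288, roughly two and a half times the 116MB available, so most of each process's memory has to live in swap.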
When one of these is killed, the system appears to wedge. From
DDB, the trace of the killed process is:
db> t/t 0t288
cpu_Debugger(c0570c60,c0776540,c0570540,c01233ed,0) at cpu_Debugger+0x4
comintr(c0580a00) at comintr+0xcd
Xintr4() at Xintr4+0x70
--- interrupt ---
extent_free(c0570540,176ee,1,10) at extent_free+0x16e
uvm_swap_free(176ef,1,c03c5498,c9c0de74,c01e4898) at uvm_swap_free+0x5b
uvm_anon_dropswap(c9abcd00,463,c9bf44dc,c9c0de88,c01e3f23) at uvm_anon_dropswap+0x16
uvm_anfree(c9abcd00) at uvm_anfree+0x78
amap_wipeout(c9bf44dc,c9bf3ccc,c9bf3ccc,0,c9c0debc) at amap_wipeout+0x3b
amap_unref(c9bf3ccc,0) at amap_unref+0x1a
uvm_unmap_detach(c9bf3f0c,0,c995ec38,f,c9bf3f0c) at uvm_unmap_detach+0x31
uvm_unmap(c995ec38,0,bfbfe000,c995ec38,c9c0df1c) at uvm_unmap+0xb3
uvm_deallocate(c995ec38,0,bfbfe000) at uvm_deallocate+0x38
exit1(c9be1968,f,f,c9be1968,c9be8354) at exit1+0x13f
sigexit(c9be1968,f,c9be1968,8048a0c,106) at sigexit+0x9e
postsig(f) at postsig+0xab
trap() at trap+0x526
--- trap (number 6) ---
0x8048a0c:
After approximately 90 seconds the system comes back to life, but the
two remaining memory hog processes don't continue. A trace of
one of these is:
db> t/t 0t284
trace: pid 284 at 0xc9c03d88
bpendtsleep(c05e3558,11,c02419f5,0,0) at bpendtsleep
biowait(c05e3558,2,13d5b,c9c03f40,9ead8) at biowait+0x31
uvm_swap_io(c9c03e20,13d5b,1,100000,c0410208) at uvm_swap_io+0x226
uvm_swap_get(c0410208,13d5b,2,c9c03ee0,c9c03f0c) at uvm_swap_get+0x51
uvmfault_anonget(c9c03f40,c9bf439c,c9ad4db0,d98f000,0) at uvmfault_anonget+0x188
uvm_fault(c995ea10,d98f000,0,3,0) at uvm_fault+0x989
trap() at trap+0x409
--- trap (number 6) ---
0x8048a0c:
Killing the remaining memory hogs each results in a similar
"freeze".
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: