Subject: Re: unified buffers and responsibility
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Milos Urbanek <urbanek@openbsd.cz>
List: tech-kern
Date: 06/13/2002 13:05:29
On Wed, Jun 12, 2002 at 10:55:56PM +0200, Manuel Bouyer wrote:
> > >
> > > Has anyone idea of what is causing this ?
> > > I suspect it could be disksort, which cause a single request to one end of
> > > the disk be delayed to group dozen of requests on the other end of the disk.
> >
> > if so - apps that doesn't use disk at all for sure (like text editor)
> > should work fine. but it isn't - delays are small but noticable (up to 1
> > second).
>
> it looks like file activity is still pushing out some data from RAM,
> even if it should not.
I have similar problems on NetBSD 1.5ZC: backing up the /home directory
to a tgz file on the same disk leads to a hard lockup, and so does
'cp big_file somewhere'. I can't tell you whether there is anything at
the ddb prompt, because the whole machine is completely unresponsive and
it was running X every time this happened.
The response time of other apps is slow as well during any
copy/tar/whatever 'larger' disk I/O, including the window manager, X
itself, xterms and other apps, not only those like netscape.
I suspect some pages of those apps are going to swap during the intensive
buffer cache operations, but I do not have any quantitative measurements
of how many times the page daemon has woken up.
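One way such a measurement could be collected is sketched below. It is only an illustration: it assumes `vmstat -s` prints a "times daemon wokeup" line as in the outputs quoted later in this message, and the helper names `daemon_wakeups` and `wakeups_during` are made up for the example.

```python
import re
import subprocess

# Label as printed by `vmstat -s` in the outputs quoted in this message.
WAKEUP_RE = re.compile(r"^\s*(\d+)\s+times daemon wokeup\s*$", re.MULTILINE)


def daemon_wakeups(vmstat_text):
    """Return the page daemon wake-up counter found in `vmstat -s` output."""
    match = WAKEUP_RE.search(vmstat_text)
    if match is None:
        raise ValueError("no 'times daemon wokeup' line found")
    return int(match.group(1))


def read_vmstat():
    """Run `vmstat -s` and return its standard output as text."""
    result = subprocess.run(
        ["vmstat", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wakeups_during(command):
    """Run `command` (a list of arguments) and return how many times the
    page daemon woke up while it was running (hypothetical helper)."""
    before = daemon_wakeups(read_vmstat())
    subprocess.run(command, check=True)
    after = daemon_wakeups(read_vmstat())
    return after - before
```

Calling `wakeups_during(["gzip", "-d", "netbsd.0.core.gz"])` would then give the number of wake-ups attributable to that one command, provided nothing else on the machine is putting pressure on memory at the same time.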
Hmm. Now it happened again, while unpacking the memory image that was
generated after the crash a week ago. So I can append the useful parts
of the vmstat output.
Other symptoms:
top shows about
Memory: 68M Act, 34M Inact, 3744K Wired, 5084K Exec, 52M File, 268K Free
Swap: 513M Total, 23M Used, 491M Free
during the command
gzip -d netbsd.0.core in /var/crash
Immediately after the command completes, the statistics are as follows:
Memory: 62M Act, 23M Inact, 3904K Wired, 5084K Exec, 35M File, 18M Free
Swap: 513M Total, 23M Used, 491M Free
When I compare the vmstat output taken after the first 'gzip' command
with the output taken after another two 'gzip' commands were run, I get
the following:
vmstat -ms after the first command gzip netbsd.0.core.gz:
In use 2889K, total allocated 3784K; utilization 76.3%
4096 bytes per page
8 page colors
31335 pages managed
3554 pages free
15590 pages active
6909 pages inactive
0 pages paging
979 pages wired
0 zero pages
1 reserve pagedaemon pages
5 reserve kernel pages
12478 anonymous pages
9605 cached file pages
1296 cached executable pages
64 minimum free pages
85 target free pages
8671 target inactive pages
10445 maximum wired pages
1 swap devices
131417 swap pages
5427 swap pages in use
1205 swap allocations
160711 anons
143148 free anons
2958228 total faults taken
105231261 traps
32436282 device interrupts
149273095 cpu context switches
102736567 software interrupts
631199647 system calls
1138 pagein requests
463 pageout requests
195 swap ins
211 swap outs
0 pages swapped in
6849 pages swapped out
5179 forks total
932 forks blocked parent
945 forks shared address space with parent
0 pagealloc zero wanted and avail
2519223 pagealloc zero wanted and not avail
0 aborts of idle page zeroing
2689138 pagealloc desired color avail
4333 pagealloc desired color not avail
7 faults with no memory
0 faults with no anons
0 faults had to wait on pages
0 faults found released page
6653 faults relock (6652 ok)
268366 anon page faults
1137 anon retry faults
63038 amap copy faults
83360 neighbour anon page faults
611514 neighbour object page faults
200225 locked pager get faults
5515 unlocked pager get faults
223744 anon faults
44054 anon copy on write faults
176321 object faults
23904 promote copy faults
2468048 promote zero fill faults
313 times daemon wokeup
225 revolutions of the clock hand
225 times daemon attempted swapout
9 pages freed by daemon
211785 pages scanned by daemon
6750 anonymous pages scanned by daemon
55320 object pages scanned by daemon
89969 pages reactivated
0 pages found busy by daemon
463 total pending pageouts
238759 pages deactivated
20870664 total name lookups
cache hits (88% pos + 5% neg) system 1% per-process
deletions 0%, falsehits 0%, toolong 0%
vmstat -ms after another two commands gzip [-d] netbsd.0.core.gz:
In use 2901K, total allocated 3700K; utilization 78.4%
4096 bytes per page
8 page colors
31335 pages managed
4339 pages free
15938 pages active
5803 pages inactive
0 pages paging
979 pages wired
0 zero pages
1 reserve pagedaemon pages
5 reserve kernel pages
12341 anonymous pages
9009 cached file pages
1271 cached executable pages
64 minimum free pages
85 target free pages
8686 target inactive pages
10445 maximum wired pages
1 swap devices
131417 swap pages
5804 swap pages in use
1253 swap allocations
160711 anons
142875 free anons
2986284 total faults taken
105333020 traps
32526681 device interrupts
149362353 cpu context switches
102815088 software interrupts
631704542 system calls
1138 pagein requests
497 pageout requests
264 swap ins
280 swap outs
0 pages swapped in
7307 pages swapped out
5190 forks total
933 forks blocked parent
946 forks shared address space with parent
0 pagealloc zero wanted and avail
2520483 pagealloc zero wanted and not avail
0 aborts of idle page zeroing
2774459 pagealloc desired color avail
5958 pagealloc desired color not avail
7 faults with no memory
0 faults with no anons
0 faults had to wait on pages
0 faults found released page
7072 faults relock (7071 ok)
272969 anon page faults
1137 anon retry faults
63191 amap copy faults
84322 neighbour anon page faults
614747 neighbour object page faults
201536 locked pager get faults
5934 unlocked pager get faults
228230 anon faults
44171 anon copy on write faults
177582 object faults
23954 promote copy faults
2469138 promote zero fill faults
563 times daemon wokeup
475 revolutions of the clock hand
475 times daemon attempted swapout
9 pages freed by daemon
440512 pages scanned by daemon
7160 anonymous pages scanned by daemon
125233 object pages scanned by daemon
139412 pages reactivated
0 pages found busy by daemon
497 total pending pageouts
474257 pages deactivated
20875826 total name lookups
cache hits (88% pos + 5% neg) system 1% per-process
deletions 0%, falsehits 0%, toolong 0%
So there are about another 250 wake-ups of the page daemon, about
another 70 swap-outs, and so on.
I think the sysctl values for the buffer cache should be tuned a bit, or
is there another solution? By the way, the lockup could be related to
frequent swap-ins/outs: it happens to me only when my machine has been
swapping for long enough.
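The comparison of the two vmstat outputs can also be done mechanically. The following is a minimal sketch, not a finished tool: it assumes every counter line has the form "<number> <label>", and the names `parse_vmstat` and `diff_counters` are invented for the example. The sample values are copied from the two outputs above.

```python
import re

# A counter line looks like "   313 times daemon wokeup".
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S.*?)\s*$")


def parse_vmstat(text):
    """Map each counter label to its integer value; other lines are skipped."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def diff_counters(before, after):
    """Return {label: delta} for counters present in both samples that changed.

    Labels that embed a changing number, such as "faults relock (6652 ok)",
    differ between samples and are therefore not matched by this sketch.
    """
    return {
        label: after[label] - before[label]
        for label in before
        if label in after and after[label] != before[label]
    }


BEFORE = """
      211 swap outs
     6849 pages swapped out
      313 times daemon wokeup
        9 pages freed by daemon
   238759 pages deactivated
"""

AFTER = """
      280 swap outs
     7307 pages swapped out
      563 times daemon wokeup
        9 pages freed by daemon
   474257 pages deactivated
"""

deltas = diff_counters(parse_vmstat(BEFORE), parse_vmstat(AFTER))
# Largest changes first.
for label, delta in sorted(deltas.items(), key=lambda item: -abs(item[1])):
    print(f"{delta:+10d}  {label}")
```

Feeding it the complete saved outputs instead of the five-line excerpts would list every counter that moved between the two runs.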
Milos
PS:
I have something like
vm.nkmempages = 8163
vm.anonmin = 10
vm.execmin = 5
vm.filemin = 10
vm.maxslp = 20
vm.uspace = 8192
vm.anonmax = 80
vm.execmax = 30
vm.filemax = 50
and the kernel is derived directly from GENERIC after commenting out a
few drivers.
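To relate the vm.*min and vm.*max settings above to the page counts in the first vmstat output, here is a rough sketch. It rests on two assumptions: that these sysctl values are percentages of memory, and that "pages managed" is a reasonable base for them. The kernel may compute the limits against a different total, so treat the result as an approximation only. The `classify` helper is made up for the example.

```python
# Values copied from the first vmstat output and the sysctl list above.
PAGES_MANAGED = 31335

usage_pages = {"anon": 12478, "file": 9605, "exec": 1296}

# (min, max) percentages from vm.anonmin/anonmax, vm.filemin/filemax,
# vm.execmin/execmax.
limits_percent = {"anon": (10, 80), "file": (10, 50), "exec": (5, 30)}


def classify(pages, total, low, high):
    """Return (share in percent, state) for one usage type.

    Assumption: low and high are percentages of `total` pages.
    """
    share = 100.0 * pages / total
    if share < low:
        state = "below min"
    elif share > high:
        state = "above max"
    else:
        state = "within range"
    return share, state


report = {
    kind: classify(pages, PAGES_MANAGED, *limits_percent[kind])
    for kind, pages in usage_pages.items()
}

for kind, (share, state) in report.items():
    print(f"{kind}: {share:.1f}% of managed pages ({state})")
```

Under those assumptions, this shows which usage types sit below their minimum or above their maximum at the moment the sample was taken, which is where tuning the values would be expected to have an effect.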
NetBSD 1.5ZC (OAKLAND) #12: Fri Apr 19 09:24:26 UTC 2002
root@oakland:/usr/src/sys/arch/i386/compile/OAKLAND
cpu0: AMD Duron (686-class), 800.10 MHz
cpu0: I-cache 64 KB 64b/line 2-way, D-cache 64 KB 64b/line 2-way
cpu0: L2 cache 64 KB 64b/line 16-way
cpu0: features 183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features 183f9ff<PGE,MCA,CMOV,FGPAT,PSE36,MMX>
cpu0: features 183f9ff<FXSR>
total memory = 127 MB
avail memory = 114 MB
using 1658 buffers containing 6632 KB of memory
BIOS32 rev. 0 found at 0xfb310
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: VIA Technologies VT8363 KT133 System Controller (rev. 0x03)
agp0 at pchb0: aperture at 0xd0000000, size 0x10000000
ppb0 at pci0 dev 1 function 0: VIA Technologies VT8363 KT133 PCI to AGP Bridge (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 0 function 0: ATI Technologies Rage XL (AGP) (rev. 0x65)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
wsdisplay0: screen 1-7 added (80x25, vt100 emulation)
pcib0 at pci0 dev 7 function 0
pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
pciide0 at pci0 dev 7 function 1: VIA Technologies VT82C686A (Apollo KX133) ATA100 controller
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-371350>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 12949 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 26520480 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
pciide0: secondary channel configured to compatibility mode
atapibus0 at pciide0 channel 1: 2 targets
cd0 at atapibus0 drive 0: <CRD-8482B, , 1.05> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
pciide0: secondary channel interrupting at irq 15
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
uhci0 at pci0 dev 7 function 2: VIA Technologies VT83C572 USB Controller (rev. 0x16)
uhci0: interrupting at irq 9
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3: VIA Technologies VT83C572 USB Controller (rev. 0x16)
uhci1: interrupting at irq 9
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pchb1 at pci0 dev 7 function 4
pchb1: VIA Technologies VT82C686A SMBus Controller (rev. 0x40)
auvia0 at pci0 dev 7 function 5: VIA VT82C686A AC'97 Audio (rev 0x50)
auvia0: interrupting at irq 5
auvia0: ICE17 codec; headphone, 18 bit DAC, 18 bit ADC, Unknown 3D
audio0 at auvia0: full duplex, mmap, independent
ex0 at pci0 dev 18 function 0: 3Com 3c905C-TX 10/100 Ethernet with mngmt (rev. 0x74)
ex0: interrupting at irq 11
ex0: MAC address 00:01:02:db:4f:a5
ukphy0 at ex0 phy 24: Generic IEEE 802.3u media interface
ukphy0: Broadcom 3c905C internal PHY (OUI 0x000818, model 0x0017), rev. 6
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask e745 netmask ef45 ttymask ffc7
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
IP Filter: v3.4.25 initialized. Default = pass all, Logging = enabled
>
> --
> Manuel Bouyer <bouyer@antioche.eu.org>
> --
>
--