NetBSD-Users archive
zfs pool behavior - is it ever freed?
I'm having trouble with zfs causing a system to run out of memory, when
I think it should work ok. I have tried to err on the side of TMI.
I have a semi-old computer (2010) that is:
netbsd-10
amd64
8GB RAM
1T SSD
cpu0: "Pentium(R) Dual-Core CPU E5700 @ 3.00GHz"
cpu1: "Pentium(R) Dual-Core CPU E5700 @ 3.00GHz"
and it basically works fine, besides being a bit slow by today's
standards. I am using it as a build machine and fileserver, working toward
eventually running pbulk, either in domUs or chroots. I have recently
moved 2 physical machines (netbsd-9 i386 and amd64) to domUs; I use
these to build packages for production use. (The machines are 2006 and
2008 mac notebooks, with painfully slow spinning disks and 4G of RAM
each -- but they work.)
wd0 has a disklabel, with / and /usr as normal FFSv2 (a and e), normal
swap on wd0b. wd0f is defined as most of the disk, and is the sole
component of tank0:
#> zpool status
  pool: tank0
 state: ONLINE
  scan: scrub repaired 0 in 0h8m with 0 errors on Tue Jul 4 20:31:03 2023
config:

        NAME                   STATE     READ WRITE CKSUM
        tank0                  ONLINE       0     0     0
          /etc/zfs/tank0/wd0f  ONLINE       0     0     0

errors: No known data errors
I have a bunch of filesystems, for various pkgsrc branches (created from
snapshots), etc:
NAME USED AVAIL REFER MOUNTPOINT
tank0 138G 699G 26K /tank0
tank0/b0 6.16G 699G 6.16G /tank0/b0
tank0/ccache 24.1G 699G 24.1G /tank0/ccache
tank0/distfiles 35.1G 699G 35.1G /tank0/distfiles
tank0/n0 31.5K 699G 31.5K /tank0/n0
tank0/obj 3.48G 699G 3.48G /tank0/obj
tank0/packages 7.27G 699G 7.27G /tank0/packages
tank0/pkgsrc-2022Q1 130M 699G 567M /tank0/pkgsrc-2022Q1
tank0/pkgsrc-2022Q2 145M 699G 569M /tank0/pkgsrc-2022Q2
tank0/pkgsrc-2022Q3 194M 699G 566M /tank0/pkgsrc-2022Q3
tank0/pkgsrc-2022Q4 130M 699G 573M /tank0/pkgsrc-2022Q4
tank0/pkgsrc-2023Q1 147M 699G 582M /tank0/pkgsrc-2023Q1
tank0/pkgsrc-2023Q2 148M 699G 583M /tank0/pkgsrc-2023Q2
tank0/pkgsrc-current 10.3G 699G 1.14G /tank0/pkgsrc-current
tank0/pkgsrc-wip 623M 699G 623M /tank0/pkgsrc-wip
tank0/u0 1.91M 699G 1.91M /tank0/u0
tank0/vm 49.5G 699G 23K /tank0/vm
tank0/vm/n9-amd64 33.0G 722G 10.1G -
tank0/vm/n9-i386 16.5G 711G 4.38G -
tank0/ztmp 121M 699G 121M /tank0/ztmp
which all feels normal to me.
I used to boot this as GENERIC. Now I'm booting Xen with a 4G dom0:
menu=GENERIC:rndseed /var/db/entropy-file;boot netbsd
menu=GENERIC single user:rndseed /var/db/entropy-file;boot netbsd -s
menu=Xen:load /netbsd-XEN3_DOM0.gz root=wd0a rndseed=/var/db/entropy-file console=pc;multiboot /xen.gz dom0_mem=4096M
menu=Xen single user:load /netbsd-XEN3_DOM0.gz root=wd0a rndseed=/var/db/entropy-file console=pc -s;multiboot /xen.gz dom0_mem=4096M
menu=GENERIC.ok:rndseed /var/db/entropy-file;boot netbsd
menu=Drop to boot prompt:prompt
default=3
timeout=5
clear=1
I find that after doing things like cvs update in pkgsrc, I have a vast
amount of memory in pools:
Memory: 629M Act, 341M Inact, 16M Wired, 43M Exec, 739M File, 66M Free
Swap: 16G Total, 16G Free / Pools: 3372M Used
(With dom0_mem=4096M, having 3372M tied up in kernel pools leaves well
under 1G for everything else, which matches the small Free figure above.)
vmstat -m, sorted by Npage and showing only pools with Npage > 10000:
Memory resource pool statistics
Name Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
zio_buf_16384 16384 57643 1 53341 33786 22341 11445 30831 0 inf 7143
zio_buf_2560 2560 18636 0 17890 15244 2467 12777 12777 0 inf 12031
ffsdino2 264 540607 0 348374 28691 15875 12816 13522 0 inf 0
zfs_znode_cache 248 245152 0 206469 13015 18 12997 13015 0 inf 665
ffsino 280 540249 0 348016 30887 17156 13731 14488 0 inf 0
zio_buf_2048 2048 36944 0 36004 15617 599 15018 15026 0 inf 14259
zio_buf_1536 2048 41491 0 40737 18313 6 18307 18313 0 inf 17657
zio_buf_1024 1536 55808 0 54191 22942 357 22585 22942 0 inf 21442
dmu_buf_impl_t 216 538828 0 440673 23016 11 23005 23016 0 inf 380
arc_buf_hdr_t_f 208 657474 0 556468 25273 638 24635 25096 0 inf 7913
zio_data_buf_51 1024 187177 0 157005 45575 14127 31448 45575 0 inf 10220
vcachepl 640 266639 0 56918 34959 2 34957 34958 0 inf 1
dnode_t 640 576198 0 485522 70645 9470 61175 70645 0 inf 11511
zio_buf_512 1024 848240 0 798838 141743 15535 126208 128224 0 inf 96759
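For reference, a filtered listing like the above can be produced with
something along these lines (assuming Npage is the 8th
whitespace-separated field of vmstat -m output; other sections of the
output might need filtering out as well):

  vmstat -m | awk '$8 + 0 > 10000' | sort -n -k 8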
sysctl shows the ARC at about 270 MB:
kstat.zfs.misc.arcstats.size = 283598992
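To see what the ARC itself thinks its target is, something like this
should work, assuming the usual arcstats names (c, c_min, c_max) are
exported alongside .size, which I believe they are:

  sysctl kstat.zfs.misc.arcstats | egrep '\.(size|c|c_min|c_max) ='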
If I continue to do things, the system locks up and needs to have the
reset button pushed. I'm now trying an external tickle watchdog with a
script that does sync/tickle/sleep-60, in the hope that it will reboot
when sync starts hanging. My memory is that this happens with non-xen
too, but it takes longer.
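Roughly, the script is the following (with <timer> being whatever
wdogctl(8) lists here, armed beforehand with something like
"wdogctl -e -p 120 <timer>"):

  #!/bin/sh
  # crude deadman: if sync ever blocks for good we stop tickling,
  # and the hardware watchdog should reset the box
  while :; do
          sync
          wdogctl -t
          sleep 60
  done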
Other than the lockups, zfs behaves as I expect it to.
So what I don't understand is:
Is there any mechanism to cause zfs (guessing ARC) to limit the amount
of memory in use?
Is there any mechanism to cause zfs to free ARC during memory pressure?
Do people think this is a xen/zfs interaction bug that doesn't happen
in non-xen?
Basically, especially with an SSD, ARC is not such a win, and ARC causing
the machine to run out of memory is dysfunctional.
questions:
Have I misconfigured/mis-used zfs?
Is there really no reclaiming under pressure?
Is there some way to limit ARC to say 1 GB?
Why isn't x% of memory a default limit, if there's no functioning
reclaim under memory pressure?
Are others having this problem?
Greg