NetBSD-Users archive
NPF/interface tuning? shell unusable on gateway
A week ago I switched my router, on the same hardware, from a different
operating system to NetBSD/amd64 9.2.
It is running a simple NAT gateway using NPF and also runs dhcpd and
unbound for the internal LAN.
Periodically my shells on this new NetBSD router become unusable -- too
slow to type.
The interfaces are:
re0 is my WAN
re0 at pci2 dev 0 function 0: RealTek 8168/8111 PCIe Gigabit Ethernet (rev. 0x03)
re0: interrupting at msix1 vec 0
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8211B 1000BASE-T media interface
re1 is my LAN
re1 at pci3 dev 1 function 0: RealTek 8169/8110 Gigabit Ethernet (rev. 0x10)
re1: interrupting at ioapic0 pin 16
re1: using 256 tx descriptors
rgephy1 at re1 phy 7: RTL8211C 1000BASE-T media interface
I can reproduce the problem by starting an rsync (over ssh) within my
LAN transferring to or from outside. I can also reproduce by running
"speedtest-cli" within my LAN.
I cannot reproduce the problem by doing the rsync or speedtest-cli
directly on the NetBSD router itself. So it appears not to be the NAT
nor the WAN interface.
While my NetBSD router shell is unusable, I can still use remote SSH
shells fine, so traffic over the NAT and over the WAN is okay. That is
the part that confuses me. Even the ssh shell on the remote host that I
am rsyncing to or from stays usable while the NetBSD gateway shell is
unusable (at the same time).
CPU load is low when the problem occurs.
With rsync across my gateway, if I use --bwlimit 1400k, the problem is
noticeable but the shell is somewhat usable. With --bwlimit 1500k or
faster, the shell is unusable.
I tried to watch with systat ifstat. It appears to hang when re1 out
(to my LAN) reaches around 10 Mbit/s to 11 Mbit/s. One time
"systat ifstat 0.01" showed it hung at out 10.883 Mb/s, peak:
12.196 Mb/s. (But since it hangs, the display may not have updated in
time.)
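To put the rsync limits and the systat figures in the same units, here is a rough conversion. It is only a sketch: it assumes the "k" suffix of --bwlimit means units of 1024 bytes per second, and the function name is just for illustration.

```python
# Convert an rsync --bwlimit value to Mbit/s.
# Assumption: the "k" suffix means 1024 bytes per second.

def bwlimit_to_mbits(kib_per_sec):
    """Return the limit in megabits per second (10^6 bits)."""
    bytes_per_sec = kib_per_sec * 1024
    return bytes_per_sec * 8 / 1_000_000

for limit in (1400, 1500):
    print(f"--bwlimit {limit}k ~= {bwlimit_to_mbits(limit):.2f} Mbit/s")
```

Under that assumption the two limits come out at roughly 11.5 and 12.3 Mbit/s, which is in the same range as the re1 figures where systat appeared to hang.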
The shell hangs immediately when doing the rsync. When I suspend the
rsync, my shell recovers in about 10 seconds. I could reproduce this
many times.
speedtest-cli over LAN shows Download: 6.34 Mbit/s
systat ifstat 0.01 shows peak 24.312 Mb/s
another speedtest-cli run over LAN Download: 9.95 Mbit/s
systat peak 20.981 Mb/s
A speedtest-cli run over the LAN using the same hardware and the same
interfaces, but a different operating system, gave Download: 62.72
Mbit/s. That was six months ago, though, with a different target "best
server".
I can also get 18.816 Mb/s of traffic from the gateway (not over NAT nor
WAN) to the LAN, and the NetBSD gateway shell is still usable but
noticeably laggy. So 1.5 times more bandwidth. So maybe it is the NPF
NAT that is the problem.
My npf.conf is:
$ext_if = "re0"
$int_if = "re1"
$ext_addrs = { ifaddrs($ext_if) }
$localnet = { 172.16.1.0/24 }
alg "icmp"
map inet4($ext_if) dynamic $localnet -> inet4($ext_if)
group "external" on $ext_if {
        pass stateful out all
        block in all
}
group "internal" on $int_if {
        pass final all
}
group default {
        pass final on lo0 all
        block all
}
I am unsure whether NPF is the problem. Maybe my interface has a
problem, but I could log in and use the shell on this system locally
without trouble many times before I put NetBSD on it.
Any suggestions on tuning so my shell on the router is usable?
Here is "systat vmstat 0.01" when it hangs:
4 users Load 0.12 0.05 0.05 Sat Mar 26 18:31:58
Proc:r d s Csw Traps SysCal Intr Soft Fault PAGING SWAPPING
1 6 114 1193 1200 1000 in out in out
ops
14.3% Sy 0.0% Us 0.0% Ni 3.6% In 82.1% Id pages
| | | | | | | | | | |
=======%% forks
fkppw
Anon 130180 4% zero 302356 1250 Interrupts fksvm
Exec 24816 % wired 24 TLB shootdown pwait
File 1831888 61% inact 671384 100 cpu0 timer relck
Meta 409088 % bufs 89448 336 ioapic0 pin 16 rlkok
(kB) real swaponly free ioapic0 pin 18 noram
Active 1315476 331500 814 msix1 vec 0 ndcpy
Namei Sys-cache Proc-cache ioapic0 pin 23 fltcp
Calls hits % hits % ioapic0 pin 19 zfod
6 6 100 cow
512 fmin
Disks: sd0 wd0 dk0 dk1 682 ftarg
seeks itarg
xfers flnan
bytes pdfre
%busy
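The interrupt column in the screen above can be matched against the dmesg lines earlier in this message (re1 on ioapic0 pin 16, re0 on msix1 vec 0). A rough tally follows, assuming I am reading the flattened columns correctly. The labels in the dictionary are mine, not systat's.

```python
# Interrupt counts as they appear in the systat vmstat screen above.
# Device mapping taken from the dmesg lines:
#   re1 (LAN) interrupts at ioapic0 pin 16, re0 (WAN) at msix1 vec 0.
interrupts = {
    "cpu0 timer": 100,
    "ioapic0 pin 16 (re1, LAN)": 336,
    "msix1 vec 0 (re0, WAN)": 814,
}

total = sum(interrupts.values())
for source, count in interrupts.items():
    share = 100 * count / total
    print(f"{source:28s} {count:5d} {share:5.1f}%")
print(f"{'total':28s} {total:5d}")
```

The three sources add up to the 1250 shown on the Interrupts line, so nearly all of the interrupts at that moment come from the two re interfaces.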
Any suggestions on how I can better diagnose this?