Subject: kern/27166: ``Invalid argument'' loading ipfilter 4.1.3 rules
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <carton@Ivy.NET>
List: netbsd-bugs
Date: 10/06/2004 17:09:07
>Number: 27166
>Category: kern
>Synopsis: ``Invalid argument'' loading ipfilter 4.1.3 rules
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Oct 06 17:10:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator: Miles Nordin
>Release: NetBSD 2.0_BETA 2004-08-15
>Organization:
Ivy Ministries
>Environment:
locally pulled up most kernel changes listed in doc/CHANGES-2.0 for 2.0_RC2
netinet/fil.c 1.61.2.7 pr#26666 t#783, t#888
kern/uipc_mbuf.c 1.80.2.4 pr#26733 t#831, and t#841
sys/mbuf.h 1.90.2.4 pr#26733 t#831, and t#839
netinet/ip_fil_netbsd.c 1.3.2.10 pr#26733 t#833
netinet6/raw_ip6.c 1.63.2.2 pr#26733 t#836
kern/kern_lock.c 1.75.2.1 t#752
lib/libkern/arc4random.c 1.11.2.1 t#824
nfs/nfs_bio.c 1.116.2.2 t#858
nfs/nfs_subs.c 1.132.2.3 t#858, t#889
nfs/nfs_var.h 1.42.2.2 t#858
nfs/nfsnode.h 1.46.2.2 t#858
ufs/ufs/ufs_bmap.c 1.28.2.2 t#859
sys/netinet/tcp_input.c 1.190.2.6 t#861
sys/netinet/tcp_subr.c 1.160.2.5 t#861
sys/netinet/tcp_var.h 1.106.2.2 t#861
System: NetBSD lucette 2.0_BETA NetBSD 2.0_BETA (LUCETTE-$Revision: 1.1 $) #4: Mon Oct 4 23:44:38 EDT 2004 carton@castrovalva:/scratch/src/sys/arch/sparc64/compile/LUCETTE sparc64
Architecture: sparc64
Machine: sparc64
>Description:
$ sudo /etc/rc.d/ipfilter reload
Reloading ipfilter rules.
380:ioctl(add/insert rule): Invalid argument
386:ioctl(add/insert rule): Invalid argument
Set 1 now inactive
$
0. Note this is not a syntax error in ipf.conf, because /sbin/ipf parsed
the file and called into the kernel to load it, and in fact even
switched to the new ruleset.
1. If I don't edit /etc/ipf.conf, the lines where it encounters the error
don't change if I 'ipfilter reload' over and over. If I stop ipnat and
ipfilter and restart them, the error still doesn't change.
2. If I add a comment to /etc/ipf.conf, the error still happens on the
<n>th rule loaded, not the <n>th line of the file.
3. If I comment out a rule above the one where the error occurred, or if
I comment out the rule that caused the error, the error still happens
on the <n>th rule loaded. AFAICT it doesn't have to do with the
specific content of the rule.
4. If I comment out large numbers of rules, the errors move around
erratically, and I can get as many as 5 errors.
5. AFAICT the rules that load without error are blocking/passing traffic
just fine, and they really are loaded (228 rules in the file, 226 in the
kernel, i.e. all but the two that failed):
$ ( sudo ipfstat -il; sudo ipfstat -ol ) | wc -l
226
$ sed -e '/^#/d' -e '/^$/d' < /etc/ipf.conf | wc -l
228
6. Here is /etc/ipf.conf near where the error occurred:
$ awk '{ print FNR, " ", $0 }' < /etc/ipf.conf
[...]
377 pass out quick on gem1 proto icmp from 192.168.0.0/16 to any icmp-type echo keep state
378 pass out quick on tlp0 proto icmp from 192.168.0.0/16 to any icmp-type timest keep state
379 pass out quick on tlp2 proto icmp from 192.168.0.0/16 to any icmp-type timest keep state
380 pass out quick on gem1 proto icmp from 192.168.0.0/16 to any icmp-type timest keep state
381 pass out quick on tlp0 proto icmp from 192.168.0.0/16 to any icmp-type inforeq keep state
382 pass out quick on tlp2 proto icmp from 192.168.0.0/16 to any icmp-type inforeq keep state
383 pass out quick on gem1 proto icmp from 192.168.0.0/16 to any icmp-type inforeq keep state
384 pass out quick on tlp0 proto icmp from 192.168.0.0/16 to any icmp-type maskreq keep state
385 pass out quick on tlp2 proto icmp from 192.168.0.0/16 to any icmp-type maskreq keep state
386 pass out quick on gem1 proto icmp from 192.168.0.0/16 to any icmp-type maskreq keep state
387 block in quick on tlp0 proto icmp from any to 192.168.0.0/16 icmp-type echorep
388 block in quick on tlp2 proto icmp from any to 192.168.0.0/16 icmp-type echorep
389 block in quick on gem1 proto icmp from any to 192.168.0.0/16 icmp-type echorep
As you can see, the error doesn't occur on the last rules, and there are
very similar rules right after the one with the error that get loaded fine.
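
To re-check items 2 and 3 after each edit of ipf.conf, it may help to list
every rule with both its load ordinal and its file line number. The following
is only a minimal sketch, not part of ipfilter: the function name rule_index
is hypothetical, and it assumes (as the sed command in item 5 does) that every
line which is neither empty nor starts with '#' is exactly one rule.

```shell
#!/bin/sh
# rule_index FILE
# Print "<ordinal> <line-number> <rule text>" for every rule in FILE.
# Assumption: a rule is any line that is not empty and does not begin
# with '#', the same filter as the sed command in item 5.
rule_index() {
    awk '!/^#/ && !/^$/ { n++; print n, FNR, $0 }' "$1"
}

# Example (not run here): show the load ordinal of file line 380
#   rule_index /etc/ipf.conf | awk '$2 == 380'
```

Comparing the ordinal column before and after adding a comment or disabling a
rule shows whether the failing position follows the file line or the rule count.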
>How-To-Repeat:
Not totally sure this problem will persist after a reboot. I will amend
the PR after rebooting, but I can't reboot now.
The system has been somewhat busy in the past: more than 25,000 NAT state
entries, and another 'keep state' entry for each of those.
>Fix:
Workaround: delete blocks of obsolete rules and move order-independent
rules around until the error lands on a rule I don't care much about.
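
The reordering step of this workaround can be sketched as a small helper.
This is an illustration only: move_line is a hypothetical name, it works on
file line numbers (not rule ordinals), and it prints the result to stdout
rather than editing the file, so the output can be reviewed before it
replaces /etc/ipf.conf. TO must be between 1 and the number of lines in FILE.

```shell
#!/bin/sh
# move_line FILE FROM TO
# Print FILE with line FROM removed from its place and re-inserted
# immediately after original line TO. Hypothetical helper; only safe
# for rules whose position does not affect filtering.
move_line() {
    awk -v from="$2" -v to="$3" '
        { line[NR] = $0 }                       # buffer the whole file
        END {
            for (i = 1; i <= NR; i++) {
                if (i != from) print line[i]    # skip the moved line here
                if (i == to)   print line[from] # and emit it after line TO
            }
        }' "$1"
}
```

For example, "move_line /etc/ipf.conf 380 389 > /tmp/ipf.conf.new" would
place the rule from line 380 after line 389 in a scratch copy.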
>Release-Note:
>Audit-Trail:
>Unformatted: