Subject: kern/33490: panic:trap in network code during m_reclaim
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <kardel@netbsd.org>
List: netbsd-bugs
Date: 05/16/2006 08:25:00
>Number: 33490
>Category: kern
>Synopsis: panic:trap in network code during m_reclaim
>Confidential: yes
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue May 16 08:25:00 +0000 2006
>Originator: kardel@netbsd.org
>Release: NetBSD 3.0
>Organization:
>Environment:
System: NetBSD xxx 3.0 NetBSD 3.0 (XXX_ISDN) #3: Mon May 15 14:00:12 CEST 2006 kardel@Orcus:/usr/obj/sys/arch/i386/compile.i386/XXX_ISDN i386
Architecture: i386
Machine: i386
>Description:
I am seeing very frequent panics caused by problems
in m_reclaim. The machine is reachable from the Internet, which
may be why the problem is showing up now; the report is therefore
currently marked confidential (possible DoS).
It is stock 3.0.
Stack trace:
(gdb) where
#0 0x03f00000 in ?? ()
#1 0xc0430cbb in cpu_reboot (howto=256, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:751
#2 0xc038c644 in panic (fmt=0xc073ee1c "trap")
    at /usr/src/sys/kern/subr_prf.c:242
#3 0xc043b0a5 in trap (frame=0xc6df3a2c)
    at /usr/src/sys/arch/i386/i386/trap.c:336
#4 0xc0102cc7 in calltrap ()
#5 0xc0127ab1 in tcp_drain () at /usr/src/sys/netinet/tcp_subr.c:1287
#6 0xc039ef56 in m_reclaim (arg=0x0, flags=0)
    at /usr/src/sys/kern/uipc_mbuf.c:385
#7 0xc038c32c in pool_allocator_alloc (org=0xc0867640, flags=0)
    at /usr/src/sys/kern/subr_pool.c:2194
#8 0xc038ae49 in pool_get (pp=0xc0867640, flags=0)
    at /usr/src/sys/kern/subr_pool.c:899
#9 0xc038be8e in pool_cache_get_paddr (pc=0xc08672c0, flags=0, pap=0xc1335654)
    at /usr/src/sys/kern/subr_pool.c:1936
#10 0xc0249445 in ex_add_rxbuf (sc=0xc0b35000, rxd=0xc0b35b58)
    at /usr/src/sys/dev/ic/elinkxl.c:1782
#11 0xc024842c in ex_intr (arg=0xc0b35000)
    at /usr/src/sys/dev/ic/elinkxl.c:1297
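Read from the outermost frame inward, the trace above shows tcp_drain being reached from the ex interrupt handler by way of the pool allocator and m_reclaim. The following is only a minimal Python sketch for pulling that call chain out of a pasted gdb backtrace; the helper name call_chain and the regular expression are assumptions made for illustration and are not part of gdb or any NetBSD tool. The embedded frames are copied from the trace, with the separate "at file:line" continuation lines left out.

```python
import re

# Frames copied from the backtrace in this report; the standalone
# "at file:line" continuation lines are omitted for brevity.
BACKTRACE = """\
#0 0x03f00000 in ?? ()
#1 0xc0430cbb in cpu_reboot (howto=256, bootstr=0x0)
#2 0xc038c644 in panic (fmt=0xc073ee1c "trap")
#3 0xc043b0a5 in trap (frame=0xc6df3a2c)
#4 0xc0102cc7 in calltrap ()
#5 0xc0127ab1 in tcp_drain () at /usr/src/sys/netinet/tcp_subr.c:1287
#6 0xc039ef56 in m_reclaim (arg=0x0, flags=0)
#7 0xc038c32c in pool_allocator_alloc (org=0xc0867640, flags=0)
#8 0xc038ae49 in pool_get (pp=0xc0867640, flags=0)
#9 0xc038be8e in pool_cache_get_paddr (pc=0xc08672c0, flags=0, pap=0xc1335654)
#10 0xc0249445 in ex_add_rxbuf (sc=0xc0b35000, rxd=0xc0b35b58)
#11 0xc024842c in ex_intr (arg=0xc0b35000)
"""

# Hypothetical pattern: "#<n> 0x<addr> in <function> (" at the start of a line.
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+in\s+(\S+)\s*\(")


def call_chain(backtrace):
    """Return function names ordered from outermost caller to innermost frame."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    # gdb lists the innermost frame first, so reverse to get call order.
    return list(reversed(names))


chain = call_chain(BACKTRACE)
print(" -> ".join(chain))
```

The chain printed by this sketch starts with ex_intr and passes through ex_add_rxbuf, pool_cache_get_paddr, pool_get, pool_allocator_alloc and m_reclaim before reaching tcp_drain, where the trap is taken.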
Additional observations:
The system consumes a steadily growing number of mbufs
before crashing. Current mbuftrace output:
789 mbufs in use:
747 mbufs allocated to data
40 mbufs allocated to packet headers
2 mbufs allocated to socket names and addresses
4 calls to protocol drain routines
                          small    ext cluster
route           inuse         0      0       0
                claims    15840      0       0
                releases  15840      0       0
arp             inuse         0      0       0
                claims      591    514     514
                releases    591    514     514
unix            inuse         0      0       0
                claims     2330      6       6
                releases   2330      6       6
internet6       inuse         0      0       0
                claims     5283      0       0
                releases   5283      0       0
tcp             inuse         7      0       0
                claims      536      0       0
                releases    529      0       0
tcp rx          inuse         0      0       0
                claims     7955   7682    7682
                releases   7955   7682    7682
tcp tx          inuse         1      0       0
                claims    16922    258     258
                releases  16921    258     258
udp             inuse         0      0       0
                claims    20902      0       0
                releases  20902      0       0
                          small    ext cluster
udp rx          inuse         0      0       0
                claims    55516  29779   29779
                releases  55516  29779   29779
udp tx          inuse         0      0       0
                claims    17144    716     716
                releases  17144    716     716
internet rx     inuse         0      0       0
                claims    25687  25003   25003
                releases  25687  25003   25003
internet tx     inuse       588    568     568
                claims    24274    866     866
                releases  23686    298     298
lo0             inuse         0      0       0
                claims      498      0       0
                releases    498      0       0
ex0 rx          inuse         0      0       0
                claims    25517  25517   25517
                releases  25517  25517   25517
ex0 tx          inuse         0      0       0
                claims    29349    355     355
                releases  29349    355     355
nfs             inuse         2      0       0
                claims       48      0       0
                releases     46      0       0
                          small    ext cluster
unknown data    inuse       158    128     128
                claims    81621  25779   25779
                releases  81463  25651   25651
unknown header  inuse        33      0       0
                claims     9607      0       0
                releases   9574      0       0
unknown soname  inuse         0      0       0
                claims    42299      0       0
                releases  42299      0       0
unknown soopts  inuse         0      0       0
                claims     4659      0       0
                releases   4659      0       0
unknown control inuse         0      0       0
                claims    11084      0       0
                releases  11084      0       0
The count that keeps increasing is "internet tx": 588 small mbufs in use (24274 claims against 23686 releases).
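In the mbuftrace figures, each owner's "inuse" row equals its claims minus its releases, so owners holding on to mbufs can be picked out mechanically. What follows is a rough Python sketch of that check, not an existing tool: the function names parse_mbuftrace and outstanding are hypothetical, and the parser assumes the whitespace-separated layout shown in this report (owner name, row type, then the small, ext and cluster counts). Only a few rows from the output are embedded as sample data.

```python
# A few rows copied from the mbuftrace output in this report.
# Columns after the row type are: small, ext, cluster.
SAMPLE = """\
small ext cluster
tcp inuse 7 0 0
claims 536 0 0
releases 529 0 0
udp rx inuse 0 0 0
claims 55516 29779 29779
releases 55516 29779 29779
internet tx inuse 588 568 568
claims 24274 866 866
releases 23686 298 298
unknown data inuse 158 128 128
claims 81621 25779 25779
releases 81463 25651 25651
"""

ROW_KINDS = ("inuse", "claims", "releases")


def parse_mbuftrace(text):
    """Map owner name -> {row kind: (small, ext, cluster)}."""
    owners = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        # Skip header lines such as "small ext cluster" and anything malformed.
        if len(fields) < 4 or not all(f.isdigit() for f in fields[-3:]):
            continue
        kind = fields[-4]
        if kind not in ROW_KINDS:
            continue
        counts = tuple(int(f) for f in fields[-3:])
        if kind == "inuse":
            # The owner name is everything before the "inuse" keyword.
            current = " ".join(fields[:-4])
            owners[current] = {}
        if current is not None:
            owners[current][kind] = counts
    return owners


def outstanding(owners):
    """Return (owner, claims - releases) pairs with a non-zero gap, largest first."""
    result = []
    for name, rows in owners.items():
        if "claims" not in rows or "releases" not in rows:
            continue
        gap = tuple(c - r for c, r in zip(rows["claims"], rows["releases"]))
        if any(gap):
            result.append((name, gap))
    return sorted(result, key=lambda item: item[1][0], reverse=True)


owners = parse_mbuftrace(SAMPLE)
for name, gap in outstanding(owners):
    print(name, gap)
```

On the sample rows this ranks "internet tx" first, with 588 small mbufs, 568 ext and 568 clusters claimed but not yet released, ahead of "unknown data" and "tcp". Running the same check on snapshots taken some time apart would show which owner keeps growing.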
A crashed kernel had the following statistics:
netstat -M netbsd.0.core -mssv
3212 mbufs in use:
1089 mbufs allocated to data
2105 mbufs allocated to packet headers
18 mbufs allocated to socket names and addresses
1000/1000 mapped pages in use
3096 Kbytes allocated to network (99% in use)
0 requests for memory denied
0 requests for memory delayed
18128 calls to protocol drain routines
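Setting the summary lines of the running system next to those of the crashed kernel shows where the growth is. The snippet below is a small illustrative Python sketch; parse_summary and growth are hypothetical helper names, and the two snapshots come from different moments (a live system and a crash dump), so the differences are only indicative.

```python
import re

# Summary lines copied from the running system's output in this report.
LIVE = """\
789 mbufs in use:
747 mbufs allocated to data
40 mbufs allocated to packet headers
2 mbufs allocated to socket names and addresses
4 calls to protocol drain routines
"""

# Summary lines copied from the crashed kernel's output in this report.
CRASHED = """\
3212 mbufs in use:
1089 mbufs allocated to data
2105 mbufs allocated to packet headers
18 mbufs allocated to socket names and addresses
18128 calls to protocol drain routines
"""

# "<count> <description>" with an optional trailing colon.
LINE_RE = re.compile(r"^\s*(\d+)\s+(.+?):?\s*$")


def parse_summary(text):
    """Map each description to its leading integer count."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def growth(before, after):
    """Return after - before for every counter present in both snapshots."""
    return {key: after[key] - before[key] for key in before if key in after}


delta = growth(parse_summary(LIVE), parse_summary(CRASHED))
for key, value in sorted(delta.items(), key=lambda item: item[1], reverse=True):
    print(f"{value:+7d}  {key}")
```

With these figures, most of the extra mbufs in the crashed kernel are packet headers (2105 against 40), and the number of protocol drain calls has risen from 4 to 18128.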
A crashed kernel had the following netstat -s output:
ip:
135288 total packets received
0 bad header checksums
0 with size smaller than minimum
0 with data size < data length
0 with length > max ip packet size
0 with header length < data size
0 with data length < header length
0 with bad options
0 with incorrect version number
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped (out of ipqent)
0 malformed fragments dropped
0 fragments dropped after timeout
0 packets reassembled ok
135288 packets for this host
0 packets for unknown/unsupported protocol
0 packets forwarded (0 packets fast forwarded)
0 packets not forwardable
0 redirects sent
0 packets no matching gif found
150331 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented
0 fragments created
0 datagrams that can't be fragmented
0 datagrams with bad address in header
icmp:
2 calls to icmp_error
0 errors not generated because old message was icmp
Output histogram:
echo reply: 230
destination unreachable: 2
1 message with bad code fields
0 messages < minimum length
0 bad checksums
4 messages with bad length
Input histogram:
destination unreachable: 322
echo: 230
router advertisement: 12
time exceeded: 3
230 message responses generated
0 path MTU changes
igmp:
699 messages received
0 messages received with too few bytes
0 messages received with bad checksum
51 membership queries received
0 membership queries received with invalid field(s)
0 membership reports received
0 membership reports received with invalid field(s)
0 membership reports received for groups to which we belong
0 membership reports sent
tcp:
76880 packets sent
19412 data packets (906819 bytes)
33 data packets (885 bytes) retransmitted
34493 ack-only packets (26975 delayed)
0 URG only packets
0 window probe packets
2 window update packets
22949 control packets
0 send attempts resulted in self-quench
72967 packets received
42296 acks (for 929613 bytes)
7638 duplicate acks
0 acks for unsent data
26980 packets (896367 bytes) received in-sequence
0 completely duplicate packets (0 bytes)
0 old duplicate packets
0 packets with some dup. data (0 bytes duped)
7638 out-of-order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
0 window update packets
0 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
7648 connection requests
7650 connection accepts
15293 connections established (including accepts)
13238 connections closed (including 2 drops)
3 embryonic connections dropped
0 delayed frees of tcpcb
42288 segments updated rtt (of 42294 attempts)
46 retransmit timeouts
2 connections dropped by rexmit timeout
0 persist timeouts (resulting in 0 dropped connections)
5 keepalive timeouts
0 keepalive probes sent
3 connections dropped by keepalive
75 correct ACK header predictions
7727 correct data packet header predictions
15300 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
7 PMTUD blackholes detected
0 bad connection attempts
7650 SYN cache entries added
0 hash collisions
7650 completed
0 aborted (no space to build PCB)
0 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
0 dropped due to RST
0 dropped due to ICMP unreachable
0 delayed free of SYN cache entries
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
0 packets with bad signature
0 packets with good signature
udp:
61055 datagrams received
0 with incomplete header
0 with bad data length field
0 with bad checksum
2 dropped due to no socket
4555 broadcast/multicast datagrams dropped due to no socket
0 dropped due to full socket buffers
56498 delivered
36743 PCB hash misses
73209 datagrams output
ip6:
2 total packets received
0 with size smaller than minimum
0 with data size < data length
0 with bad options
0 with incorrect version number
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 fragments that exceeded limit
0 packets reassembled ok
0 packets for this host
0 packets forwarded
0 packets not forwardable
0 redirects sent
38 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented
0 fragments created
0 datagrams that can't be fragmented
0 packets that violated scope rules
0 multicast packets which we don't join
Input packet histogram:
hop by hop: 2
Mbuf statistics:
0 one mbufs
2 one ext mbufs
0 two or more ext mbufs
0 packets whose headers are not continuous
0 tunneling packets that can't find gif
0 packets discarded due to too many headers
0 failures of source address selection
source addresses on an outgoing I/F
3 link-locals
source addresses of same scope
3 link-locals
0 forward cache hit
0 forward cache miss
icmp6:
0 calls to icmp6_error
0 errors not generated because old message was icmp6 or so
0 errors not generated because of rate limitation
Output packet histogram:
multicast listener report: 34
router solicitation: 3
neighbor solicitation: 1
0 messages with bad code fields
0 messages < minimum length
0 bad checksums
0 messages with bad length
Input packet histogram:
multicast listener report: 2
Histogram of error messages to be generated:
0 no route
0 administratively prohibited
0 beyond scope
0 address unreachable
0 port unreachable
0 packet too big
0 time exceed transit
0 time exceed reassembly
0 erroneous header field
0 unrecognized next header
0 unrecognized option
0 redirect
0 unknown
0 message responses generated
0 messages with too many ND options
0 messages with bad ND options
0 bad neighbor solicitation messages
0 bad neighbor advertisement messages
0 bad router solicitation messages
0 bad router advertisement messages
0 bad redirect messages
0 path MTU changes
arp:
244 packets sent
61 reply packets
183 request packets
951 packets received
26 reply packets
925 valid request packets
916 broadcast/multicast packets
0 packets with unknown protocol type
0 packets with bad (short) length
0 packets with null target IP address
0 packets with null source IP address
0 could not be mapped to an interface
0 packets sourced from a local hardware address
0 packets with a broadcast source hardware address
0 duplicates for a local IP address
0 attempts to overwrite a static entry
0 packets received on wrong interface
0 entrys overwritten
0 changes in hardware address length
200 packets deferred pending ARP resolution
9 sent
186 dropped
0 failures to allocate llinfo
A core and netbsd.gdb can be made available on request.
>How-To-Repeat:
Run 3.0 with some combination of the following factors: an elinkxl
interface, exposure to the Internet, and a connection to a managed
switch. I don't really know which of these matter.
>Fix:
Find the reason why packets remain in the "internet tx" mbuf allocations.
>Unformatted: