Subject: kern/11201: Missing ICMP redirects
To: None <gnats-bugs@gnats.netbsd.org>
From: Wolfgang Solfrank <ws@tools.de>
List: netbsd-bugs
Date: 10/12/2000 10:00:19
>Number: 11201
>Category: kern
>Synopsis: Missing ICMP redirects
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 12 10:00:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Wolfgang Solfrank
>Release: NetBSD-1.3.2
>Organization:
NetBSD Hackers
>Environment:
System: NetBSD adsl 1.3.2 (adsl) #33: Thu Oct 12 16:21:30 MEST 2000 ws@adsl:/files/NetBSD/src/sys/arch/i386/compile/adsl i386
Architecture: i386
>Description:
On little-endian machines, the networking code doesn't always
send ICMP redirects.
The sequence of events is as follows:
The IP packet is pulled off the input queue (in ipintr).
Immediately after this, the fields in the IP header are converted
to host byte order.
After determining that the packet needs to be forwarded, it is
handed to ip_forward, which quite early makes a copy of the start
of the packet, which will eventually be used for an ICMP message.
Now this copy process handles mbufs with M_EXT set by just pointing
to the same external storage as the original mbuf (this is the
culprit of the problem, see below).
After that there is a check for whether to send an ICMP redirect,
the outcome of which is remembered until later (i.e. the ICMP
message isn't sent right here).
Then the packet is handed to ip_output, which (among other things)
converts the fields in the IP header back to network byte order.
It's only after the return from ip_output that eventually the
function icmp_error gets called in order to actually send the
ICMP redirect determined above.
Now remember that if the original packet was in a cluster (or
otherwise external storage), the storage for the packet data
handed to icmp_error is the same as that handed to ip_output
previously. Since ip_output swapped the bytes in the IP header,
the header in this case isn't in the byte order which icmp_error
expects!
icmp_error tries to avoid sending messages in response to packets
other than the first fragment of a larger sequence. Therefore it
checks the offset field in the IP header, of course masking off the
"don't fragment" and "more fragments" bits. However, since the
field is in the wrong byteorder, it masks off the wrong bits.
Especially if the original packet had the "don't fragment" bit
set (which is quite common nowadays), the code in icmp_error
incorrectly determines that the packet isn't the first fragment
and happily discards it.
Note that I actually analyzed this on a 1.3.2 box (don't ask)
but it seems that the problem is still there with 1.5H. Can
anyone please confirm or deny that the problem still persists
with -current?
>How-To-Repeat:
Install a little-endian NetBSD box as a router on a net which also
has other routers, cause a host which uses PMTU discovery to send
packets to the NetBSD box and watch the traffic on the wire. Be
surprised that you don't see any ICMP redirects.
>Fix:
The fix is to avoid pointing to the same storage when doing the
copying of the packet. Since the amount of data to copy for the
ICMP packet is small, there is no need to have external storage
for the copy. So it should be ok to just copy the part of the
data to be copied into the mbuf instead of referencing the
external storage (note that the diff below is against -current
as of 10/12/00):
*** uipc_mbuf.c.old Thu Oct 12 18:39:23 2000
--- uipc_mbuf.c Thu Oct 12 18:54:20 2000
***************
*** 424,429 ****
--- 424,430 ----
{
struct mbuf *n, **np;
int off = off0;
+ int blen;
struct mbuf *top;
int copyhdr = 0;
***************
*** 451,456 ****
--- 452,458 ----
*np = n;
if (n == 0)
goto nospace;
+ blen = MLEN;
if (copyhdr) {
M_COPY_PKTHDR(n, m);
if (len == M_COPYALL)
***************
*** 458,466 ****
else
n->m_pkthdr.len = len;
copyhdr = 0;
}
n->m_len = min(len, m->m_len - off);
! if (m->m_flags & M_EXT) {
if (!deep) {
n->m_data = m->m_data + off;
n->m_ext = m->m_ext;
--- 460,473 ----
else
n->m_pkthdr.len = len;
copyhdr = 0;
+ blen = MHLEN;
}
n->m_len = min(len, m->m_len - off);
! if (n->m_len > blen) {
! #ifdef DIAGNOSTIC
! if (!(m->m_flags & M_EXT)) {
! panic("m_copym: m->m_len too large for !M_EXT");
! #endif
if (!deep) {
n->m_data = m->m_data + off;
n->m_ext = m->m_ext;
BTW, shouldn't the m_copym and m_dup functions be changed to
macros?
>Release-Note:
>Audit-Trail:
>Unformatted: