Subject: Re: bridge(4) and silent data corruption :-(
To: None <dennis@juniper.net, smd@ab.use.net>
From: Sean Doran <smd@ab.use.net>
List: tech-net
Date: 05/02/2002 12:37:56
Dennis -
Ok, the theory that there are bad frames being dropped onto the
network by the Efficient Networks router is testable enough; it
would be manifesting itself as "netstat -i" Ierrs counts on the
hosts not behind the bridge, right?
I do not see this.
Name  Mtu   Network          Address            Ipkts     Ierrs  Opkts     Oerrs  Coll
en0   1500  <Link>           xx.xx.xx.xx.xx.xx  11004274  0      34729980  0      0
en0   1500  xxx.xxx.xxx.xxx  xxx.xxx.xxx.xxx    11004274  0      34729980  0      0
...
en0   1500  <Link>           xx.xx.xx.xx.xx.xx  11011020  0      34754782  0      0
en0   1500  xxx.xxx.xxx.xxx  xxx.xxx.xxx.xxx    11011020  0      34754782  0      0
This is almost exclusively traffic which has transited the router.
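The comparison between the two samples above can be sketched roughly
as follows. This is only an illustration in Python, assuming the last
five fields of each row are Ipkts, Ierrs, Opkts, Oerrs and Coll as in
the header shown; parse_netstat_line and counter_delta are made-up
helper names, not part of netstat or any other tool.

```python
# Sketch: difference two "netstat -i" rows to see whether input errors grew.
# Assumption: the last five whitespace-separated fields are the counters
# Ipkts, Ierrs, Opkts, Oerrs, Coll (as in the header quoted in this mail).

COUNTERS = ("ipkts", "ierrs", "opkts", "oerrs", "coll")


def parse_netstat_line(line):
    """Return the five trailing counters of one row as a dict of ints."""
    fields = line.split()
    values = [int(f) for f in fields[-5:]]
    return dict(zip(COUNTERS, values))


def counter_delta(before, after):
    """Per-counter difference between two rows for the same interface."""
    b = parse_netstat_line(before)
    a = parse_netstat_line(after)
    return {name: a[name] - b[name] for name in COUNTERS}


# The two <Link> rows from the samples in this mail.
before = "en0 1500 <Link> xx.xx.xx.xx.xx.xx 11004274 0 34729980 0 0"
after = "en0 1500 <Link> xx.xx.xx.xx.xx.xx 11011020 0 34754782 0 0"

delta = counter_delta(before, after)
print(delta)
```

If the bad-frame theory held, the ierrs delta would be expected to be
non-zero while ipkts climbs.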
| The reason this is a problem through the bridge is that putting the
| interfaces on the NetBSD box in promiscuous mode causes packet CRC errors
| to be ignored (I have no idea if this is so, but I do know that allowing
| tcpdump and other listeners to receive errored packets is useful and this
| was the only application promiscuous mode was used for prior to the bridge
| code).
Well, this is a good thing to check up on, even if it's not
what's causing the symptoms.
(I always thought promiscuous mode was more interesting for getting
at packets not addressed to my MAC address or a multicast group
I'm listening to, with bad packets being a side-effect. No?)
| I think there are other ways to end up with the same problem too, this is
| just the first one that came to mind.
Well, you're doing better than me :-)
Incidentally,
| Cheap DSL routers can have this property.
Efficient Networks routers seem highly rated.
I am about to liberate a small Cisco router from Denmark for
comparison purposes on Sunday night, assuming I can get it to
talk to BT OpenWorld.
| so if there is no NAT box
Nope.
| this has the makings of being a very interesting failure if
| you can figure out what it is.
Interesting failures are the ones which remind us we're alive. :-)
| Note that you'll want to be careful with this arrangement since the failures
| you get running with ssh will turn into undetected data corruption without
| ssh.
Yes, it was making a big mess with an application which grabs
a 9MB chunk of data at a time, then md4s it to see if it
arrived OK or needs to be re-requested. Most were not OK.
Many many packets wasted. Blah.
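For illustration, the fetch-then-verify loop described above might look
something like this. It is a sketch, not the actual application:
SHA-256 stands in for MD4 (which Python's hashlib does not always
provide), and fetch_verified, chunk_ok and flaky_fetch are hypothetical
names invented for the example.

```python
import hashlib

# Sketch: fetch a chunk, hash it, and re-request on a digest mismatch.
# Assumptions: SHA-256 replaces MD4, and the expected digest is known
# from some trusted source before the transfer starts.


def chunk_ok(data, expected_hex):
    """True if the chunk's digest matches the expected hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex


def fetch_verified(fetch, expected_hex, max_tries=5):
    """Call fetch() until a chunk verifies; return (data, tries used)."""
    for attempt in range(1, max_tries + 1):
        data = fetch()
        if chunk_ok(data, expected_hex):
            return data, attempt
    raise IOError("chunk still corrupt after %d tries" % max_tries)


# Simulated transfer: the first two attempts arrive with one bit flipped.
good = bytes(range(256)) * 16
expected = hashlib.sha256(good).hexdigest()
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        bad = bytearray(good)
        bad[100] ^= 0x01  # single-bit corruption somewhere in the chunk
        return bytes(bad)
    return good


data, tries = fetch_verified(flaky_fetch, expected)
print(tries)
```

Every failed attempt costs a full chunk of retransmission, which is
where the wasted packets come from.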
Sean.