Subject: bin/11586: ping looks to have a slight problem...
To: None <gnats-bugs@gnats.netbsd.org>
From: None <kre@munnari.OZ.AU>
List: netbsd-bugs
Date: 11/28/2000 21:42:29
>Number:         11586
>Category:       bin
>Synopsis:       The RTT reported by ping can become absurd...
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 28 21:41:01 PST 2000
>Closed-Date:
>Last-Modified:
>Originator:     Robert Elz
>Release:        <NetBSD-current source date>
>Organization:
	University of Melbourne
>Environment:
	
System: NetBSD brandenburg.cs.mu.OZ.AU 1.5_BETA NetBSD 1.5_BETA (GATEWAY) #2: Thu Oct 26 05:55:17 EST 2000 root@:/usr/src/sys/arch/i386/compile/GATEWAY i386


>Description:
	I just ran ping, here are the final lines...

64 bytes from 128.250.1.21: icmp_seq=1791 ttl=230 time=287853.054 ms
64 bytes from 128.250.1.21: icmp_seq=1789 ttl=230 time=289893.928 ms
64 bytes from 128.250.1.21: icmp_seq=1790 ttl=230 time=288918.238 ms
64 bytes from 128.250.1.21: icmp_seq=1792 ttl=230 time=286945.082 ms

----128.250.1.21 PING Statistics----
1793 packets transmitted, 1463 packets received, 18.4% packet loss
round-trip min/avg/max/stddev = 1575.600/127686.745/289893.928/78303.768 ms

	To me, that is just a fraction unbelievable....

	The command used was:	ping -n 128.250.1.21
	so there are no DNS lookups involved here making
	a mockery of pin's timimg attempts.

	Other information ...

	I killed the ping at this point ...

64 bytes from 128.250.1.21: icmp_seq=1570 ttl=230 time=218068.221 ms
64 bytes from 128.250.1.21: icmp_seq=1571 ttl=230 time=218615.042 ms
^C64 bytes from 128.250.1.21: icmp_seq=1573 ttl=230 time=219745.574 ms
64 bytes from 128.250.1.21: icmp_seq=1576 ttl=230 time=220962.581 ms
64 bytes from 128.250.1.21: icmp_seq=1579 ttl=230 time=222115.314 ms

You can see the echo of the ^C in there.   Then ping's idea of the RTT
was 220 secs approx - note it waited it it had received another 220
(approx) packets before giving up.   After it stopped printing the
"64 bytes from ..." lines, it waited another good long time (I don't
know if it was another 220 seconds, didn't seem like quite that long
though) before it exited (the "wait for lost packets to arrive late"
pause).

Note the "18.4% packet loss" is "normal", that isn't the issue, just
the RTT measurement.

While the ping was winding down (after the ^C, before it stopped
receiving its packets) I ran a traceroute (with -n of course) ...

traceroute to 128.250.1.21 (128.250.1.21), 30 hops max, 40 byte packets
 1  203.154.130.1  0.971 ms  0.739 ms  0.951 ms
 2  192.100.77.18  4.087 ms  2.605 ms  2.177 ms
 3  202.28.18.169  6.780 ms  6.828 ms  6.762 ms
 4  202.28.18.161  37.660 ms  26.747 ms  39.690 ms
 5  64.86.84.29  770.387 ms  713.910 ms  732.145 ms
 6  207.45.222.109  1162.401 ms  1184.250 ms  1133.021 ms
 7  207.45.222.85  1186.203 ms  1217.410 ms  1207.716 ms
 8  207.45.223.73  1213.429 ms  1239.606 ms  1235.587 ms
 9  207.45.222.230  1283.631 ms  1219.985 ms  1241.965 ms
10  207.45.220.101  1250.582 ms  1222.222 ms  1254.336 ms
11  * 192.205.32.89  1248.159 ms  1254.274 ms
12  12.123.28.114  1256.894 ms *  1187.120 ms
13  12.122.1.173  1251.384 ms *  1270.861 ms
14  * 12.122.2.69  1286.020 ms *
15  12.122.5.249  1238.643 ms  1249.060 ms *
16  12.127.10.250  1302.424 ms  1265.441 ms  1248.751 ms
17  * 205.174.74.166  1326.538 ms  1297.780 ms
18  192.65.89.65  1579.912 ms  1638.274 ms  1665.124 ms
19  202.139.190.19  1632.485 ms  1618.745 ms  1605.048 ms
20  192.65.88.194  1616.254 ms *  1582.445 ms
21  203.21.130.39  1680.743 ms  1687.836 ms  1655.552 ms
22  128.250.3.16  1633.615 ms  1595.066 ms  1631.272 ms
23  * 128.250.29.2  1702.265 ms  1612.554 ms
24  128.250.1.21  1661.181 ms *  1676.563 ms

You can see the packet loss there, and the "normal" RTT (which is
what ping normally returns as well).

Also, I was typing over the same link, with more or less normal
echo, response, etc.

I wasn't watching all the time to see where the RTT climb began, or
where, as far as the scrollback in that window goes is ...

64 bytes from 128.250.1.21: icmp_seq=518 ttl=230 time=48775.763 ms
64 bytes from 128.250.1.21: icmp_seq=519 ttl=230 time=47776.338 ms
64 bytes from 128.250.1.21: icmp_seq=520 ttl=230 time=47688.205 ms
64 bytes from 128.250.1.21: icmp_seq=522 ttl=230 time=45701.826 ms
64 bytes from 128.250.1.21: icmp_seq=525 ttl=230 time=42704.943 ms
64 bytes from 128.250.1.21: icmp_seq=524 ttl=230 time=43767.530 ms
64 bytes from 128.250.1.21: icmp_seq=526 ttl=230 time=42707.727 ms
64 bytes from 128.250.1.21: icmp_seq=527 ttl=230 time=42230.617 ms
64 bytes from 128.250.1.21: icmp_seq=529 ttl=230 time=43624.564 ms
64 bytes from 128.250.1.21: icmp_seq=530 ttl=230 time=44061.704 ms
64 bytes from 128.250.1.21: icmp_seq=531 ttl=230 time=44649.148 ms

Most of the time packets seemed (when I was watching) to be appearing
every second, but every now and again a short burst would appear
hard against each other (the first three packets of that last output
might be one such case).

Note also the packet re-ordering in that last sample (525 before 524).
I don't know how much of that there was (maybe ping could report how
many packets received out of order?)


>How-To-Repeat:
	Not a clue.   Most times ping seems to work for me...
	Just perhaps receiving packets out of order is a trigger??

>Fix:
	Even less of a clue... (though I suspect identifying the
	cause, and hence the :how to repeat" will provide the fix).
>Release-Note:
>Audit-Trail:
>Unformatted: