Subject: Why do some network connections get stuck forever?
To: NetBSD tech-net list <tech-net@NetBSD.org>
From: Greg A. Woods <woods@weird.com>
List: tech-net
Date: 10/24/2005 14:56:32
Why do some network connections get stuck forever? Every time something
"bad" happens to the network for an extended period of time (e.g. this
time I think the switch was rebooted and did not come back up for quite
a few minutes), some, and sometimes many, existing connections (always
inbound ones) get stuck forever, until the server processes that hold
them open are killed.
I've seen this happen with every kind of TCP service, including
standard NetBSD services such as ftpd, telnetd, etc. In this example
it's a pop3d process from Cyrus IMAPd.
This problem has been present on servers I've been responsible for since
way back in the NetBSD-1.3 days, on every kind of platform I've run.
Now, with 1.6.x, killing the processes is at least no longer likely to
make the whole system panic soon afterwards (indeed the processes do die
and the sockets do seem to get closed properly, though perhaps there are
still some small memory leaks).
This particular process has been hung for over a month, and as you'll
see below the client that started it isn't even on the net at the moment.
14:04 [10] $ ps -up 18287
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
cyrus 18287 0.0 -305.9 3688 9112 ?? I 19Sep05 0:03.66 pop3d: pop3d: xtreme-12-108.dyn.aci.on.ca [69.17.171.108]
14:04 [11] $ ps -lp 18287
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
120 18287 235 0 2 0 3688 9112 netio I ?? 0:03.66 pop3d: pop3d: xtreme-12-108.dyn.aci.on.ca [69.17.171.108]
14:04 [12] $ fstat -p 18287
USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
cyrus pop3d 18287 wd /var/spool/imap 30402863 drwx------ 512 r
cyrus pop3d 18287 0* internet stream tcp fffffc0118f36498 205.207.148.251:995 <-> 69.17.171.108:2068
cyrus pop3d 18287 1* internet stream tcp fffffc0118f36498 205.207.148.251:995 <-> 69.17.171.108:2068
cyrus pop3d 18287 2* internet stream tcp fffffc0118f36498 205.207.148.251:995 <-> 69.17.171.108:2068
cyrus pop3d 18287 3* pipe 0xfffffc0086f7cc08 -> 0xfffffc0086f7caf0 w
cyrus pop3d 18287 4* internet stream tcp fffffc00a8b94018 *:995
cyrus pop3d 18287 5* unix dgram fffffe00007efa80 <-> fffffe0000a0ae80
cyrus pop3d 18287 6 /var 1200842 -rw------- 1018832 rw
cyrus pop3d 18287 7 /var 1200789 -rw------- 1711532 rw
cyrus pop3d 18287 8* unix dgram fffffe00006fd500
cyrus pop3d 18287 9 /var 1200802 -rw------- 0 rw
cyrus pop3d 18287 10 /var 1200834 -rw------- 44 rw
cyrus pop3d 18287 11 /var 1200778 -rw------- 7946240 rw
cyrus pop3d 18287 12 /var 1200803 -rw------- 25296896 rw
14:04 [13] $ netstat -nA |fgrep fffffc0118f36498
fffffc0118f36498 tcp 0 0 205.207.148.251.995 69.17.171.108.2068 ESTABL
14:04 [13] $ /sbin/ping 69.17.171.108
PING xtreme-12-108.dyn.aci.on.ca (69.17.171.108): 48 data bytes
92 bytes from 3500-1.aci.on.ca (205.207.148.6): Destination Host Unreachable for icmp_seq=0
92 bytes from 3500-1.aci.on.ca (205.207.148.6): Destination Host Unreachable for icmp_seq=1
92 bytes from 3500-1.aci.on.ca (205.207.148.6): Destination Host Unreachable for icmp_seq=2
^?
----xtreme-12-108.dyn.aci.on.ca PING Statistics----
5 packets transmitted, 0 packets received, 100.0% packet loss
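The manual steps above (take the socket's PCB address from fstat, then look it up in netstat -nA) could be scripted when many processes are affected. What follows is only a rough sketch in Python, assuming the fstat and netstat output looks exactly like the samples shown here; the function names are made up for illustration, and the column layout may differ between releases.

```python
def tcp_pcbs_from_fstat(fstat_text):
    """Collect PCB addresses of connected TCP sockets from `fstat -p PID` output.

    Assumption: connected sockets appear as
    '... internet stream tcp <pcb> <local> <-> <remote>', as in the sample.
    Listening sockets (no '<->') are skipped.
    """
    pcbs = set()
    marker = "internet stream tcp"
    for line in fstat_text.splitlines():
        if marker in line and "<->" in line:
            rest = line.split(marker, 1)[1].split()
            if rest:
                pcbs.add(rest[0])
    return pcbs


def states_from_netstat(netstat_text, pcbs):
    """Map each known PCB address to (recv_q, send_q, local, foreign, state).

    Assumption: `netstat -nA` lines have the form
    '<pcb> tcp <recv-q> <send-q> <local> <foreign> <state>'.
    """
    found = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 7 and fields[0] in pcbs and fields[1].startswith("tcp"):
            found[fields[0]] = (
                int(fields[2]),
                int(fields[3]),
                fields[4],
                fields[5],
                fields[6],
            )
    return found


def idle_established(states):
    """Return PCBs that look established with nothing queued in either direction."""
    return sorted(
        pcb
        for pcb, (recv_q, send_q, _local, _foreign, state) in states.items()
        if state.startswith("ESTABL") and recv_q == 0 and send_q == 0
    )
```

Fed the fstat and netstat output captured above, this would report the single PCB fffffc0118f36498 as established with empty queues. It only automates the lookup; it says nothing about why the connection was never torn down.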
--
Greg A. Woods
H:+1 416 218-0098 W:+1 416 489-5852 x122 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>