network interface transmit queue overflow handling

To: tech-net%netbsd.org@localhost
Subject: network interface transmit queue overflow handling
From: Erik Fair <fair%netbsd.org@localhost>
Date: Sat, 7 May 2022 10:59:56 -0700

> On Feb 11, 2022, at 03:23, Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:
> 
> On Fri, 11 Feb 2022, Michael van Elst wrote:
>> netbsd%precedence.co.uk@localhost (Stephen Borrill) writes:
>> 
>>> As an aside, I have seen similar messages when trying to do a UDP
>>> bandwidth test with netio.
>> 
>> That's regular BSD behaviour. When you send packets faster than
>> you can emit on the wire, buffers of any size run full and you
>> get ENOBUFS.
>> 
>> On other systems, such packets may be silently dropped (they could
>> be dropped silently on the network path anyway).
>> 
>> The correct way to handle this is to rate limit UDP packets in the
>> application or even to implement some kind of flow control in the
>> application.
>> 
>> Squid implements 'delay pools' for rate limiting, I'm not sure
>> if that also applies to the UDP traffic between caches.
>> 
>> https://wiki.squid-cache.org/Features/DelayPools
>> 
>> Another way might be to move away from ICP (UDP based) to other
>> cache protocols.
>> 
>> Squid can also do logging via UDP, the configuration there seems
>> to have its own "buffer-size" option.
> 
> This isn't to do with neighbour caches, the messages suggest it is DNS:
> 
> comm_udp_sendto FD 22, (family=2) 127.0.0.1:53: (55) No buffer space available
> idnsSendQuery FD 22: sendto: (55) No buffer space available
> 
> I think this is consistent with the symptoms I had reported to me (in a rather vague way) as I could not get an established connection (e.g. playing a YouTube video) to go wrong, but users were reporting "intermittent internet".
> 
> -- 
> Stephen

Frankly, the NetBSD kernel does two things wrong when a network interface transmit queue is full:

1. It returns ENOBUFS even though the system has not actually run out of mbufs; we should have a separate EQFULL error for this, to make the nature of the error perfectly clear.

2. Applications have very little way to interrogate network congestion state and respond to it properly: ENOBUFS certainly doesn’t tell the app how long or how much to back off, which is the key question. Ethernet switches apply flow control to everyone these days anyway. So the default kernel behavior should be to block a process whose transmit would overflow a network interface transmit queue, rather than return an error, unless the application has explicitly asked for asynchronous I/O on that descriptor (which is a declaration that it’s prepared to handle such errors).

Yes, this is a change to the API. I think it’s a better way to handle the common case of a host connected to a modern Ethernet switch, and certainly easier for the programmers of userland software.
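
For comparison, here is roughly what a userland program has to do today when sendto() fails with ENOBUFS: guess a delay and retry. This is only a minimal sketch in C, assuming exponential backoff. The function names, the 1 ms starting delay, the 256 ms cap and the retry limit are all made up for illustration; none of them come from squid or any other real program.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() under strict -std= modes */
#endif

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical tuning constants, chosen only for this example. */
#define BACKOFF_BASE_US   1000L		/* first wait: 1 ms */
#define BACKOFF_MAX_US    256000L	/* cap each wait at 256 ms */
#define BACKOFF_MAX_TRIES 10		/* retries before giving up */

/* Delay before retry number `attempt` (0-based): doubles each time, capped. */
long
backoff_delay_us(unsigned attempt)
{
	long d = BACKOFF_BASE_US;

	while (attempt-- > 0 && d < BACKOFF_MAX_US)
		d *= 2;
	return d < BACKOFF_MAX_US ? d : BACKOFF_MAX_US;
}

/*
 * sendto() wrapper that sleeps and retries while the kernel reports
 * ENOBUFS.  Any other result (success or a different error) is returned
 * immediately.  Once the retry budget is spent, the last sendto() result
 * is returned with errno still set to ENOBUFS.
 */
ssize_t
sendto_backoff(int s, const void *buf, size_t len, int flags,
    const struct sockaddr *to, socklen_t tolen)
{
	for (unsigned attempt = 0; ; attempt++) {
		ssize_t n = sendto(s, buf, len, flags, to, tolen);

		if (n >= 0 || errno != ENOBUFS || attempt >= BACKOFF_MAX_TRIES)
			return n;

		/* The kernel gave no hint about how long to wait, so guess. */
		long us = backoff_delay_us(attempt);
		struct timespec ts = { us / 1000000L, (us % 1000000L) * 1000L };
		nanosleep(&ts, NULL);
	}
}
```

Every number in that sketch is a guess, because the error carries no information about when the queue will drain. A blocking send would make the kernel, which does know, responsible for the wait.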

	Erik

References:
- squid on 9.2: No buffer space available
  - From: Stephen Borrill
- Re: squid on 9.2: No buffer space available
  - From: Michael van Elst
- Re: squid on 9.2: No buffer space available
  - From: Stephen Borrill
