| Can you explain how it was broken and what you did to make it work again?
I turned off logging completely to fix it. Logging ended up taking
up all the I/O cycles because each time logging overflowed syslogd
ended up logging that logging overflowed... This worked just fine
before the changes.
| Which is why we need a better solution than what we have.
| dynamically increasing/decreasing buffer size is a good solution for
| this, which should make everyone happy.
That will never fix the problem; in fact it will make the situation
worse because of bufferbloat, resource consumption on low-resource
systems, and increased latency. As people have explained numerous
times before, this is UDP and you should be prepared to lose packets
(the transport is unreliable). If you want to build a reliable
transport on top of UDP, rerror is not enough; you need to use a
packet sequence number or something similar to detect lost packets.
Yes, it is good to detect lost packets when you can, so rerror is
generally a good thing, and if it had been done on day one it would
probably be fine to keep. It would also be nice to have it on by
default eventually, but right now it makes the situation worse than
before.
| > Nevertheless now everyone can have it the way they like... There is
| > a sysctl to turn it on globally and a per-socket setsockopt to override.
|
| And we want a secure system where a lot of useful programs don't run and
| sweeps overflow issues under the carpet by default? Not me!
Yes, for the programs that want this behavior. Let us not forget that
this started because of the aberrant behavior of the routing socket,
where because of the compatibility messages we ended up overflowing
and losing messages. Instead of fixing the root cause (don't send
compat stuff to the programs that don't need it -- programs understand
only one version of the messages and throw away the rest), we decided
to detect the dropped packet problem by introducing so_rerror. This
detection could also have been done by using a sequence number, or
a similar ID-based protocol.
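
To make the sequence-number idea concrete, here is a rough sketch of
receiver-side gap detection. It is only an illustration, not code from
the tree: it assumes the sender stamps every message with an integer
that increases by one and never wraps, and the name find_gaps is made
up for this example.

```python
def find_gaps(seqs):
    """Return (first_missing, last_missing) ranges seen by a receiver.

    Assumption: the sender numbers messages 1, 2, 3, ... with no
    wraparound. Duplicates and late (reordered) messages are ignored.
    """
    gaps = []
    expected = None  # next sequence number we expect; unknown at start
    for seq in seqs:
        if expected is not None and seq > expected:
            # Everything from `expected` up to `seq - 1` never arrived.
            gaps.append((expected, seq - 1))
        if expected is None or seq >= expected:
            expected = seq + 1
    return gaps


if __name__ == "__main__":
    # Messages 3-4 and 7-8 were dropped somewhere along the way.
    print(find_gaps([1, 2, 5, 6, 9]))
```

With something like this the receiver learns exactly which messages it
missed and can resync just that state, which an overflow error on the
socket alone does not tell it.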