NetBSD-Bugs archive
kern/59053: bizarre interaction between poll and sendmsg with cmsg fd passing
>Number: 59053
>Category: kern
>Synopsis: bizarre interaction between poll and sendmsg with cmsg fd passing
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Feb 07 16:05:00 +0000 2025
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The NetCMSG Pollination
>Environment:
>Description:
How many file descriptors can you send over a socket with sendmsg(2) and cmsg(3)? What happens when you have sent too many? How do you know when you can send more?
sendmsg(2) will fail with EAGAIN/EWOULDBLOCK if too many file descriptors are in flight systemwide (more than maxfiles / unp_rights_ratio):
nfds = (cm->cmsg_len - CMSG_ALIGN(sizeof(*cm))) / sizeof(int);
fdp = (int *)CMSG_DATA(cm);
maxmsg = maxfiles / unp_rights_ratio;
for (i = 0; i < nfds; i++) {
	fd = *fdp++;
	if (atomic_inc_uint_nv(&unp_rights) > maxmsg) {
		atomic_dec_uint(&unp_rights);
		nfds = i;
		error = SET_ERROR(EAGAIN);
		goto out;
https://nxr.netbsd.org/xref/src/sys/kern/uipc_usrreq.c?r=1.207#1570
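To make the accounting concrete, here is a minimal userland model of the check quoted above. It is a sketch, not the kernel code: model_rights and model_reserve_rights are hypothetical names, C11 atomics stand in for atomic_inc_uint_nv/atomic_dec_uint, and rolling back the whole batch on failure is an assumption about the error path, which the excerpt does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userland stand-in for the kernel's global unp_rights. */
static atomic_uint model_rights;

/*
 * Try to reserve in-flight slots for nfds descriptors against the limit
 * maxfiles / ratio.  Returns 0 on success, or EAGAIN after restoring the
 * counter to the value it had before the call (assumed rollback).
 */
static int
model_reserve_rights(unsigned nfds, unsigned maxfiles, unsigned ratio)
{
	unsigned maxmsg, i;

	assert(ratio != 0);
	maxmsg = maxfiles / ratio;
	for (i = 0; i < nfds; i++) {
		/* atomic_fetch_add returns the old value; +1 gives the new one. */
		if (atomic_fetch_add(&model_rights, 1) + 1 > maxmsg) {
			/* Undo the failed increment and the i earlier ones. */
			atomic_fetch_sub(&model_rights, i + 1);
			return EAGAIN;
		}
	}
	return 0;
}
```

With maxfiles = 40 and ratio = 10 (arbitrary numbers, not kernel defaults), the fifth reservation fails with EAGAIN no matter which socket it was made for, because the counter is global.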
But poll(2) reports POLLOUT for the socket regardless of whether this limit has been reached, so a subsequent sendmsg(2) can still fail with EAGAIN:
if (events & (POLLOUT | POLLWRNORM))
	if (sowritable(so))
		revents |= events & (POLLOUT | POLLWRNORM);
https://nxr.netbsd.org/xref/src/sys/kern/uipc_socket.c?r=1.313#2434
/* can we write something to so? */
static __inline int
sowritable(const struct socket *so)
{

	KASSERT(solocked(so));

	return (sbspace(&so->so_snd) >= so->so_snd.sb_lowat &&
	    ((so->so_state & SS_ISCONNECTED) != 0 ||
	    (so->so_proto->pr_flags & PR_CONNREQUIRED) == 0)) ||
	    (so->so_state & SS_CANTSENDMORE) != 0 ||
	    so->so_error != 0;
}
https://nxr.netbsd.org/xref/src/sys/sys/socketvar.h?r=1.170#460
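The grouping in that return expression is easy to misread, so here is the same predicate restated as a userland model with the subexpressions named. This is only a sketch: struct model_sock and model_sowritable are hypothetical, and each field stands in for the kernel state named in its comment. Note that nothing in the inputs reflects the count of descriptors in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the state sowritable() consults. */
struct model_sock {
	long space;		/* stands in for sbspace(&so->so_snd) */
	long lowat;		/* stands in for so->so_snd.sb_lowat */
	bool connected;		/* SS_ISCONNECTED is set */
	bool connrequired;	/* PR_CONNREQUIRED is set */
	bool cantsendmore;	/* SS_CANTSENDMORE is set */
	int error;		/* so->so_error */
};

static bool
model_sowritable(const struct model_sock *so)
{
	bool room, usable;

	assert(so != NULL);
	room = so->space >= so->lowat;
	usable = so->connected || !so->connrequired;

	/* Same shape as the kernel expression: (A && B) || C || D. */
	return (room && usable) || so->cantsendmore || so->error != 0;
}
```

A connected socket with buffer space at or above the low-water mark is reported writable here, whatever the state of the descriptor limit.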
This confuses some applications (notably devel/capnproto, async-io-test.c++ AsyncIo/CapabilityPipeBlockedSendStream) into entering an infinite loop because poll says ready for POLLOUT but sendmsg fails with EAGAIN.
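An application can at least avoid spinning. The following is a sketch of a bounded retry with backoff, under assumptions: send_fd_bounded is a hypothetical helper, the try count and delays are arbitrary, and it sends one byte of data alongside the descriptor because some systems do not deliver ancillary data on a stream socket without at least one byte of real data.

```c
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: send fd over the nonblocking AF_LOCAL socket sock,
 * trying at most max_tries times.  Returns 0 on success, -1 with errno set
 * on failure (EAGAIN if every try hit EAGAIN).
 */
static int
send_fd_bounded(int sock, int fd, int max_tries)
{
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct pollfd pfd = { .fd = sock, .events = POLLOUT };
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int try;

	assert(max_tries >= 0);
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf);
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	for (try = 0; try < max_tries; try++) {
		/* Wait up to 100 ms; try sendmsg even if poll reports nothing. */
		if (poll(&pfd, 1, 100) == -1 && errno != EINTR)
			return -1;
		if (sendmsg(sock, &msg, 0) != -1)
			return 0;
		if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
			return -1;
		/* poll may have claimed POLLOUT anyway; sleep so we do not spin. */
		nanosleep(&delay, NULL);
		if (delay.tv_nsec < 100000000)
			delay.tv_nsec *= 2;
	}
	errno = EAGAIN;
	return -1;
}
```

This does not fix anything; it only turns the infinite loop into a bounded wait, after which the caller has to decide what to do with the unsent descriptor.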
>How-To-Repeat:
1. run devel/capnproto tests
2. smaller reduced test case here:
$ cat sendblock.c
#include <sys/socket.h>

#include <err.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	int s[2];
	int nfds = 0;

	if (socketpair(AF_LOCAL, SOCK_STREAM|SOCK_NONBLOCK, 0, s) == -1)
		err(1, "socketpair");

	for (;;) {
		int t[2];
		union {
			struct cmsghdr hdr;
			char buf[CMSG_SPACE(sizeof(int))];
		} cmsgbuf;
		struct cmsghdr *cmsg;
		struct msghdr msg;
		ssize_t len;
		struct pollfd pfd;
		int i, n;

		memset(&pfd, 0, sizeof(pfd));
		pfd.fd = s[0];
		pfd.events = POLLOUT;
		if ((n = poll(&pfd, 1, 0)) == -1)
			err(1, "poll");
		if (n == 0) {
			warnx("poll returned 0");
			pfd.fd = s[0];
			pfd.events = pfd.revents = 0;
			n = 1;
		} else if (n != 1) {
			errx(1, "poll returned %d", n);
		} else {
			warnx("poll returned %d, revents=%d", n, pfd.revents);
		}

		if (socketpair(AF_LOCAL, SOCK_STREAM|SOCK_NONBLOCK, 0, t)
		    == -1)
			err(1, "socketpair");

		msg = (struct msghdr) {
			.msg_name = NULL,
			.msg_namelen = 0,
			.msg_iov = NULL,
			.msg_iovlen = 0,
			.msg_control = cmsgbuf.buf,
			.msg_controllen = sizeof(cmsgbuf),
			.msg_flags = 0,
		};
		cmsg = CMSG_FIRSTHDR(&msg);
		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		*(int *)CMSG_DATA(cmsg) = t[0];

		len = sendmsg(s[0], &msg, 0);
		if (len == -1) {
			if (errno != EAGAIN)
				err(1, "sendmsg");
			if ((pfd.revents & POLLOUT) == 0)
				break;
			err(1, "poll said POLLOUT ready but sendmsg failed");
		}
		if ((pfd.revents & POLLOUT) == 0)
			warnx("sendmsg succeeded despite no POLLOUT");
		nfds++;
		warnx("%d descriptor%s sent", nfds, nfds == 1 ? "" : "s");

		if (close(t[0]) == -1)
			err(1, "close");
		if (close(t[1]) == -1)
			err(1, "close");
	}
	return 0;
}
$ make sendblock
cc -O2 -o sendblock sendblock.c
$ ./sendblock
sendblock: poll returned 1, revents=4
sendblock: 1 descriptor sent
sendblock: poll returned 1, revents=4
sendblock: 2 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 3 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 4 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 5 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 6 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 7 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 8 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 9 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 10 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 11 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 12 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 13 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: 14 descriptors sent
sendblock: poll returned 0
sendblock: sendmsg succeeded despite no POLLOUT
sendblock: 15 descriptors sent
sendblock: poll returned 0
sendblock: sendmsg succeeded despite no POLLOUT
sendblock: 16 descriptors sent
sendblock: poll returned 1, revents=4
sendblock: poll said POLLOUT ready but sendmsg failed: Resource temporarily unavailable
Curiously, for two iterations (reliably, in my tests, dozens of trials), poll _does not_ return POLLOUT, but sendmsg succeeds anyway! But then on the next iteration, poll returns POLLOUT but sendmsg fails with EAGAIN.
>Fix:
Yes, please!
1. It's not clear that this application behaviour is really sensible (it's kind of a pathological case in a test suite), but it's also not clear exactly what part of the behaviour is wrong, so I think we should address it.
2. It's probably not reasonable for a systemwide limit on file descriptors in transit over a socket to make poll fail to return POLLOUT, but (a) there's no way to tell poll that you really just want to send more fds and not more data, and (b) we currently don't have any per-socket accounting of fds in transit. So we might have to create per-socket accounting for this.
3. The weird behaviour of two !POLLOUT returns followed by a POLLOUT return _and_ sendmsg failure needs an explanation.
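For item 2, one possible shape of per-socket accounting is sketched below as a userland model. Everything here is hypothetical: struct model_rights_acct, its fields, and the three functions do not exist in the kernel, and the per-socket limit is an arbitrary parameter. The point is only that a poll-side predicate and the send path could consult the same per-socket state and so agree with each other; it does not address item 2(a).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-socket state; no such fields exist in struct socket. */
struct model_rights_acct {
	unsigned inflight;	/* descriptors sent but not yet received */
	unsigned limit;		/* per-socket cap, chosen arbitrarily */
};

/* Send path: reserve room for nfds descriptors, or fail with EAGAIN. */
static int
model_acct_reserve(struct model_rights_acct *a, unsigned nfds)
{
	assert(a->inflight <= a->limit);
	if (nfds > a->limit - a->inflight)
		return EAGAIN;
	a->inflight += nfds;
	return 0;
}

/* Receive path: the peer took nfds descriptors out of the socket. */
static void
model_acct_release(struct model_rights_acct *a, unsigned nfds)
{
	assert(nfds <= a->inflight);
	a->inflight -= nfds;
}

/* Poll path: true while at least one more descriptor could be sent. */
static bool
model_acct_can_send(const struct model_rights_acct *a)
{
	return a->inflight < a->limit;
}
```

In this model, whenever model_acct_can_send returns true, reserving one descriptor succeeds, which is the property the current combination of poll and sendmsg lacks.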