NetBSD-Bugs archive


kern/59055: recvmsg(2) fails to return partial fds in MSG_CTRUNC case



>Number:         59055
>Category:       kern
>Synopsis:       recvmsg(2) fails to return partial fds in MSG_CTRUNC case
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Feb 07 19:10:00 +0000 2025
>Originator:     Taylor R Campbell
>Release:        current, 10, 9, ...
>Organization:
The NotQuiteCMSG Foundation
>Environment:
>Description:
If a sender sends more file descriptors over a socket through ancillary data with cmsg(3) SCM_RIGHTS than the receiver has wittingly allocated space for, _sometimes_ all of the file descriptors are discarded.

For example, on x86, if the sender sends 3 fds and the receiver has allocated space for 2 fds using CMSG_SPACE(2 * sizeof(int)), all three fds are discarded and MSG_CTRUNC is set.

But if the sender sends _4_ fds and the receiver has allocated space for _3_ fds using CMSG_SPACE(3 * sizeof(int)), then all four fds make it through and MSG_CTRUNC is not set (and the receiver has to handle the fourth fd!).

How does this happen?

Say you're on x86, where socket ancillary buffers are aligned to multiples of 8 bytes and struct cmsghdr, including its trailing padding, occupies 16 bytes.

When sending n file descriptors, the ancillary buffer is laid out like so, in a buffer of size CMSG_SPACE(n * sizeof(int)):

[0..4) cmsg_len
[4..8) cmsg_level
[8..12) cmsg_type
[12..16) (anonymous padding)
[16..20) fd[0]
[20..24) fd[1]
...

If n is odd, four padding bytes are appended to the end.

So if the sender sends n=3 file descriptors, they must lay it out in a 32-byte buffer like so:

[0..4) cmsg_len
[4..8) cmsg_level
[8..12) cmsg_type
[12..16) (anonymous padding)
[16..20) fd[0]
[20..24) fd[1]
[24..28) fd[2]
[28..32) (anonymous padding)

If the receiver has prepared to receive at most n=2 file descriptors, though, they will have a 24-byte buffer that they expect to lay out like so:

[0..4) cmsg_len
[4..8) cmsg_level
[8..12) cmsg_type
[12..16) (anonymous padding)
[16..20) fd[0]
[20..24) fd[1]
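
To make the arithmetic concrete, here is a small sketch of the sizes involved.  It is not taken from the kernel; rights_space and rights_capacity are hypothetical helper names, and the only real interfaces used are the CMSG_SPACE and CMSG_LEN macros from <sys/socket.h>.

```c
#include <sys/socket.h>

#include <stddef.h>

/*
 * Size of the control buffer a caller allocates to carry n
 * descriptors, including header and trailing alignment padding.
 * (Hypothetical helper, for illustration only.)
 */
size_t
rights_space(size_t n)
{

	return CMSG_SPACE(n * sizeof(int));
}

/*
 * Number of whole descriptors that fit after the cmsg header in a
 * control buffer of buflen bytes.  This can exceed what the caller
 * asked for, because CMSG_SPACE rounds the payload up.
 * (Hypothetical helper, for illustration only.)
 */
size_t
rights_capacity(size_t buflen)
{

	if (buflen < CMSG_LEN(0))
		return 0;
	return (buflen - CMSG_LEN(0)) / sizeof(int);
}
```

Under the assumptions above (8-byte alignment, 16-byte header), rights_space(2) is 24 while rights_space(3) and rights_space(4) are both 32, so a receiver that asked for room for 3 descriptors has actually made room for 4.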

The kernel could simply close fd[2] and let fd[0] and fd[1] pass.  But it doesn't.  When the kernel finds there isn't enough space in the recvmsg msg_control buffer, it chucks everything:

    857 	for (m = control; m != NULL; ) {
    858 		cmsg = mtod(m, struct cmsghdr *);
    859 		i = m->m_len;
    860 		if (len < i) {
    861 			mp->msg_flags |= MSG_CTRUNC;
    862 			if (cmsg->cmsg_level == SOL_SOCKET
    863 			    && cmsg->cmsg_type == SCM_RIGHTS)
    864 				/* Do not truncate me ... */
    865 				break;
    866 			i = len;
    867 		}
    868 		error = copyout(mtod(m, void *), q, i);

https://nxr.netbsd.org/xref/src/sys/kern/uipc_syscalls.c?r=1.214#864

This behaviour was introduced in rev. 1.113 of uipc_syscalls.c back in 2007 by dsl@ with a commit message that doesn't give any reason for discarding _all_ descriptors when the control buffer is truncated.

https://mail-index.netbsd.org/source-changes/2007/06/24/msg187028.html

This confuses some applications like devel/capnproto which exercise this path and expect some of the descriptors to make it through (though I haven't reviewed to see if this logic is robust to different cmsg alignment constraints on different architectures):

kj/async-io-test.c++:751: failed: expected result.capCount == 2 [0 == 2]
[ FAIL ] kj/async-io-test.c++:717: legacy test: AsyncIo/ScmRightsTruncatedEven (31363 μs)
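
Since a receiver can end up holding more descriptors than it asked for, it may want to count what actually arrived and close the surplus.  The following is only a sketch of one way to do that in userland; take_rights and rights_count are hypothetical names, not part of any library, and the sketch assumes the control buffer was filled in by a successful recvmsg(2).

```c
#include <sys/socket.h>

#include <string.h>
#include <unistd.h>

/* Number of descriptors carried by one SCM_RIGHTS cmsg, per cmsg_len. */
size_t
rights_count(const struct cmsghdr *cmsg)
{

	if (cmsg->cmsg_len < CMSG_LEN(0))
		return 0;
	return (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int);
}

/*
 * Walk every SCM_RIGHTS cmsg in msg, store at most `want'
 * descriptors in out[], and close any beyond that.  Returns the
 * number of descriptors stored.  (Hypothetical helper.)
 */
size_t
take_rights(struct msghdr *msg, int *out, size_t want)
{
	struct cmsghdr *cmsg;
	size_t kept = 0;

	for (cmsg = CMSG_FIRSTHDR(msg);
	     cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		size_t k, n;

		if (cmsg->cmsg_level != SOL_SOCKET ||
		    cmsg->cmsg_type != SCM_RIGHTS)
			continue;
		n = rights_count(cmsg);
		for (k = 0; k < n; k++) {
			int fd;

			/* CMSG_DATA may be unaligned for int; copy out. */
			memcpy(&fd, CMSG_DATA(cmsg) + k * sizeof(fd),
			    sizeof(fd));
			if (kept < want)
				out[kept++] = fd;
			else
				(void)close(fd);	/* unexpected extra */
		}
	}
	return kept;
}
```

A caller that expects two descriptors would pass want = 2 after recvmsg(2) returns, and check msg_flags for MSG_CTRUNC separately.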
>How-To-Repeat:
Tweak N in the following program -- that's the number of descriptors the sender sends, and one more than the number of descriptors the receiver allocates space to receive.  For N = 3, no descriptors are printed on the receiving end; for N = 4, four descriptors are printed on the receiving end.

#include <sys/socket.h>

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

enum { N = 3 };

int
main(void)
{
	int s[2], p[2*((N + 1)/2)], *q;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(N * sizeof(p[0]))];
	} cmsgbuf;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	unsigned i, j;

	if (socketpair(AF_LOCAL, SOCK_STREAM|SOCK_NONBLOCK, 0, s) == -1)
		err(1, "socketpair");
	/* Each pipe2 call fills two slots of p, so (N + 1)/2 calls suffice. */
	for (i = 0; i < (N + 1)/2; i++) {
		if (pipe2(p + 2*i, O_NONBLOCK) == -1)
			err(1, "pipe2[%u]", i);
	}

	msg = (struct msghdr) {
		.msg_name = NULL,
		.msg_namelen = 0,
		.msg_iov = NULL,
		.msg_iovlen = 0,
		.msg_control = cmsgbuf.buf,
		.msg_controllen = CMSG_SPACE(N * sizeof(p[0])),
		.msg_flags = 0,
	};
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(N * sizeof(p[0]));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), p, N * sizeof(p[0]));
	printf("* sendmsg\n");
	printf("msg_flags=0x%x\n", msg.msg_flags);
	printf("msg_controllen=%d\n", msg.msg_controllen);
	if (sendmsg(s[0], &msg, 0) == -1)
		err(1, "sendmsg");

	printf("\n");

	msg = (struct msghdr) {
		.msg_name = NULL,
		.msg_namelen = 0,
		.msg_iov = NULL,
		.msg_iovlen = 0,
		.msg_control = cmsgbuf.buf,
		.msg_controllen = CMSG_SPACE((N - 1) * sizeof(p[0])),
		.msg_flags = 0,
	};
	printf("* recvmsg\n");
	printf("msg_flags=0x%x\n", msg.msg_flags);
	printf("msg_controllen=%d\n", msg.msg_controllen);
	if (recvmsg(s[1], &msg, 0) == -1)
		err(1, "recvmsg");
	printf("->\n");
	printf("msg_flags=0x%x\n", msg.msg_flags);
	printf("msg_controllen=%d\n", msg.msg_controllen);
	for (i = 0, cmsg = CMSG_FIRSTHDR(&msg);
	     cmsg != NULL;
	     i++, cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		printf("[%u] cmsg_len=%d\n", i, cmsg->cmsg_len);
		printf("[%u] cmsg_level=%d\n", i, cmsg->cmsg_level);
		printf("[%u] cmsg_type=%d\n", i, cmsg->cmsg_type);
		q = (int *)CMSG_DATA(cmsg);
		for (j = cmsg->cmsg_len - CMSG_LEN(0);
		     j >= sizeof(*q);
		     j -= sizeof(*q), q++)
			printf("[%u] fd %d\n", i, *q);
		printf("\n");
	}
	return 0;
}
>Fix:
Close as many descriptors as we need to fit in recvmsg msg_controllen bytes, not _all_ of the descriptors.

(Also, let's add some tests for this!  cmsg is notoriously difficult and needs extensive automatic testing.)
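
As a rough illustration of the intended arithmetic (a userland model, not a kernel patch), the split between descriptors to deliver and descriptors to close could be computed like this.  The names split_rights and struct rights_split are hypothetical, and the sketch assumes a single SCM_RIGHTS cmsg at the start of the control buffer.

```c
#include <sys/socket.h>

#include <stddef.h>

struct rights_split {
	size_t deliver;	/* descriptors copied out to the receiver */
	size_t discard;	/* descriptors that would have to be closed */
};

/*
 * Given nsent descriptors in flight and a receiver control buffer
 * of controllen bytes, decide how many fit and how many do not.
 * A nonzero discard count corresponds to setting MSG_CTRUNC.
 * (Hypothetical helper, for illustration only.)
 */
struct rights_split
split_rights(size_t nsent, size_t controllen)
{
	struct rights_split r = { 0, nsent };
	size_t room;

	/* No room for the header plus even one descriptor. */
	if (controllen < CMSG_LEN(sizeof(int)))
		return r;

	room = (controllen - CMSG_LEN(0)) / sizeof(int);
	r.deliver = nsent < room ? nsent : room;
	r.discard = nsent - r.deliver;
	return r;
}
```

With the x86 layout described earlier, split_rights(3, CMSG_SPACE(2 * sizeof(int))) would deliver 2 and discard 1, instead of discarding all three.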


