Subject: re: CVS commit: src/sys/kern
To: None <christos@zoulas.com>
From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
List: tech-net
Date: 07/03/2006 09:42:36
> | maybe it depends on the definition of the right thing. :-)
> | (apparently the configure script doesn't think it the right thing.)
> | to me, it doesn't seem so strange.
>
> What is strange to me is that now we have a situation whereby by
> default we have select() returning that the socket is not writable,
> but a write() to it will succeed and not block. This does not seem
> right to me. What's worse, is that now, we'll never return that
> the socket is writable, unless the buffer is entirely empty.
writeability wrt select() is defined with sb_lowat in posix.
If a descriptor refers to a socket, the implied output function is the
sendmsg() function supplying an amount of normal data equal to the current
value of the SO_SNDLOWAT option for the socket.
so it's normal for select not to consider it writable.
> How about:
>
> Index: uipc_socket2.c
> ===================================================================
> RCS file: /cvsroot/src/sys/kern/uipc_socket2.c,v
> retrieving revision 1.73
> diff -u -u -r1.73 uipc_socket2.c
> --- uipc_socket2.c 1 Jul 2006 15:38:28 -0000 1.73
> +++ uipc_socket2.c 3 Jul 2006 00:12:35 -0000
> @@ -377,7 +377,11 @@
> int
> soreserve(struct socket *so, u_long sndcc, u_long rcvcc)
> {
> + u_long lowat = MAX(sock_loan_thresh, MCLBYTES);
> + u_long hiwat = lowat + PIPE_BUF;
>
> + if (sndcc < hiwat)
> + sndcc = hiwat;
> if (sbreserve(&so->so_snd, sndcc, so) == 0)
> goto bad;
> if (sbreserve(&so->so_rcv, rcvcc, so) == 0)
> @@ -385,7 +389,7 @@
> if (so->so_rcv.sb_lowat == 0)
> so->so_rcv.sb_lowat = 1;
> if (so->so_snd.sb_lowat == 0)
> - so->so_snd.sb_lowat = MCLBYTES;
> + so->so_snd.sb_lowat = lowat;
> if (so->so_snd.sb_lowat > so->so_snd.sb_hiwat)
> so->so_snd.sb_lowat = so->so_snd.sb_hiwat;
> return (0);
seems fine to me. please put a comment.
/*
* there's at least one application (a configure script of screen)
* which expects a fifo is writable even if it has "some" bytes
* in its buffer.
* so we want to make sure (hiwat - lowat) >= (some bytes).
*
* PIPE_BUF here is an arbitrary value chosen as (some bytes) above.
* we expect it's large enough for such applications.
*/
> | > It could also be a POSIX violation, because we are using sockets
> | > internally to implement pipes, there is expected behavior for pipes
> | > and operations < PIPE_BUF.
> |
> | i think <= PIPE_BUF writes are atomic as far as lowat >= PIPE_BUF.
> | i'm not aware of any posix pipe buffer requirements other than
> | PIPE_BUF atomicity.
>
> This is the text I am referring to:
>
> * Write requests to a pipe or FIFO will be handled the same as a
> regular file with the following exceptions:
>
> * Write requests of {PIPE_BUF} bytes or less will not be interleaved
> with data from other processes doing writes on the same pipe. Writes
> of greater than {PIPE_BUF} bytes may have data interleaved, on
> arbitrary boundaries, with writes by other processes, whether or
> not the O_NONBLOCK flag of the file status flags is set.
>
> I am worried that because of the current FIFO socket-based implementation
> setting lowat and hiwat inappropriately will break the above behavior.
it's what i meant by "PIPE_BUF atomicity". i don't think it's broken.
> BTW, did you see my posting on port-sparc regarding the vmem quantum
> size on the sparc?
no. i'll take a look later.
YAMAMOTO Takashi