Subject: Re: race in select() ?
To: David Laight <david@l8s.co.uk>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 10/09/2003 21:43:58
--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Thu, Oct 09, 2003 at 06:01:53PM +0100, David Laight wrote:
> On Thu, Oct 09, 2003 at 05:20:10PM +0200, Manuel Bouyer wrote:
> > On Thu, Oct 09, 2003 at 02:13:51PM +0100, David Laight wrote:
> > >
> > > I presume inetd takes the fd out of its select list until the rpc.rstatd
> > > process exits. Otherwise there would be a nasty loop.
> >
> > Yes it does.
>
> There is a timing bug in inetd itself. rev 1.76 reads:
>
> readable = allsock;
> if ((n = select(maxsock + 1, &readable, (fd_set *)0,
> ...
>
> and reapchild does:
> FD_SET(sep->se_fd, &allsock);
>
> So if a child exits after the copy is made, but before the system call
> the select won't be looking for the correct fdset.
> If reapchild set the bit in readable then there could be a false
> positive if the SIGCHLD happened in the system call return path,
> that might be easier to check though (eg detect it happening, and
> call select again).
>
> Of course, you may not be hitting this one...
Yes, I'm almost sur this is it. I may have got it off by one when I checked
the fd_sets values in the kernel (the rpc.statd socket is the last one).
I'll try to check this when the bug shows up again (it happens once in a
few days, but I couldn't reproduce it at will).
About this race: I think the attached patch should fix it.
To avoid the race before select, the sigchld handler adds the bit in allsock
and the fd_set we will select on (allsock_select).
To avoid the false positive on select return, we check both the fd_set
we got from select, and the one we had before copying allsock to
allsock_select.
I'm not sure if allsock should be declared volatile; could the compiler
optimise this with registers load/store and keep values in registers
between the 2 copies here:
readable = allsock;
allsock_select = allsock;
This would of course require an architecture with enouth registers, or
a smaller value for FD_SETSIZE.
There is still a race left between the copy to readable and the copy to
allsock_select, but it's harmless: in this case, we won't handle the descriptor
after select returns, and handle it at the next loop.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 24 ans d'experience feront toujours la difference
--
--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
? .gdbinit
? inetd
? inetd.cat8
Index: inetd.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/inetd/inetd.c,v
retrieving revision 1.76
diff -u -r1.76 inetd.c
--- inetd.c 2002/01/21 14:42:28 1.76
+++ inetd.c 2003/10/09 19:39:02
@@ -272,7 +272,7 @@
int lflag;
#endif
int nsock, maxsock;
-fd_set allsock;
+fd_set allsock, allsock_select;
int options;
int timingout;
struct servent *sp;
@@ -532,7 +532,8 @@
(void) sigsetmask(0L);
}
readable = allsock;
- if ((n = select(maxsock + 1, &readable, (fd_set *)0,
+ allsock_select = allsock;
+ if ((n = select(maxsock + 1, &allsock_select, (fd_set *)0,
(fd_set *)0, (struct timeval *)0)) <= 0) {
if (n == -1 && errno != EINTR) {
syslog(LOG_WARNING, "select: %m");
@@ -542,7 +543,8 @@
}
for (sep = servtab; n && sep; sep = nsep) {
nsep = sep->se_next;
- if (sep->se_fd != -1 && FD_ISSET(sep->se_fd, &readable)) {
+ if (sep->se_fd != -1 && FD_ISSET(sep->se_fd, &readable) &&
+ FD_ISSET(sep->se_fd, &allsock_select)) {
n--;
if (debug)
fprintf(stderr, "someone wants %s\n", sep->se_service);
@@ -770,6 +772,7 @@
sep->se_server, WTERMSIG(status));
sep->se_wait = 1;
FD_SET(sep->se_fd, &allsock);
+ FD_SET(sep->se_fd, &allsock_select);
nsock++;
if (debug)
fprintf(stderr, "restored %s, fd %d\n",
--LQksG6bCIzRHxTLp--