Subject: kern/6052: race condition in link() call
To: None <gnats-bugs@gnats.netbsd.org>
From: Liz S. Reynolds <ilaine@panix.com>
List: netbsd-bugs
Date: 08/26/1998 17:06:51
>Number: 6052
>Category: kern
>Synopsis: link() call can be raced resulting in false success return
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Aug 26 14:20:01 1998
>Last-Modified:
>Originator: Liz S. Reynolds
>Organization:
Public Access Networks (Panix)
>Release: NetBSD-1.3.2
>Environment:
System: NetBSD panix7.panix.com 1.3.2 NetBSD 1.3.2 (PANIX-USER) #0: Thu Aug 13 15:03:31 EDT 1998 marcotte@juggler.panix.com:/devel/netbsd/1.3.2/src/sys/arch/i386/compile/PANIX-USER i386
>Description:
It is possible for two processes to both get a successful return
from link() when linking to the same name simultaneously. This
does not produce a triply linked file: the race winner actually
has the link, and the loser does not. The two can be distinguished
by comparing the inode number of the newly created file with the
inode number of the original file. For the race winner they are
the same; for the loser they differ.
This only happens on an NFS-mounted partition, and it does not
depend on the server: I have tested against NetBSD, SunOS, and
Network Appliance servers, with exactly the same results. I cannot
reproduce the failure when running the script on a Sun machine
NFS-mounting the same partitions. This locates the problem in the
NetBSD NFS client code.
>How-To-Repeat:
/****************************************************************************
linktest.c
The purpose of this program is to demonstrate the existence of a
race condition in the link() system call. It works (that is, fails)
on NetBSD when the test directory is mounted via NFS.
linktest is a single test of link() and should be run from a script
that forks many instances of linktest and races them against each
other. A simple driver script, sufficient to produce several
instances of anomalous output on my system, is provided:
--snip--
#! /usr/local/bin/perl
$| = 1;
while ($procs < 30){
    $pid=fork();
    if ($pid == 0){
        $procs++;
    }
    elsif ($pid > 0) {
        while ($tries < 25){
            system("./linktest");
            $tries++;
        }
        exit (0);
    }
}
--snip--
and some sample output:
ok
ok
ok
ok
ok
ok
ok
non-equal inode #s 536594 536592
ok
ok
ok
ok
non-equal inode #s 536595 536594
ok
ok
ok
ok
--snip--
****************************************************************************/
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <time.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
int main(int argc, char *argv[])
{
    int tmpfd;
    struct stat tmp_sbuf, win_sbuf;
    char linkfilename[] = "./thewinner";
    char tpath[_POSIX_PATH_MAX];

    /*
    create a temp file with a semi-random name
    verify that the create was successful
    */
    srandom(getpid() + time(0));
    snprintf(tpath, sizeof(tpath), "test_tmp%i.%li", (int)getpid(), random());
    tmpfd = open(tpath, O_CREAT|O_WRONLY|O_EXCL, S_IRUSR|S_IWUSR);
    if (tmpfd < 0) {
        /* nothing was created, so there is nothing to close or unlink */
        exit(-2);
    }

    /* stat the file descriptor, make sure it really worked */
    if (fstat(tmpfd, &tmp_sbuf) < 0) {
        close(tmpfd);
        unlink(tpath);
        exit(-2);
    }

    /* attempt to create linkfilename via link() */
    if (link(tpath, linkfilename) == 0) {
        /* we have the link */
        unlink(tpath);
        close(tmpfd);

        /* stat the new file */
        if (lstat(linkfilename, &win_sbuf) < 0) {
            unlink(linkfilename);
            exit(-2);
        }

        /*
        The inode number of the original temp file and that of the
        new file created by link() should be the same. If they are
        not, another process has successfully raced us.
        We cannot detect the race winner, only the loser.
        */
        if (tmp_sbuf.st_ino != win_sbuf.st_ino) {
            /* ino_t has no portable printf format, so cast explicitly */
            printf("non-equal inode #s %lu %lu\n",
                (unsigned long)tmp_sbuf.st_ino,
                (unsigned long)win_sbuf.st_ino);
            exit(-3);
        }
        else
            printf("ok\n");
        unlink(linkfilename);
        exit(0);
    }

    /*
    link() returned non-zero, we don't care about this case
    so exit silently.
    */
    errno = EEXIST;
    unlink(tpath);
    close(tmpfd);
    exit(-1);
}
>Fix:
I am depending on the atomicity of the link() call for the correct
working of an advisory locking system. Users have reported file
corruption caused by failure of this routine when two programs
believe they both have a lock on the database at the same time.
I have a workaround suitable for my purposes: after link() returns
successfully, I check whether the new file has the same inode as
the temp file. If it does not, that process is treated as not
holding the lock, despite the successful return from link().
This is not a general-purpose fix as I have no way of knowing what
other software on the system depends on the correct working of this
call.
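The workaround described above could look roughly like the following.
This is only a sketch, assuming the lock is taken by linking a
per-process temp file to a shared lock name. The function name
acquire_link_lock and its return convention are made up for
illustration and are not taken from the affected software. It also
compares st_dev, which linktest.c does not do.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take the lock by hard-linking tmp_path
 * to lock_path, then verify that lock_path really is our temp file.
 *
 * Returns  1 if this process holds the lock,
 *          0 if another process holds it (including the case where
 *            link() reported success but the inodes do not match),
 *         -1 on any other error.
 */
int
acquire_link_lock(const char *tmp_path, const char *lock_path)
{
    struct stat tmp_sb, lock_sb;

    /* remember which inode our temp file is */
    if (stat(tmp_path, &tmp_sb) < 0)
        return -1;

    if (link(tmp_path, lock_path) < 0)
        return (errno == EEXIST) ? 0 : -1;

    /* link() claims success; do not trust it until the inodes match */
    if (lstat(lock_path, &lock_sb) < 0)
        return -1;
    if (tmp_sb.st_dev != lock_sb.st_dev ||
        tmp_sb.st_ino != lock_sb.st_ino)
        return 0;   /* false success: the lock file is someone else's */

    return 1;
}
```

A caller that gets 0 or -1 back should remove only its own temp file
and must not unlink the lock name, since that file belongs to the
race winner.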
>Audit-Trail:
>Unformatted: