Subject: NetBSD-1.0 (Nov07) and old binaries
To: None <current-users@netbsd.org>
From: Simon J. Gerraty <sjg@zen.void.oz.au>
List: current-users
Date: 11/09/1994 09:10:06
Further to my bleatings yesterday.
I built a kernel from the tar_files of Nov 7, with KTRACE enabled and
ran that offending BSDi binary. The ktrace supports my theory that
the problem was a file locking issue. I believe (with little evidence)
that it's related to 32-bit off_t's. There may be another issue
lurking here too...
First the tail of the ktrace with some comments. At the end is a
brief discussion of supporting old binaries like this.
(ok here is the start...)
184 ktrace RET ktrace 0
184 ktrace CALL execve(0xf7bfdb91,0xf7bfdaf0,0xf7bfdafc)
184 ktrace NAMI "/usr/MHSnet/_lib/netstate"
184 netstate RET execve 0
...
...
184 netstate CALL open(0x36000,0,0x33a50)
184 netstate NAMI "/var/spool/MHSnet/_lib/privsfile"
184 netstate RET open -1 errno 2 No such file or directory
Up to this point everything is cool.
...
The following looks a bit sus - maybe there is some doco somewhere I
need to read about changes from 0.9a to 1.0...
Pointers appreciated.
184 netstate CALL setgid(0x1)
184 netstate RET setgid -1 errno 1 Operation not permitted
184 netstate CALL geteuid
184 netstate RET geteuid 1
184 netstate CALL setuid(0x1)
184 netstate RET setuid -1 errno 1 Operation not permitted
184 netstate CALL geteuid
184 netstate RET geteuid 1
184 netstate CALL geteuid
184 netstate RET geteuid 1
184 netstate CALL break(0x42ffc)
184 netstate RET break 0
184 netstate CALL sigprocmask(0x1,0)
184 netstate RET sigprocmask 0
184 netstate CALL sigaction(0xe,0xf7bfd9fc,0xf7bfd9f0)
184 netstate RET sigaction 0
184 netstate CALL setitimer(0,0xf7bfd9f4,0xf7bfd9e4)
184 netstate RET setitimer 0
184 netstate CALL old.stat(0x360c0,0xf7bfda20)
184 netstate NAMI "/var/spool/MHSnet/_state/lock"
184 netstate RET old.stat 0
184 netstate CALL open(0x360c0,0x2,0xf7bfdb0c)
184 netstate NAMI "/var/spool/MHSnet/_state/lock"
184 netstate RET open 3
Ok here is the killer. According to fcntl.h 0x9 is
#define F_SETLKW 9 /* F_SETLK; wait if blocked */
184 netstate CALL fcntl(0x3,0x9,0xf7bfd9e4)
184 netstate RET fcntl -1 errno 22 Invalid argument
184 netstate CALL ioctl(0x2,0x402c7413 ,0xf7bfd964)
184 netstate RET ioctl 0
184 netstate CALL write(0x2,0xf7bfd2f4,0xa)
184 netstate GIO fd 2 wrote 10 bytes
"netstate: "
184 netstate RET write 10/0xa
184 netstate CALL write(0x2,0x2e3d4,0xc)
184 netstate GIO fd 2 wrote 12 bytes
"system error"
184 netstate RET write 12/0xc
184 netstate CALL write(0x2,0x2e479,0x4)
184 netstate GIO fd 2 wrote 4 bytes
" -- "
184 netstate RET write 4
184 netstate CALL write(0x2,0xf7bfd300,0x2e)
184 netstate GIO fd 2 wrote 46 bytes
"Could not lock "/var/spool/MHSnet/_state/lock""
184 netstate RET write 46/0x2e
184 netstate CALL write(0x2,0xf7bfd314,0x12)
184 netstate GIO fd 2 wrote 18 bytes
": Invalid argument"
184 netstate RET write 18/0x12
184 netstate CALL write(0x2,0x2f73f,0x1)
184 netstate GIO fd 2 wrote 1 bytes
"
"
184 netstate RET write 1
184 netstate CALL exit(0x47)
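For reference, the failing fcntl() call presumably corresponds to
userland code along these lines. This is a minimal sketch, not the
MHSnet source: lock_whole_file is a made-up name, and the lock type
and range are assumptions, since the trace does not show the contents
of the flock struct.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Take a blocking lock on the whole file with F_SETLKW.
 * Returns 0 on success, -1 with errno set on failure.
 * ASSUMPTION: an exclusive, whole-file lock; the trace only shows
 * the fcntl command (0x9), not the fields of the struct.
 */
int lock_whole_file(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* exclusive lock */
    fl.l_whence = SEEK_SET;  /* l_start is relative to start of file */
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */

    return fcntl(fd, F_SETLKW, &fl);
}
```

A binary built against older headers would fill in the same fields,
but in whatever layout and field widths its own fcntl.h declared.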
Now my _guess_ is that the kernel does not like the 32-bit off_t's in the
flock struct.
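To make the guess concrete, here is a small sketch of how a struct
written with 32-bit offsets could look like garbage when read with a
64-bit layout. Both layouts are ASSUMPTIONS for illustration (flock32
and flock64 are made-up names, and I have not checked the field order
against either set of headers); the lock type values are the ones I
believe BSD fcntl.h uses.

```c
#include <stdint.h>
#include <string.h>

/* ASSUMED old layout with 32-bit offsets (illustrative only). */
struct flock32 {
    int16_t l_type;
    int16_t l_whence;
    int32_t l_start;
    int32_t l_len;
    int32_t l_pid;
};

/* ASSUMED new layout with 64-bit offsets (illustrative only). */
struct flock64 {
    int64_t l_start;
    int64_t l_len;
    int32_t l_pid;
    int16_t l_type;
    int16_t l_whence;
};

/* Lock types as (I believe) defined in BSD <fcntl.h>. */
enum { LK_RDLCK = 1, LK_UNLCK = 2, LK_WRLCK = 3 };

/* The kind of sanity check that would yield EINVAL when it fails. */
int flock64_looks_valid(const struct flock64 *fl)
{
    return fl->l_type >= LK_RDLCK && fl->l_type <= LK_WRLCK &&
           fl->l_whence >= 0 && fl->l_whence <= 2;
}

/*
 * Place an old-layout struct in a zeroed buffer and read it back the
 * way code expecting the new layout would. In a real process the
 * trailing bytes would be whatever sits next on the stack, not zeros.
 */
struct flock64 misread_as_flock64(const struct flock32 *old)
{
    unsigned char buf[sizeof(struct flock64)] = {0};
    struct flock64 seen;

    memcpy(buf, old, sizeof(*old));
    memcpy(&seen, buf, sizeof(seen));
    return seen;
}
```

With these layouts, a perfectly good old-style write lock request
comes out with an l_type that is not a lock type at all, which would
fit the errno 22 in the trace.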
My question is:
Since we have old.stat(), old.lseek() etc to support
old bins, do we need an old.fcntl()? Or would adding some checks to
fcntl() do the trick?
It may be considered too late to introduce an old.fcntl()...
Perhaps some extra logic in fcntl() would suffice.
In the case above, the kernel is (I am guessing) rejecting the flock
struct because the values don't look right. It could try again using
off32_t and see if that looks better. Not as reliable as an
old.fcntl() but...
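The "try again" idea could be sketched like this. It is only a
userland illustration, not kernel code: flock_import, flock32 and
flock64 are hypothetical names, both struct layouts are assumptions,
and ubuf stands in for memory already copied in from the process.

```c
#include <stdint.h>
#include <string.h>

/* ASSUMED old layout with 32-bit offsets (illustrative only). */
struct flock32 {
    int16_t l_type;
    int16_t l_whence;
    int32_t l_start;
    int32_t l_len;
    int32_t l_pid;
};

/* ASSUMED new layout with 64-bit offsets (illustrative only). */
struct flock64 {
    int64_t l_start;
    int64_t l_len;
    int32_t l_pid;
    int16_t l_type;
    int16_t l_whence;
};

/* l_type must be 1..3 (RDLCK/UNLCK/WRLCK on BSD), l_whence 0..2. */
static int fields_look_valid(int type, int whence)
{
    return type >= 1 && type <= 3 && whence >= 0 && whence <= 2;
}

/*
 * Interpret ubuf (at least sizeof(struct flock64) bytes) as a flock.
 * Returns 0 if it was accepted in the new layout, 1 if it only made
 * sense in the old layout and was widened, -1 if neither fits
 * (the caller would then return EINVAL).
 */
int flock_import(const void *ubuf, struct flock64 *out)
{
    struct flock64 n;
    struct flock32 o;

    memcpy(&n, ubuf, sizeof(n));
    if (fields_look_valid(n.l_type, n.l_whence)) {
        *out = n;
        return 0;
    }

    /* New layout rejected: retry with 32-bit offsets. */
    memcpy(&o, ubuf, sizeof(o));
    if (fields_look_valid(o.l_type, o.l_whence)) {
        out->l_start = o.l_start;   /* sign-extends to 64 bits */
        out->l_len = o.l_len;
        out->l_pid = o.l_pid;
        out->l_type = o.l_type;
        out->l_whence = o.l_whence;
        return 1;
    }
    return -1;
}
```

The weakness is visible in the first test: an old struct followed by
stack bytes that happen to look like a valid l_type and l_whence
would be accepted in the new layout with nonsense offsets. A separate
old.fcntl() entry point would not have to guess.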
Is this something that is being addressed?