NetBSD-Bugs archive
bin/59058: env(1) exit status can be incorrect
>Number: 59058
>Category: bin
>Synopsis: env(1) exit status can be incorrect
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Feb 09 04:10:00 +0000 2025
>Originator: Robert Elz
>Release: NetBSD 10.99.12
>Organization:
>Environment:
System: NetBSD jacaranda.noi.kre.to 10.99.12 NetBSD 10.99.12 (JACARANDA:1.1-20250119) #172: Sun Jan 19 08:59:18 +07 2025 kre%jacaranda.noi.kre.to@localhost:/usr/obj/testing/kernels/amd64/JACARANDA amd64
Architecture: x86_64
Machine: amd64
>Description:
The man page for env(1) says:
EXIT STATUS
env exits with one of the following values:
[...]
126 utility was found, but could not be invoked.
127 utility could not be found.
and yet:
Script started on Sun Feb 9 09:12:49 2025
$ mkdir env-test
$ cd env-test
$ ln -s foo foo
$ env $(pwd)/foo/bar
env: /tmp/env-test/foo/bar: Too many levels of symbolic links
$ echo $?
126
$ exit
Script done on Sun Feb 9 09:13:37 2025
Here the utility clearly could not be found (it does not exist),
and yet the exit code is 126 ("utility was found, but...") rather
than 127, which it should be.
Another, less serious, issue (I suppose this should really be a separate
PR as a doc bug, but as we are already here) is that the same section includes:
1-125 utility was invoked, but failed [...]
and yet again (continuing to use the same environment as above):
Script started on Sun Feb 9 09:22:31 2025
$ cd env-test
$ env env $(pwd)/foo/bar
env: /tmp/env-test/foo/bar: Too many levels of symbolic links
$ echo $?
126
$ exit
Script done on Sun Feb 9 09:23:08 2025
Here the first "env" command returns exit status 126, which should
indicate
126 utility was found, but could not be invoked.
Yet here the utility is "env" which clearly can be found, and can be
invoked, and in fact, was invoked.
That 126 exit status (and the message sent to stderr) is actually the
exit status (and error message) from the env command invoked by the
env command whose status is being examined. That is, one of the exit
codes described by:
1-125 utility was invoked, but failed in some way;...
which is what happened here, the 2nd env was invoked, and failed (it
is the exact same invocation as the primary subject of this PR, which
was designed to fail) yet the exit code was not in the range 1-125 as
promised by the man page.
The problem here is obviously that the man page is promising something
which it is unable to deliver: a non-zero exit status from the utility
can be any value. If we're still using one of the old wait(2)
interfaces to collect that status, it can be anything from 1..255; if
we're using waitid(2) or wait6(2) then it can be any (non-zero, as the
zero case is covered in a different line item, not included in this PR)
32 bit value.
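To illustrate the 1..255 limit of the older interfaces, here is a minimal
sketch (status_via_waitpid() is a hypothetical helper name, not anything
from env) that forks a child, has it exit with a given code, and reports
what the parent sees through waitpid(2) and WEXITSTATUS():

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Fork a child that calls _exit(code) and return the exit status the
 * parent collects via waitpid(2); returns -1 if anything goes wrong.
 */
int
status_via_waitpid(int code)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: exit at once with the requested code */
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);	/* only the low 8 bits of code survive */
}
```

With this, status_via_waitpid(126) gives 126, while status_via_waitpid(300)
gives 44 (300 & 0xff), so nothing in the old interface keeps a utility's
own status out of the 126..255 range.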
And while we're here, more curiosity/weirdness than bugs of any kind,
a couple of other exit codes listed in the EXIT STATUS section are:
1 An invalid command line option was passed to env.
125 utility was specified together with the -0 option.
First, why devote a whole exit code (125) to something which doesn't
need to be an error at all? The -0 option is meaningless when a
utility is specified; its sole purpose is to alter the delimiter
between successive entries when the env command is used with no
utility and instead prints the contents of the environment.
It is entirely normal for commands to have options that only apply in
specific cases, eg: grep doesn't complain if I do:
grep -i 1234 file
or
ls -c file
despite the fact that it is meaningless to request case-independent
matching of digits, and ls's -c option only does anything when (at
least) one of -l or -t is also given. There's no need to make -0
an error when a utility is given, simply ignore it. (-0 is not a
standard option, so we can do what we like with that one.)
On the other hand using '1' as the exit code for invalid options is
an exceedingly poor choice (unfortunately, it might be mandated by
POSIX, I'll check later).
If not mandated, it would be better to make that one exit(125) (regardless
of what is, or isn't, done with the -0 case) to make it less likely to
conflict with the utility exiting with status 1 (which is a very common
exit code - when I tested the grep above, just to be sure, it exited
with status 1, as "1234" did not exist in the file I used).
>How-To-Repeat:
RTFM, and then as above (or many other similar ways).
>Fix:
First, I am going to assign this PR to myself, and fix what needs to
be fixed. The PR is just for tracking the fix, and pullups, ...
Of course, none of these issues are serious enough to warrant any
pullups, so I won't be requesting any of those, so instead let's say
this PR is in case anyone else wants to make any comments about the
issues. If you do, be quick, fixing this is not going to take very
long!
For the first (primary) issue, the problem is that env simply checks
for ENOENT from execvp() and does exit(127) if that is the error
returned, and exit(126) in all other cases. That's really the wrong
way; much better would be to do exit(126) if the error is ENOEXEC and
127 in all the other cases (there are lots of error codes that
indicate a path not found, not just ENOENT). But that's not actually
good enough either: if the utility to be invoked were
#! /no/such/file
[...]
then we can find the utility with no issues, it cannot be invoked
however ("/no/such/file" doesn't exist, and yes, that's an assumption
I am making here, but it is correct in my environment) so env should
exit(126) - yet the errno value from execvp() in this case will be one
of the ones which indicates a file could not be found (that file being
"/no/such/file"). Detecting the difference requires more work than
just looking at the value of errno.
That is, it does unless the kernel were changed to map all errors
detected when attempting to locate the #! interpreter into ENOEXEC,
which it could do easily enough (and has been suggested as a possibility
in discussions about similar issues related to shell diagnostics) - but
that potential change is beyond the scope of this PR.
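For reference, the two errno mappings discussed above for the primary
issue can be sketched as follows. The function names are hypothetical,
this is not the actual env source, and (as already noted) neither mapping
handles the "#! /no/such/file" case correctly by itself:

```c
#include <errno.h>

/* Hypothetical helper names; not taken from the env(1) source. */

/* Behaviour as described above: only ENOENT counts as "not found". */
int
status_current(int exec_errno)
{
	return exec_errno == ENOENT ? 127 : 126;
}

/* First-cut alternative: only ENOEXEC counts as "found, not invocable". */
int
status_alternative(int exec_errno)
{
	return exec_errno == ENOEXEC ? 126 : 127;
}
```

For the ELOOP error from the symlink loop test at the start of this PR,
status_current(ELOOP) gives 126 (the observed result) and
status_alternative(ELOOP) gives 127.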
For the second (doc) issue, what I think should happen, is for the
EXIT STATUS section to say something like:
EXIT STATUS
If a utility is given, and is successfully invoked,
then the exit status is from that utility, see its
documentation for the possible values and details.
If no utility is given, or one is named, but cannot
be invoked, then env will exit with status:
followed by the list of exit codes, similar to what is there now,
but omitting all mention of exit status values from the utility.
I will also note that whenever env itself exits with a non-zero exit
status, it always also writes a diagnostic indicating why it failed
to standard error, which, when necessary, can help determine the
exit status source -- of course, a perverse utility could be just:
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	/* mimic an env diagnostic and exit status */
	fprintf(stderr, "env: unknown option -Q\n");
	exit(125);
}
so there never really is any way to be certain (without ktrace anyway).
For the third (non-bug) issues, if POSIX allows it (sometimes the
standard lists specific code values, sometimes just 0 and not 0),
I will probably change any exit(1) in env into exit(125); and also
simply ignore a "-0" option when a utility is named, rather than
making that be an automatic error.