Re: sh(1): POSIX "Command Search and Execution"

To: Robert Elz <kre%munnari.OZ.AU@localhost>
Subject: Re: sh(1): POSIX "Command Search and Execution"
From: tlaronde%kergis.com@localhost
Date: Sat, 21 Sep 2024 09:47:09 +0200
Thanks for the detailed explanations! I have kept your whole text and am
only replying to a few points; my interpolated answers can be found by
searching for '^#TL':

On Sat, Sep 21, 2024 at 04:27:54AM +0700, Robert Elz wrote:
>     Date:        Fri, 20 Sep 2024 18:53:11 +0200
>     From:        <tlaronde%kergis.com@localhost>
>     Message-ID:  <Zu2od2PHTSn4H0td%kergis.com@localhost>
> 
>   | For some reason[*], I looked at sh(1) "Command Search and Execution"
>   | in POSIX (issue 7 2018 and then issue 8 2024).
> 
> Over the past many years this has been one of the most debated
> parts of the specification.   It is constantly being reworded.
> 
>   | From the specification above, I'm puzzled about two things: regular
>   | built-ins and PATH search:
> 
> Yes, that aspect in particular, there is an attitude amongst some of
> the people who work on the standard that users must be able to replace
> regular built in utilities by their own replacements, simply by placing
> their own in a directory that is in PATH before the directory where the
> standard version of the utility exists.
> 
> Almost all shell developers consider this to be nonsense, and refuse
> to have anything to do with it (some version of ksh93 is reputed to
> have rules something like that implemented though.)
> 
> In issue 7 and before (not sure now for how long before, but that no longer
> matters) all regular built-in utilities were required to have a file
> system implementation (so that, for example, xargs could run it, without
> a shell being involved) - even those which make absolutely no sense
> outside the shell, like "wait" and "fg" (some others which are mostly
> useless outside the shell, like "cd" could at least be argued to be able
> to attempt the operation, and issue an error on failure, even if the
> effects would be lost).   Some systems install such things by making
> links to a script like
> 
> 	#! /bin/sh
> 	${0##*/} "$@"
> 
> with the names of the relevant built-in utilities, solely to meet that
> requirement.   NetBSD always refused to indulge in such stupidity.
> 
>   | In issue 7, built-ins are segregated in two groups: "special
>   | built-ins" and "regular built-ins", the latter being the complement of
>   | the former (a built-in that is not "special").
> 
> That's always been done - there are other differences in how they're
> required to operate than in this area - such as what happens when one
> fails, and the effects of variable assignments as part of the same
> command.  The special built-ins are mostly things that most people almost
> consider to be syntax (like "break" "continue" "return" "." ...)
> 
>   | But in the spec, a regular built-in can only be invoked in e),
> 
> Not quite, the utilities listed in (d) are all regular built-in
> utilities, and those simply get executed.   This is the (useful)
> big change in Issue 8 - those listed are now known as "intrinsic"
> utilities, which have two properties of note - first, those ones
> aren't required to exist in the filesystem any more, and second,
> they're exempt from the path search nonsense.
> 
> Fortunately, implementations are also allowed to designate any
> other built-in utility as being intrinsic (though it is recommended
> that they don't).   In our shell, every built-in is intrinsic.
> (I believe bash is the same).
> 
>   | that is the corresponding name file has to be accessible via the PATH.
>   | If it is not, one can not invoke a regular built-in?
> 
> That is the intent, yes.
> 
>   | This may have sense for an utility required by POSIX
> 
> No, it makes sense for nothing.
> 
>   | but there may be a regular built-in that POSIX doesn't speak about...
> 
> That one is actually not a problem - both because such a utility could
> also be implemented as a file system command, and so meet the requirement,
> but more because as soon as an application attempts to invoke any non
> standard utility, all bets are off, that's outside what the standard
> specifies, and so the standard specifies nothing about what should happen.
> 
> And yes, that means that if you write your own command (or add one from
> pkgsrc, that is not a standard utility) then the standard doesn't require
> that things like redirection (or anything else really) will work.
> 
> Of course, no real implementation would ever break things that way, what
> is a standard utility, and what is not, is not distinguished anywhere
> (except that to conform with the standard, all the standard utilities,
> except the ones that are part of options that are not included, like for
> example uucp and its friends, must be implemented, and available in
> some defined PATH setting - which isn't necessarily the one that any
> normal user ever uses.)
> 
>   | And what does "a successful search" mean? From the referenced
>   | paragraph "XBD Environment Variables":
>   |
>   | ---8<---issue 7 2018
>   | The list shall be searched from beginning to end, applying the
>   | filename to each prefix, until an executable file with the specified
>   | name and appropriate execution permissions is found. 
>   | --->8---issue 7 2018
>   |
>   | But this contradicts the use of the shell in the paragraph I'm talking
>   | about, since if the permissions can be stat'ed, the "executable" nature
>   | of the file can not be ascertained without exec'ing
> 
> I think that's just a wording bug, and should be fixed (and would be
> if someone pointed it out) - all they really mean is a file with 'x'
> permission in PATH.   However, you're right, the term "executable file"
> is defined to mean something that "exec*(2)" can execute, and that isn't
> what they really mean there - no-one expects an attempt be made to actually
> execute the file located, just that the shell would try that if there was
> no built-in to execute instead.
> 

#TL

Then could you point it out to the committee? ;-)

>   | ---8<---issue 7 2018
>   | The term "built-in" implies that the shell can execute the utility
>   | directly and does not need to search for it. 
>   | --->8---issue 7 2018
>   |
>   | The proposition is for all built-ins. And this contradicts the
>   | paragraph where the built-in has to be searched for previously...
> 
> No, it doesn't - the version searched for (and found) (if you believe
> anything should actually operate like that) isn't the built-in, that's
> the file system equivalent (like we have /bin/echo and the built-in
> echo in sh(1), which are actually entirely different commands).
> 
> The intent is that the shell locates the file system version of the
> executable, then, if there is a built-in with the same name, and that
> built-in claims to be the equivalent of the version in the directory
> in which the shell found the file system version, then the built-in
> is executed instead of the file system version.
> 
>   | "The special built-in utilities in this section need not be provided
>   | in a manner accessible via the exec family of functions defined in
>   | the System Interfaces volume of POSIX.1-2017."
> 
> Yes, not even the most insane of the posix committee ever believed that
> "break" or "return" would be useful in any way as a file system command.
> 
>   | i.e. not special built-ins have to be provided in a manner accessible
>   | via the exec family of functions.)
> 
> Yes, but (as above) only the standard ones - anything non standard (anywhere,
> including an option to a standard utility that isn't defined for that
> utility) places things outside the standard, and none of the rules apply.
> 
>   | What does:
>   |
>   | "the built-in or function is associated with the 
>   | directory that was most recently tested during the successful
>   | PATH search"
>   |
>   | mean? How is a directory "associated" to a built-in or a function,
> 
> No-one actually knows, that isn't specified anywhere, it is up to the
> implementation to make that work, but I believe that the intent is
> that each (non-intrinsic) regular built-in is associated with a path
> somewhere or other (compiled into the shell, in a file that the shell
> reads at startup, perhaps via a sysctl like interface - whatever the
> implementation prefers).   That is, for us we have "echo" "test" and
> "printf" (and more) built in, so something somewhere would have
> 
> 	echo	/bin
> 	test	/bin
> 	printf	/usr/bin
> 
> (and many more) defined - then if the user types "echo hello" the
> system searches PATH, finds "echo" in some directory, then checks
> this list - if the directory found by the search matches the one in
> the list, then the built-in gets executed.   If the directory is
> different, then the command from the file system gets executed, and
> as you surmised, if the command isn't found by the search, then a
> "command not found" error results (even though the built-in is there.)
> 
> So if you had PATH=~/bin:/usr/pkg/bin:/bin
> 
> and you had a "test" in ~/bin, "echo" in /usr/pkg/bin and no printf
> in any of those three directories, then the built-in versions would
> never be executed.
> 

#TL

It's clearer this way but as you wrote: "No-one actually knows".

Could the committee use something like:

"if the system has implemented the utility as a built-in or as a shell
function, and the built-in or function is associated by the
implementation with the directory in which the search succeeded, the
related built-in or function masks the utility and is invoked in its
stead at this point."
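
The rule as described above can be sketched in sh. This is only an
illustration under stated assumptions: the table format and the names
ASSOC_TABLE, assoc_dir and resolve are hypothetical, and no shell is
claimed to implement the association this way. The sketch prints what
would be run rather than running it.

```sh
# Hypothetical association table, one "name directory" pair per line.
ASSOC_TABLE=${ASSOC_TABLE-'echo /bin
test /bin
printf /usr/bin'}

# assoc_dir NAME: print the directory associated with built-in NAME.
assoc_dir() {
	while read -r _n _d; do
		if [ "$_n" = "$1" ]; then
			printf '%s\n' "$_d"
			return 0
		fi
	done <<EOF
$ASSOC_TABLE
EOF
	return 1
}

# resolve NAME [PATHLIST]: print "builtin NAME" or "file DIR/NAME".
resolve() {
	_name=$1
	_path=${2-$PATH}
	_assoc=$(assoc_dir "$_name") || _assoc=
	# Split the path list on ':' without pathname expansion.
	_save_ifs=$IFS
	IFS=:
	set -f
	set -- $_path
	set +f
	IFS=$_save_ifs
	for _dir in "$@"; do
		[ -n "$_dir" ] || _dir=.	# empty entry means current directory
		if [ -f "$_dir/$_name" ] && [ -x "$_dir/$_name" ]; then
			# The search succeeded here; the built-in masks the
			# file only if it is associated with this directory.
			if [ "$_dir" = "$_assoc" ]; then
				echo "builtin $_name"
			else
				echo "file $_dir/$_name"
			fi
			return 0
		fi
	done
	# No file found: "not found", even if a built-in exists.
	echo "$_name: not found" >&2
	return 127
}
```

With PATH=~/bin:/usr/pkg/bin:/bin, a "test" in ~/bin and an "echo" in
/usr/pkg/bin, resolve would report the file system versions for both,
and "not found" for printf, matching the example quoted above.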

>   | Note: in the NetBSD implementation---I didn't look in the CSRG
>   | archives to see if these are in fact here from long ago---there are
>   | prefixes in the path: "%builtins" and "%func";
>   | perhaps are these an attempt to this association?
> 
> They are from long long long ago, yes.   %func is something entirely
> different, and unrelated, and not entirely useless.   "%builtins" was
> an attempt to comply with the language in some much older version
> of the standard (when all this was much less precisely specified than
> it is now).  That's a joke, and most versions of ash (the parent of
> our shell, FreeBSD's dash, perhaps others) have long deleted it.
> We haven't, but probably should, it is undocumented, and no-one uses it.
> 
>   | These are builtins or funcs if the prefix is specified as the
>   | preceding "dir" in PATH?
> 
> I don't really want to document %builtins, so everyone forget you
> ever read this, but the idea is that if that is specified as a suffix
> of an entry in PATH, and the PATH search reaches that entry, then a
> built-in command will be found and executed (if there is no %builtins
> entry in PATH, then one is assumed right at the start, which means
> built-ins are always executed if named ... that's what almost everyone
> simply assumes will happen).   By explicitly sticking %builtins
> elsewhere, it is possible for a user to override a builtin with a
> file-system command located earlier in the PATH.
> 
> That's a dumb way to do it though, much better is simply to supply
> a function like:
> 
> 	echo()
> 	{
> 		/path/to/the/echo/I/like "$@"
> 	}
> 
> instead - and the usefulness of that is one of the reasons that the
> NetBSD shell always reads the $ENV file (even in non-interactive shells).
> This way you can selectively override built-ins (except the special ones
> that you really don't want to override) with whatever versions you prefer.
> 
> The %func thing is entirely different - if a search for a command reaches
> that directory (the one with %func as a suffix - just in PATH, not in the
> directory name) without having yet located the command (or we would not
> have gotten that far) and there is a file in the directory with the same
> name as the command being sought (I think this one needs 'r' permission,
> and not necessarily 'x', but I haven't checked, so might be wrong - 'r'
> is needed for sure though) then the shell will read that file, as if with
> the '.' command.   If after that has happened, there is now a function with
> the name of the command to be executed defined (clearly there wasn't before,
> or the function would have already been executed, without any PATH search)
> then the shell will execute the function (and search PATH no more).  The
> newly defined function (and anything else that running the script that was
> found happens to accomplish - normally just defining other functions as
> well) remains in the shell to be used again later if needed, with no PATH
> search involved.
> 
> The idea is that you make a file containing functions you sometimes use,
> place that file in some directory, say ~/myfuncs and link it to the name
> of every function it defines (the directory can have other groups of
> unrelated functions) and then you put ~/myfuncs%func as an entry in PATH
> (usually it would go fairly early in PATH - but that depends upon what
> you're attempting to achieve - perhaps last if the intent is to provide
> fallback versions of commands in case the system that you're using happens
> not to have them installed) - then when you happen to need to use one of
> those functions (you're doing something which needs one) then that function
> gets defined "by magic" along with any other related functions you're likely
> to use if you're using any of them.  On the other hand, if you never need
> these functions in a shell, then they never get loaded, and so save a little
> memory in the shell, and a tiny bit of command search time.
> 

#TL

Yes, that part is definitely not useless. I for one use sh(1) and
POSIX utilities a lot and have written shell "libraries" to share code,
which are sourced with '.' in the scripts. But this feature (in my case,
simply appending a flagged directory for the functions to PATH) is IMO
smart: one can organize a shell library of functions the way we
organize C code, splitting related functions into distinct chunks, and
symlinking every function name to the related file.

Then, it is "copy on use", with the added benefit that the first
function used from a related chunk brings in the others. (Does changing
the definition of PATH clear these from the hash table?)

For an interactive shell, this doesn't add a lot. But when invoking
scripts in loops, not having to parse unused functions is a bonus.
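
The layout described above (one file per group of related functions,
one link per function name) can be sketched as follows. The helper name
make_funcdir, the file name strcase and the two functions are made up
for illustration; only the "dir%func" PATH entry comes from the
discussion, and it only has an effect in a shell that supports it.

```sh
# make_funcdir DIR: create DIR holding one chunk of related functions,
# with a link to the chunk for every function name it defines.
make_funcdir() {
	_dir=$1
	mkdir -p "$_dir" || return
	cat > "$_dir/strcase" <<'EOF'
upper() { printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'; }
lower() { printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'; }
EOF
	for _f in upper lower; do
		ln -sf strcase "$_dir/$_f" || return
	done
}

# In a shell with %func support, the flagged directory is then simply
# appended to PATH, for example:
#	make_funcdir "$HOME/myfuncs"
#	PATH=$PATH:$HOME/myfuncs%func
# In other shells the first use can be imitated by hand:
#	. "$HOME/myfuncs/upper"
```

Sourcing the file through any one of its links defines every function
in the chunk, which is the "first function brings the others" effect.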

>   | Could somebody explain this in an "international" english, that is
>   | something a not english native speaker with an average english
>   | vocabulary could parse?
> 
> I don't know, does the above count?
> 

#TL

Yes. But see my proposition above.

>   | [*]: The reason why I looked at the spec is that, under Plan9, there is
>   | a feature that I find quite neat and consistent: utilities can be
>   | organized in subfolders and one can invoke from the shell (rc(1)) an
>   | utility like this: "ip/ipconfig ...". This organized the utilities in
>   | groups, instead of putting everything flat in a directory.
> 
> Yes, some people like that, and there are one or two shells which allow
> it I think (not typical POSIX type shells) - you can accomplish that, more
> or less, by just adding all those directories to PATH, and then you get to
> avoid typing the "ip/" part of the name.
> 
>   | I thus wanted to see how I could add this (it is not POSIX compliant)
> 
> No, it isn't, POSIX requires that any command with a '/' in its name
> be simply executed (from the filesystem) using the name given, without
> any other processing (of the command name).
> 
>   | by setting an option, without disturbing much the POSIX behavior
>   | or introducing security problems that the POSIX spec had tried to
>   | address...
> 
> It isn't really a security issue I think - just isn't the way that
> shells have ever worked (way back to the Thompson shell) - either
> there's a path search (in that shell the directories to examine, and
> their order, was built into the shell, no way for users to alter it)
> for simple one-segment names (no '/') and others are simply exec'd.
> That's very hard to change now (in general, an option could allow it
> though) as it is so ingrained in how people work.
> 

#TL

I think it could be added without much ado, but only by setting an
option (since it would break scripts doing: "cd $SOMEDIR; bar/foo
..." and expecting "bar/foo" to be exec'ed in $SOMEDIR and not
searched for) and by restricting the feature to pathnames containing at
least one '/' and starting with an alphanumeric character (hence,
./bar/foo will not be searched for).
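
A minimal sketch of that lookup, written as a shell function rather
than as a change to sh(1) itself. The name find_grouped is hypothetical,
and the function only prints the pathname that would be exec'ed; it is
an illustration of the restriction, not a proposed implementation.

```sh
# find_grouped NAME [PATHLIST]: print the pathname to exec for NAME.
find_grouped() {
	_name=$1
	_path=${2-$PATH}
	case $_name in
	[A-Za-z0-9]*/*)
		;;	# has a '/' and starts with an alphanumeric: search
	*)
		# ./bar/foo, /bar/foo, or no '/' at all: left untouched
		printf '%s\n' "$_name"
		return 0
		;;
	esac
	# Split the path list on ':' without pathname expansion.
	_save_ifs=$IFS
	IFS=:
	set -f
	set -- $_path
	set +f
	IFS=$_save_ifs
	for _dir in "$@"; do
		[ -n "$_dir" ] || _dir=.	# empty entry means current directory
		if [ -f "$_dir/$_name" ] && [ -x "$_dir/$_name" ]; then
			printf '%s\n' "$_dir/$_name"
			return 0
		fi
	done
	return 127
}
```

For example, with /usr/local/libexec in PATH and an executable
/usr/local/libexec/ip/ipconfig (a made-up location), "find_grouped
ip/ipconfig" would print that full pathname, while "find_grouped
./ip/ipconfig" would print its argument unchanged.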

KerGIS, my fork of public domain GRASS, has hundreds of programs
related to raster, vector and so on, and the utilities are r.bar.foo
(for raster) and v.bar.foo (for vector), that is the dot has been used
(by the original developers of GRASS) to flatten what is naturally
organized.

And with this example, you can see that adding the subdirectories to
PATH will not do, since there would be namespace pollution, and the
utility used would depend on whether the raster dir is given before
the vector dir, and so on, in the PATH definition.


Thanks for the explanations!
-- 
        Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C