tech-userlevel archive
Re: shell prefix/suffix removal with quoted word
On Fri, 27 Jul 2018 21:49:03 +0700
Robert Elz <kre%munnari.OZ.AU@localhost> wrote:
> Date: Fri, 27 Jul 2018 13:39:18 +0200
> From: Edgar Fuß <ef%math.uni-bonn.de@localhost>
> Message-ID: <20180727113917.GD48007%trav.math.uni-bonn.de@localhost>
>
> | It has been brought to my attention that quoting the "word" in sh's
> | substring processing causes word to be matched literally rather than
> | being treated as a pattern.
>
> Yes. Or rather, more accurately, it is still treated as a pattern,
> but one with no meta-characters (everything is a literal) - just as
> in regular expressions (which shell patterns are not) the R.E.
> name
> is a perfectly valid RE ("grep name file..." works) - just a kind of
> boring one.
>
> | x="abc"
> | y="?"
> | echo "${x#"$y"}"
> | outputs "abc", while
> | x="abc"
> | y="?"
> | echo "${x#$y}"
> | outputs "bc".
>
> Yes. But those are the simple cases. The recent pattern
> changes to sh deal with what happens when
> y='\?'
> when the same rule you expressed applies, but in a much
> messier context. This is where sh used to not perform
> very well (and in HEAD is now, I think, much better) and
> many (perhaps most) other shells also have "issues".
>
> | I can't see this behaviour specified by SUS
>
> There is a work item to improve the way that pattern
> matching is specified in the next edition of the posix
> spec (some of what is there now is wrong, worse than
> just missing). How effective this will turn out to be is
> yet to be seen (there are people who prefer to "fix" things
> in a way that requires minimal changes, rather than just
> ripping out what is there and replacing it with something
> better, which is what this really needs.)
I have SUSv4 TC2 at hand. The behaviour is specified there, though the wording could use clarification. See the end of the specification text in section 2.6.2 (Parameter Expansion), just before the informative text:
Enclosing the full parameter expansion string in double-
quotes shall not cause the following four varieties of
pattern characters to be quoted, whereas _quoting_
_characters_within_the_braces_shall_have_this_effect_.
. . .
${parameter%[word]}
Remove Smallest Suffix Pattern. The word shall be
expanded to produce a pattern. The parameter expansion
shall then result in parameter, with the smallest portion
of the suffix matched by the pattern deleted. If present,
word shall not begin with an unquoted '%'.
${parameter%%[word]}
Remove Largest Suffix Pattern. The word shall be expanded
to produce a pattern. The parameter expansion shall then
result in parameter, with the largest portion of the suffix
matched by the pattern deleted.
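As a rough illustration of the "smallest" versus "largest" suffix wording quoted above, here is a minimal sketch. The variable names (f, smallest, largest) and the sample file name are only assumptions chosen for the example; they do not come from the spec.

```shell
#!/bin/sh
# Hypothetical sample value, chosen only for illustration.
f="archive.tar.gz"

# % removes the smallest suffix matching the pattern .*
smallest=${f%.*}

# %% removes the largest suffix matching the same pattern
largest=${f%%.*}

printf '%s\n' "$smallest"   # archive.tar
printf '%s\n' "$largest"    # archive
```

The pattern is unquoted in both cases, so the asterisk stays an active pattern character.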
And at the end of the informative matter:
The double-quoting of patterns is different depending on where
the double-quotes are placed:
"${x#*}"
The <asterisk> is a pattern character.
${x#"*"}
The literal <asterisk> is quoted and not special.
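The cases discussed in this thread can be tried with a short script along these lines. It is only a sketch: x and y are taken from the original example, while z, w and the result variable names are assumptions added for illustration. It stays with the simple cases and does not touch the y='\?' case, where shells are reported to differ.

```shell
#!/bin/sh
x="abc"
y="?"

# Unquoted: the ? produced by expanding $y is an active pattern
# character and matches the leading "a".
unquoted=${x#$y}

# Quoted inside the braces: ? is literal, and "abc" does not
# begin with a literal "?", so nothing is removed.
quoted=${x#"$y"}

# A value that does begin with a literal "?" (hypothetical z).
z="?bc"
literal=${z#"$y"}

# Double quotes around the whole expansion do not quote the pattern:
# * stays special, and the largest prefix it matches is everything.
star_active="${x##*}"

# Quotes inside the braces make * literal; "abc" has no leading "*".
star_quoted=${x##"*"}

# A value that begins with a literal "*" (hypothetical w).
w="*bc"
star_literal=${w#"*"}

printf '%s\n' "$unquoted"      # bc
printf '%s\n' "$quoted"        # abc
printf '%s\n' "$literal"       # bc
printf '[%s]\n' "$star_active" # []
printf '%s\n' "$star_quoted"   # abc
printf '%s\n' "$star_literal"  # bc
```

The ## form is used for the asterisk cases because with a single # the smallest match of an unquoted * is the empty string, which would hide the difference.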
> | nor mentioned in sh(1).
>
> I have uncommitted changes to the pattern section of
> sh(1) which I hope will eventually improve things there.
>
> I am not yet really happy with the new wording though,
> so they remain uncommitted (as in previous episodes,
> I prefer writing C to English...)
>
> | bash and ksh seem to behave the same.
>
> Yes, this has never really been in doubt, it has been that
> way ever since ksh added the # and % operators -- and
> changed the quoting rules inside var expansions to the
> rational form that you showed above from the irrational
> that was, probably from PDP-11 space limitations, in the
> original Bourne shell, and remains to this day, for all the
> other forms ( "${var-"word"}" means, as far as quoting is
> concerned, something totally different than "${var#"word"}" )
>
> kre
>
--
roarde