tech-userlevel archive
Re: shell (/bin/sh) pattern matching bugs
Well, discard my suggestion: you have already answered it. The problem
is that a quoted "*" is already used to mean a literal '*', so my point
of view is incompatible with the existing standard.
It's a mess... Wouldn't it be simpler for POSIX to leave case...esac
as is and introduce an ecase...esac[e] (à la grep, egrep) with something
that makes more sense for corner cases?
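
For reference, the existing behaviour that gets in the way can be seen
in a minimal sketch, assuming a POSIX-conforming sh. The function name
match_star is hypothetical and is used only for illustration:

```shell
# match_star is a hypothetical helper, not something from the standard.
match_star() {
    case $1 in
        ("*") echo "literal asterisk" ;;  # quoted: matches only the one-character string *
        (*)   echo "anything else" ;;     # unquoted: matches any string
    esac
}

match_star x
match_star '*'
```

The first call takes the second branch and the second call takes the
first, which is why plain quote removal on case patterns cannot be the
whole story.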
On Sun, Jun 24, 2018 at 03:44:00PM +0200, tlaronde%polynum.com@localhost wrote:
> [For reader, please refer to Robert Elz' whole enlightening answer. I
> edit it]
> On Sun, Jun 24, 2018 at 07:49:25PM +0700, Robert Elz wrote:
> > | - [Suppression of the double quotes?
> >
> > This is, of course, the heart of the matter...
> >
> > In POSIX, quote removal is explicitly not done on case
> > patterns; that is, the expansions that are done are listed,
> > and quote removal is not one of them.
> >
> > So...
> >
> > | But this doesn't change anything in
> > | the bracket expression];
> >
> > It would, as, assuming the current literal text, an input string
> > which was a double quote (as in '"' or \") would match, as the
> > double quote character would appear in the [ ] expression
> > in the pattern.
> >
> > Of course that is clearly absurd, and a bug report on the POSIX
> > text was submitted a while ago to include quote removal in the
> > list of operations to perform on case patterns.
> >
> > Unfortunately, it isn't that simple, as just doing quote
> > removal on patterns would cause
> >
> > case x in ("*") echo match;; esac
> >
> > to match as the quote removal would leave the
> > pattern being just an asterisk, which matches anything,
> > which is not what is supposed to happen.
> >
> > So the current proposed new text (which had been
> > accepted, but now is being discussed again, and will
> > be changed) also specified that along with quote removal,
> > any "pattern magic" characters in the quoted part of the
> > pattern would be \ escaped so they remained literal,
> > so "quote removal" of the "*" would produce \* not *
> > and so the pattern matching would look for a literal
> > asterisk rather than anything - which is what is wanted.
>
> Thanks for the explanations!
>
> FWIW, as a POSIX shell user, I would expect something more intuitive
> than what is proposed (if I understand correctly):
>
> a) In all contexts, including the case patterns, substitutions including
> quote removal are done;
>
> b) _After that_, the patterns are interpreted according to their own
> rules; if escaped double quotes are still there, what they enclose is
> taken as a string of literals.
>
> That is:
>
> var="[:alpha:]"
>
> (["$var"]) would lead after a) to ([[:alpha:]]) and then '[' would not
> match
>
> while
>
> ([\"$var\"]) would lead after a) to (["[:alpha:]"]) and then,
> '"[:alpha:]"' being interpreted as a string of literals, '[' would
> match.
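
To compare the proposal with what shells do today, here is a small
sketch. It assumes a shell that treats a quoted expansion inside a
bracket expression as literal set members, which is how I understand
current bash and dash to behave; this corner has varied between
implementations, so take it as an illustration only. The function names
unquoted and quoted are hypothetical:

```shell
var='[:alpha:]'

# Unquoted expansion: the pattern becomes [[:alpha:]], a character class.
unquoted() {
    case $1 in
        ([$var]) echo match ;;
        (*)      echo no ;;
    esac
}

# Quoted expansion: the characters of $var are literal members of the set,
# so the set contains '[', ':', 'a', 'l', 'p', 'h' and ']'.
quoted() {
    case $1 in
        (["$var"]) echo match ;;
        (*)        echo no ;;
    esac
}

unquoted a      # a letter, tested against the class
unquoted '['    # not a letter
quoted '['      # a literal member of the set
quoted z        # neither a member nor, here, treated as a class
```

Under the proposal in a) and b), the quoted form would behave like the
unquoted one, which is the difference being discussed.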
>
> I think that POSIX shell users like me are used to the escaping dance
> when they feed sed(1) in a shell with a (non-shell) regular expression,
> so it seems to me that this should be reasonably backward compatible
> and be the least surprising case.
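
As an illustration of that escaping dance, the sketch below hands sed(1)
the same regular expression (one matching a literal backslash) through
the two common shell quoting styles. The function names are hypothetical:

```shell
# Single quotes: the shell passes the text through untouched,
# so sed receives s/\\/X/ and matches one literal backslash.
subst_single() {
    printf '%s\n' "$1" | sed 's/\\/X/'
}

# Double quotes: the shell first reduces each \\ to \,
# so four backslashes are needed for sed to receive the same s/\\/X/.
subst_double() {
    printf '%s\n' "$1" | sed "s/\\\\/X/"
}

subst_single 'a\b'
subst_double 'a\b'
```

Both calls print aXb; only the number of backslashes typed differs.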
>
> Just my 2 cents.
>
> Best regards.
>
> PS: I don't know if you have already modified the sh(1) man page (I'm on
> 7.1.1, not on current), but I think that the case grammar should say that
> the (pattern) form is valid, the first '(' being optional. Since in all
> the examples, and in the man page, there is always "pattern)", the
> (pattern) form can be surprising, "(...)" being used in some shells for
> lists or arrays.
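
A short sketch of the two spellings side by side, assuming a
POSIX-conforming sh; the function name classify is hypothetical:

```shell
classify() {
    case $1 in
        (yes) echo "with leading paren" ;;     # optional '(' present
        no)   echo "without leading paren" ;;  # traditional "pattern)" form
        (*)   echo "default" ;;
    esac
}

classify yes
classify no
classify maybe
```

Both forms may be mixed freely within one case statement.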
--
Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
http://www.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C