Subject: sed documentation
To: None <current-users@NetBSD.ORG, rhialto@mbfys.kun.nl>
From: Olaf Seibert <rhialto@mbfys.kun.nl>
List: current-users
Date: 10/16/1995 14:18:19
I just discovered the following documentation "gotcha" regarding
sed. (It is in 1.0, but I can't easily check whether it is still in -current.)
For the description of regular expressions, the sed(1) man page refers
to regex(3), which in turn refers to re_format(7):
sed(1):
Sed Regular Expressions
The sed regular expressions are basic regular expressions (BRE's, see
regex(3) for more information). In addition, sed has the following two
additions to BRE's:
regex(3):
DESCRIPTION
These routines implement POSIX 1003.2 regular expressions
(``RE''s); see re_format(7). Regcomp compiles an RE writ-
re_format(7):
A piece is an atom possibly followed by a single `*',
`+', `?', or bound. An atom followed by `*' matches a
sequence of 0 or more matches of the atom. An atom
followed by `+' matches a sequence of 1 or more matches of
the atom. An atom followed by `?' matches a sequence of 0
or 1 matches of the atom.
[...]
An atom is [...], or a single character with no other
significance (matching that character). [...]
Here we get to my problem: sed does not seem to support the '?' suffix,
which the earlier Unixes I used always did.
I tested this with
echo "hallo" | sed 's/al?/XX/'
which should of course print "hXXlo" but instead it prints "hallo".
I can get it to behave properly by using \{0,1\} instead (not \?).
Nex/nvi behave the same. grep wants \? or \{0,1\}; egrep works with ?
and no other variation.
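The tests described above can be collected into a small script for
anyone who wants to repeat them on another system. This is only a
sketch: the helper names try_sed and try_grep are made up for
illustration, it assumes a POSIX sh with sed and grep on the PATH,
and it uses "grep -E" as a stand-in for egrep. The \? form is left
out because it is not available everywhere.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).

# Run one sed script against the test word "hallo" and print the result.
try_sed() {
    printf '%s\n' hallo | sed "$1"
}

# Print "match" or "no match" depending on whether grep, called with
# the given arguments, finds the pattern in the test word "hallo".
try_grep() {
    if printf '%s\n' hallo | grep "$@" >/dev/null; then
        echo match
    else
        echo "no match"
    fi
}

# sed with a bare '?' (the case reported above as printing "hallo")
printf '%s -> %s\n' "sed  s/al?/XX/"        "$(try_sed 's/al?/XX/')"
# sed with an explicit bound (reported above as working)
printf '%s -> %s\n' "sed  s/al\{0,1\}/XX/"  "$(try_sed 's/al\{0,1\}/XX/')"
# grep with a bare '?', with a bound, and in egrep mode
printf '%s -> %s\n' "grep al?"              "$(try_grep 'al?')"
printf '%s -> %s\n' "grep al\{0,1\}"        "$(try_grep 'al\{0,1\}')"
printf '%s -> %s\n' "grep -E al?"           "$(try_grep -E 'al?')"
```

Each line of output pairs the command with what it produced, so the
results from sed, grep and egrep can be compared side by side.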
A weird thing is that grep also claims to implement basic regular
expressions, but apparently does it differently. I also think that
grep is incorrect in requiring \? in a basic r.e.
Am I going mad or is sed (and grep) indeed incorrect?
-Olaf.
--
___ Copyright 1995 Olaf 'Rhialto' Seibert. All Rights Reserved.
\X/ You are not allowed to read this using any kind of Micro$oft product.