tech-userlevel archive
sed(1) and LC_CTYPE
If I set LC_CTYPE like this:
$ export LC_CTYPE=fr_FR.ISO8859-15
and then:
$ echo "éé" | sed 's/é/\é/g'
sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
Where does the program manage to find a backslash (0134), when
'é' is 0351?
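One way to check which bytes sed is actually handed is to dump the script in octal. The following is only a sketch: dump_octal and e_latin9 are names made up for illustration, and the 'é' is written as an octal escape so that the check does not depend on how the terminal or this message is encoded.

```shell
# Print a string as octal bytes (hypothetical helper, not part of sed).
dump_octal() {
	printf '%s' "$1" | od -An -to1
}

# 'é' as the single ISO8859-15 byte 0351, built from an octal escape.
e_latin9=$(printf '\351')

# The sed script from above, byte by byte: s / é / \ é / g
dump_octal "s/${e_latin9}/\\${e_latin9}/g"
```

With a genuine ISO8859-15 'é', the dump should contain a single 351 on each side of the 134. If the terminal were in fact sending UTF-8, each 'é' would show up as the two bytes 303 251 instead.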
Since, to my knowledge, we do not support anything via iconv or
the like, shouldn't we simply assume a string of bytes, à la C,
that is:
diff --git a/usr.bin/sed/main.c b/usr.bin/sed/main.c
index d87bce2a5c85..c6b69a83cd57 100644
--- a/usr.bin/sed/main.c
+++ b/usr.bin/sed/main.c
@@ -136,7 +136,7 @@ main(int argc, char *argv[])
 	char *temp_arg;
 	setprogname(argv[0]);
-	(void) setlocale(LC_ALL, "");
+	(void) setlocale(LC_ALL, "POSIX");
 	fflag = 0;
 	inplace = NULL;
? With such a change, the result is:
$ echo "éé" | ./sed 's/é/\é/g'
éé
and this is what I expected.
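The same comparison can probably be made without patching anything, since LC_ALL in the environment overrides LC_CTYPE and setlocale(LC_ALL, "") should then select the C locale. A minimal sketch, assuming the installed sed copies an escaped ordinary byte in the replacement literally (e_latin9 and result are illustrative names):

```shell
# 'é' as the single ISO8859-15 byte 0351, built from an octal escape.
e_latin9=$(printf '\351')

# LC_ALL=C overrides LC_CTYPE for this one sed invocation only;
# od shows the resulting bytes regardless of the terminal's encoding.
result=$(printf '%s%s\n' "$e_latin9" "$e_latin9" |
	LC_ALL=C sed "s/${e_latin9}/\\${e_latin9}/g" |
	od -An -to1)
echo "$result"
```

If the output shows the two 'é' bytes followed by the newline, the unpatched binary behaves like the patched one as soon as the environment stops selecting fr_FR.ISO8859-15.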
What is the rationale for taking the locale from the environment when
all the code in the source expects ASCII to begin with (for commands,
ranges and so on)?
What am I doing wrong?
--
Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C