NetBSD-Bugs archive
bin/57544: sed(1) and regex(3) problem with encoding
>Number: 57544
>Category: bin
>Synopsis: sed(1) and regex(3) problem with encoding
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jul 26 17:05:00 +0000 2023
>Originator: Thierry LARONDE
>Release: NetBSD 10.0_BETA
>Organization:
>Environment:
NetBSD cauchy.polynum.local 10.0_BETA NetBSD 10.0_BETA (cauchy) #0: Mon Feb 27 11:28:34 CET 2023 tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-9.3-amd64_netbsd-amd64/netbsd/obj/sys/arch/amd64/compile/cauchy amd64
>Description:
$ export LC_CTYPE=fr_FR.ISO8859-15
and then:
$ echo "éé" | sed 's/é/\é/g'
sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
$ export LC_CTYPE=POSIX.ISO8859-15 # incorrect setting but...
$ echo "éé" | sed 's/é/\é/g'
éé
According to a test by Martin HUSEMANN, the problem occurs on
architectures where plain char is signed. (On Apple POWERMAC_G5.MP,
sed behaves as expected.)
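The report does not identify the root cause, so the following is only a
sketch of what char signedness means for the byte involved. It assumes that
"é" is encoded as the single byte 0xE9, as it is in ISO8859-15. The function
names are hypothetical and are not taken from sed(1) or regex(3).

```c
#include <limits.h>

/*
 * Widen a byte the way a signed plain char would be widened to int.
 * Converting 0xE9 to signed char is implementation-defined in C, but on
 * two's-complement machines it yields a negative value.
 */
int
widen_as_signed(unsigned char byte)
{
	return (int)(signed char)byte;
}

/* Widen the same byte the way an unsigned plain char would be widened. */
int
widen_as_unsigned(unsigned char byte)
{
	return (int)byte;
}

/* Nonzero when plain char is signed on the platform compiling this file. */
int
plain_char_is_signed(void)
{
	return CHAR_MIN < 0;
}
```

Code that uses such a widened value as an array index or as an argument to
the <ctype.h> functions without first converting it to unsigned char can
misbehave only where plain char is signed. Whether that is what happens here
is an assumption, not something the report confirms.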
Note: this is a regression from 9.3. It can be masked, though not solved,
by:
- (void) setlocale(LC_ALL, "");
+ (void) setlocale(LC_ALL, "POSIX");
probably in every text utility using regex(3).
>How-To-Repeat:
$ export LC_CTYPE=fr_FR.ISO8859-15
$ echo "éé" | sed 's/é/\é/g'
sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
(On architectures where plain char is signed, such as amd64.)
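To check whether regex(3) alone shows the problem, without sed(1), a small
helper along these lines could be used. This is a sketch under assumptions:
the name try_regcomp and its return convention are hypothetical, and the
report does not state that regcomp(3) is the call that fails.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Select the given LC_CTYPE locale, then compile pattern as a basic
 * regular expression (flags 0, the default syntax used by sed).
 * Returns 0 if the pattern compiled, 1 if regcomp(3) failed (the message
 * is copied into errbuf when errlen > 0), and 2 if the locale is not
 * available on this system.
 */
int
try_regcomp(const char *locale, const char *pattern, char *errbuf,
    size_t errlen)
{
	regex_t re;
	int rc;

	if (errlen > 0)
		errbuf[0] = '\0';
	if (setlocale(LC_CTYPE, locale) == NULL)
		return 2;
	rc = regcomp(&re, pattern, 0);
	if (rc != 0) {
		(void)regerror(rc, &re, errbuf, errlen);
		return 1;
	}
	regfree(&re);
	return 0;
}
```

Calling try_regcomp("fr_FR.ISO8859-15", "\xe9", buf, sizeof(buf)) passes the
ISO8859-15 byte for "é" regardless of the encoding of the source file. The
same call with the "C" locale gives a point of comparison.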
>Fix:
No fix: the underlying problem remains. Workaround:
- (void) setlocale(LC_ALL, "");
+ (void) setlocale(LC_ALL, "POSIX");