tech-userlevel archive
Re: vi vs. nvi
>>>>> On Mon, 11 Aug 2008 01:29:40 +0200,
Lubomir Sedlacik <salo%Xtrmntr.org@localhost> said:
>> According to the source code (*3), it seems OpenSolaris doesn't use
>> strcoll(3)/wcscoll(3), and always compares character code values,
>> although I may be missing something.
> Here goes:
>
> OS                        LANG           CODESET(*1) result of regexec(3)
> ------------------------- -------------- ----------- --------------------
> Solaris 7                 en_US          ISO8859-1   not match
> Solaris 7                 en_US.UTF-8    UTF-8       match
> Solaris 8                 en_US          ISO8859-1   not match
> Solaris 8                 en_US.UTF-8    UTF-8       match
> Solaris 9                 en_US          ISO8859-1   not match
> Solaris 9                 en_US.UTF-8    UTF-8       match
> Solaris 10 FCS            en_US          ISO8859-1   not match
> Solaris 10 FCS            en_US.UTF-8    UTF-8       not match
> Solaris 10 Update 6 (*)   en_US          ISO8859-1   match
> Solaris 10 Update 6 (*)   en_US.UTF-8    UTF-8       match
> Solaris Nevada b91        en_US          ISO8859-1   match
> Solaris Nevada b91        en_US.UTF-8    UTF-8       match
> OpenSolaris 2008.05 + b94 en_US          ISO8859-1   match
> OpenSolaris 2008.05 + b94 en_US.UTF-8    UTF-8       match
>
> (*) Not sure when exactly between FCS and U6 this changed. I could
> track it down to a patch number later if you want to know.
Hmm, thanks.
So I must be missing something: newer Solaris releases (including
OpenSolaris) always use collation order for range expressions, even
with Latin-1. ;-/
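
The test program behind the table above is not shown in this message. A minimal sketch of such a probe might look like the following. It assumes the test compiles a range expression such as [a-z] and checks whether a given character matches under a given locale; the helper name range_matches and the return convention are hypothetical.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `text` matches the POSIX basic
 * regular expression `pattern` under locale `locale_name`, 0 if it
 * does not match, and -1 if the locale or the pattern is rejected.
 */
static int
range_matches(const char *locale_name, const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	/* Select the locale before compiling the pattern. */
	if (setlocale(LC_ALL, locale_name) == NULL)
		return -1;	/* locale not available on this system */

	/* regcomp(3) may reject a range that is invalid in this locale. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return -1;

	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);

	if (rc == 0)
		return 1;
	return rc == REG_NOMATCH ? 0 : -1;
}
```

Calling, say, range_matches("en_US.UTF-8", "^[a-z]$", "B") on different systems would show whether the implementation orders the range by collation sequence or by character code. The result in non-POSIX locales depends on the libc, which is the point of the table.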
BTW, it seems the following "unspecified behavior" was introduced
in SUSv3:
>>>>> On Fri, Aug 08, 2008 at 09:10:30PM +0900, SODA Noriyuki said:
> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05
> 9.3.5 RE Bracket Expression
> In the POSIX locale, a range expression represents the set of
> collating elements that fall between two elements in the collation
> sequence, inclusive. In other locales, a range expression has
> unspecified behavior: strictly conforming applications shall not rely
> on whether the range expression is valid, or on the set of collating
> elements matched.
SUSv2 required collation order to be used, without exception:
http://www.opengroup.org/onlinepubs/007908799/xbd/re.html
    A range expression represents the set of collating elements that
    fall between two elements in the current collation sequence,
    inclusively. It is expressed as the starting point and the ending
    point separated by a hyphen (-).

    Range expressions must not be used in portable applications
    because their behaviour is dependent on the collating
    sequence. Ranges will be treated according to the current
    collating sequence, and include such characters that fall within
    the range based on that collating sequence, regardless of
    character values. This, however, means that the interpretation
    will differ depending on collating sequence.
And maybe this change between SUSv2 and SUSv3 was made for compatibility
with Linux, because there were the following technical reports about
conflicts between SUS and the Linux Standard Base:
http://www.opengroup.org/personal/ajosey/tr28-07-2003.txt
http://www.opengroup.org/personal/ajosey/tr11-11-2005.txt
    Range expression (such as [a-z]) can be based on code point order
    instead of collating element order.
And the LSB contains the same specification:
http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic.html
    19.2. Regular Expressions
    Range expression (such as [a-z]) can be based on code point order
    instead of collating element order.
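
To make the two orderings concrete, here is a rough sketch of how membership in a range [lo-hi] could be decided each way. It is only an illustration under simplifying assumptions (single-byte characters, no multi-character collating elements or equivalence classes), not how any particular libc implements regcomp(3); the function names are hypothetical.

```c
#include <locale.h>
#include <string.h>

/* Code point order: compare the raw character values. */
static int
in_range_codepoint(unsigned char lo, unsigned char c, unsigned char hi)
{
	return lo <= c && c <= hi;
}

/*
 * Collation order: compare one-character strings with strcoll(3),
 * which follows LC_COLLATE of the current locale.
 * Assumes single-byte characters only.
 */
static int
in_range_collation(char lo, char c, char hi)
{
	char a[2] = { lo, '\0' };
	char x[2] = { c, '\0' };
	char b[2] = { hi, '\0' };

	return strcoll(a, x) <= 0 && strcoll(x, b) <= 0;
}
```

In the POSIX locale both functions agree for ASCII input. In a locale whose collation sequence interleaves upper and lower case, in_range_collation('a', 'B', 'z') may be true while in_range_codepoint('a', 'B', 'z') is always false.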
>>>>> On Mon, 11 Aug 2008 07:55:13 +1000,
Daniel Carosone <dan%geek.com.au@localhost> said:
> If so, then please, please, please let's not do that.
I interpret your request as "Please make NetBSD behave like Linux
instead of Solaris". :-)
And I think that's certainly better, at least at first.
--
soda