Subject: lib/33090: patch for new locales: ru_BY.CP1251, ru_RU.CP1251 and be_BY.CP1251
To: None <lib-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Aleksey Cheusov <cheusov@tut.by>
List: netbsd-bugs
Date: 03/16/2006 00:45:01
>Number: 33090
>Category: lib
>Synopsis: patch for new locales: ru_BY.CP1251, ru_RU.CP1251 and be_BY.CP1251
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: change-request
>Submitter-Id: net
>Arrival-Date: Thu Mar 16 00:45:00 +0000 2006
>Originator: Aleksey Cheusov <cheusov@tut.by>
>Release: NetBSD 3.0_STABLE
>Organization:
>Environment:
<The following information is extracted from your kernel. Please>
<append output of "ldd", "ident" where relevant (multiple lines).>
System: NetBSD chen 3.0_STABLE NetBSD 3.0_STABLE (GENERIC) #2: Sun Mar 12 12:49:58 GMT 2006 cheusov@chen:/usr/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
>How-To-Repeat:
>Fix:
Hi.
Could somebody apply a small patch to libc
that adds a few new locales for the Russian and Belarusian
languages?
I hope my patch doesn't break anything.
I have never touched libc (or NetBSD) before.
I tested these locales using ERE character classes
and the tolower/toupper functions of awk.
Everything works correctly.
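As an independent sanity check (separate from the awk test mentioned above), the case pairs in the patch can be compared against the CP1251 table that ships with Python. This is only a minimal sketch, assuming Python 3 and its built-in "cp1251" codec. The helper names lower_byte and upper_byte are hypothetical, and the script does not exercise the NetBSD locale files themselves.

```python
# Sketch only: compares byte-level case pairs with Python's cp1251 codec.
# It does not load or test the NetBSD locale database.

def lower_byte(b):
    """Lowercase one CP1251 byte by going through Unicode and back."""
    return bytes([b]).decode("cp1251").lower().encode("cp1251")[0]

def upper_byte(b):
    """Uppercase one CP1251 byte by going through Unicode and back."""
    return bytes([b]).decode("cp1251").upper().encode("cp1251")[0]

# (upper, lower) pairs copied from the MAPLOWER/MAPUPPER entries in the patch.
PAIRS = [(0xA1, 0xA2), (0xB2, 0xB3), (0xA8, 0xB8)]   # U-short, I, Yo
PAIRS += [(u, u + 0x20) for u in range(0xC0, 0xE0)]  # main Cyrillic block

# Collect every pair on which the codec and the patch disagree.
mismatches = [(u, l) for u, l in PAIRS
              if lower_byte(u) != l or upper_byte(l) != u]
print("mismatches:", mismatches)
```

An empty list means the codec and the patch agree on these pairs. Byte 0x98 is left out on purpose: it is unassigned in CP1251 and Python refuses to decode it.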
The patch is relative to $SRC/share/locale.
Actually, I just copied part of the bg_BG.CP1251 file
into a new charset/CP1251 file and "#include" it
from the ctype/ files of the new locales.
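The shared charset/CP1251 file consists mostly of mapping entries in two forms: a single pair such as <0x80 0x90>, and a range such as <0xc0 - 0xdf : 0xe0>. As far as mklocale(1) describes it, the range form maps each rune in the interval to the target rune shifted by the same offset. The following is a rough sketch of that reading, not the mklocale parser; it assumes Python 3, and the function names parse_rune and expand are hypothetical.

```python
# Sketch only: expands the two mapping forms used in charset/CP1251.
# This illustrates the syntax; it is not how mklocale itself parses files.

def parse_rune(tok):
    """Parse a rune written as 0xNN, a decimal number, or 'c'."""
    if len(tok) == 3 and tok[0] == "'" and tok[2] == "'":
        return ord(tok[1])
    return int(tok, 0)

def expand(entry):
    """Expand '<a b>' or '<a - b : c>' into a {source: target} dict."""
    toks = entry.strip().strip("<>").split()
    if len(toks) == 2:
        # Single pair: rune a maps to rune b.
        return {parse_rune(toks[0]): parse_rune(toks[1])}
    if len(toks) == 5 and toks[1] == "-" and toks[3] == ":":
        # Range: a..b map to c, c+1, ... with the same offset.
        first, last, base = (parse_rune(toks[i]) for i in (0, 2, 4))
        return {r: base + (r - first) for r in range(first, last + 1)}
    raise ValueError("unsupported entry: %r" % entry)

table = {}
for line in ("<0x80 0x90>", "<0xc0 - 0xdf : 0xe0>", "<'A' - 'Z' : 'a'>"):
    table.update(expand(line))
print(len(table), hex(table[0xC5]))
```

Entries that map a rune to itself, such as <0x83 0x83> under MAPLOWER, expand to an identity pair in the same way.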
P.S.
What do lines like these mean?
CHARSET ",A"
CHARSET "(I"
CHARSET "$(@"
All files in $SRC/share/locale/ctype/charset
contain this magic; my patch doesn't add it.
man mklocale says
"CHARSET Controls character set for subsequent runes"
These magic symbols are not clear to me :-(
diff -urN ctype.orig/Makefile ctype/Makefile
--- ctype.orig/Makefile 2006-03-14 00:14:05.000000000 +0000
+++ ctype/Makefile 2006-03-13 23:34:49.000000000 +0000
@@ -9,6 +9,9 @@
FILESGRP= ${LOCALEGRP}
FILESMODE= ${LOCALEMODE}
+LOCALES += be_BY.CP1251
+ LOCALESRC_be_BY.CP1251 = be_BY.CP1251
+
LOCALES += bg_BG.CP1251
LOCALESRC_bg_BG.CP1251 = bg_BG.CP1251
@@ -189,9 +192,15 @@
LOCALES += pt_PT.ISO8859-15
LOCALESRC_pt_PT.ISO8859-15 = en_US.DIS_8859-15
+LOCALES += ru_BY.CP1251
+ LOCALESRC_ru_BY.CP1251 = ru_BY.CP1251
+
LOCALES += ru_RU.CP866
LOCALESRC_ru_RU.CP866 = ru_RU.CP866
+LOCALES += ru_RU.CP1251
+ LOCALESRC_ru_RU.CP1251 = ru_RU.CP1251
+
LOCALES += ru_RU.KOI8-R
LOCALESRC_ru_RU.KOI8-R = ru_RU.KOI8-R
diff -urN ctype.orig/be_BY.CP1251.src ctype/be_BY.CP1251.src
--- ctype.orig/be_BY.CP1251.src 1970-01-01 00:00:00.000000000 +0000
+++ ctype/be_BY.CP1251.src 2006-03-14 00:14:54.000000000 +0000
@@ -0,0 +1,11 @@
+/*
+ * LOCALE_CTYPE for Belarusian Cyrillic character set (CP1251)
+ */
+
+ENCODING "NONE"
+VARIABLE Belarusian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"
diff -urN ctype.orig/bg_BG.CP1251.src ctype/bg_BG.CP1251.src
--- ctype.orig/bg_BG.CP1251.src 2006-03-14 00:14:05.000000000 +0000
+++ ctype/bg_BG.CP1251.src 2006-03-13 23:52:47.000000000 +0000
@@ -11,81 +11,4 @@
/*
* This is a comment
*/
-ALPHA 'A' - 'Z' 'a' - 'z'
-ALPHA 0x80 0x81 0x83 0x8a 0x8c - 0x90 0x9a 0x9c - 0x9f
-ALPHA 0xa1 - 0xa3 0xa5 0xa8 0xaa 0xaf 0xb2 - 0xb4 0xb8 0xba
-ALPHA 0xbc - 0xff
-CONTROL 0x00 - 0x1f 0x7f 0x98
-DIGIT '0' - '9'
-GRAPH 0x21 - 0x7e 0x80 - 0x97 0x99 - 0x9f 0xa1 - 0xff
-LOWER 'a' - 'z' 0x83 0x90 0x9a 0x9c - 0x9f 0xa2 0xb3 0xb4 0xb8
-LOWER 0xba 0xbc 0xbe 0xbf 0xe0 - 0xff
-PUNCT 0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e
-PUNCT 0x82 0x84 - 0x89 0x8b 0x91 - 0x97 0x99 0x9b 0xa4
-PUNCT 0xa6 0xa7 0xa9 0xab - 0xae 0xb0 0xb1 0xb5 - 0xb7 0xb9 0xbb
-SPACE 0x09 - 0x0d 0x20 0xa0
-UPPER 'A' - 'Z' 0x80 0x81 0x8a 0x8c - 0x8f 0xa1 0xa3 0xa5 0xa8
-UPPER 0xaa 0xaf 0xb2 0xbd 0xc0 - 0xdf
-XDIGIT '0' - '9' 'a' - 'f' 'A' - 'F'
-BLANK ' ' '\t' 0xa0
-PRINT 0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
-SWIDTH1 0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
-
-MAPLOWER <'A' - 'Z' : 'a'>
-MAPLOWER <'a' - 'z' : 'a'>
-MAPLOWER <0x80 0x90>
-MAPLOWER <0x81 0x83>
-MAPLOWER <0x83 0x83>
-MAPLOWER <0x8a 0x9a>
-MAPLOWER <0x8c - 0x8f : 0x9c>
-MAPLOWER <0x90 0x90>
-MAPLOWER <0x9a 0x9a>
-MAPLOWER <0x9c - 0x9f : 0x9c>
-MAPLOWER <0xa1 0xa2>
-MAPLOWER <0xa2 0xa2>
-MAPLOWER <0xa3 0xbc>
-MAPLOWER <0xa5 0xb4>
-MAPLOWER <0xa8 0xb8>
-MAPLOWER <0xaa 0xba>
-MAPLOWER <0xaf 0xbf>
-MAPLOWER <0xb2 0xb3>
-MAPLOWER <0xb3 - 0xb4 : 0xb3>
-MAPLOWER <0xb8 0xb8>
-MAPLOWER <0xba 0xba>
-MAPLOWER <0xbc 0xbc>
-MAPLOWER <0xbd 0xbe>
-MAPLOWER <0xbe - 0xbf : 0xbe>
-MAPLOWER <0xc0 - 0xdf : 0xe0>
-MAPLOWER <0xe0 - 0xff : 0xe0>
-
-MAPUPPER <'A' - 'Z' : 'A'>
-MAPUPPER <'a' - 'z' : 'A'>
-MAPUPPER <0x80 - 0x81 : 0x80>
-MAPUPPER <0x83 0x81>
-MAPUPPER <0x8a 0x8a>
-MAPUPPER <0x8c - 0x8f : 0x8c>
-MAPUPPER <0x90 0x80>
-MAPUPPER <0x9a 0x8a>
-MAPUPPER <0x9c - 0x9f : 0x8c>
-MAPUPPER <0xa1 0xa1>
-MAPUPPER <0xa2 0xa1>
-MAPUPPER <0xa3 0xa3>
-MAPUPPER <0xa5 0xa5>
-MAPUPPER <0xa8 0xa8>
-MAPUPPER <0xaa 0xaa>
-MAPUPPER <0xaf 0xaf>
-MAPUPPER <0xb2 0xb2>
-MAPUPPER <0xb3 0xb2>
-MAPUPPER <0xb4 0xa5>
-MAPUPPER <0xb8 0xa8>
-MAPUPPER <0xba 0xaa>
-MAPUPPER <0xbc 0xa3>
-MAPUPPER <0xbd 0xbd>
-MAPUPPER <0xbe 0xbd>
-MAPUPPER <0xbf 0xaf>
-MAPUPPER <0xc0 - 0xdf : 0xc0>
-MAPUPPER <0xe0 - 0xff : 0xc0>
-
-TODIGIT <'0' - '9' : 0>
-TODIGIT <'A' - 'F' : 10>
-TODIGIT <'a' - 'f' : 10>
+#include "charset/CP1251"
diff -urN ctype.orig/charset/CP1251 ctype/charset/CP1251
--- ctype.orig/charset/CP1251 1970-01-01 00:00:00.000000000 +0000
+++ ctype/charset/CP1251 2006-03-13 23:23:11.000000000 +0000
@@ -0,0 +1,81 @@
+/*
+ * CP-1251
+ */
+ALPHA 'A' - 'Z' 'a' - 'z'
+ALPHA 0x80 0x81 0x83 0x8a 0x8c - 0x90 0x9a 0x9c - 0x9f
+ALPHA 0xa1 - 0xa3 0xa5 0xa8 0xaa 0xaf 0xb2 - 0xb4 0xb8 0xba
+ALPHA 0xbc - 0xff
+CONTROL 0x00 - 0x1f 0x7f 0x98
+DIGIT '0' - '9'
+GRAPH 0x21 - 0x7e 0x80 - 0x97 0x99 - 0x9f 0xa1 - 0xff
+LOWER 'a' - 'z' 0x83 0x90 0x9a 0x9c - 0x9f 0xa2 0xb3 0xb4 0xb8
+LOWER 0xba 0xbc 0xbe 0xbf 0xe0 - 0xff
+PUNCT 0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e
+PUNCT 0x82 0x84 - 0x89 0x8b 0x91 - 0x97 0x99 0x9b 0xa4
+PUNCT 0xa6 0xa7 0xa9 0xab - 0xae 0xb0 0xb1 0xb5 - 0xb7 0xb9 0xbb
+SPACE 0x09 - 0x0d 0x20 0xa0
+UPPER 'A' - 'Z' 0x80 0x81 0x8a 0x8c - 0x8f 0xa1 0xa3 0xa5 0xa8
+UPPER 0xaa 0xaf 0xb2 0xbd 0xc0 - 0xdf
+XDIGIT '0' - '9' 'a' - 'f' 'A' - 'F'
+BLANK ' ' '\t' 0xa0
+PRINT 0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
+SWIDTH1 0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
+
+MAPLOWER <'A' - 'Z' : 'a'>
+MAPLOWER <'a' - 'z' : 'a'>
+MAPLOWER <0x80 0x90>
+MAPLOWER <0x81 0x83>
+MAPLOWER <0x83 0x83>
+MAPLOWER <0x8a 0x9a>
+MAPLOWER <0x8c - 0x8f : 0x9c>
+MAPLOWER <0x90 0x90>
+MAPLOWER <0x9a 0x9a>
+MAPLOWER <0x9c - 0x9f : 0x9c>
+MAPLOWER <0xa1 0xa2>
+MAPLOWER <0xa2 0xa2>
+MAPLOWER <0xa3 0xbc>
+MAPLOWER <0xa5 0xb4>
+MAPLOWER <0xa8 0xb8>
+MAPLOWER <0xaa 0xba>
+MAPLOWER <0xaf 0xbf>
+MAPLOWER <0xb2 0xb3>
+MAPLOWER <0xb3 - 0xb4 : 0xb3>
+MAPLOWER <0xb8 0xb8>
+MAPLOWER <0xba 0xba>
+MAPLOWER <0xbc 0xbc>
+MAPLOWER <0xbd 0xbe>
+MAPLOWER <0xbe - 0xbf : 0xbe>
+MAPLOWER <0xc0 - 0xdf : 0xe0>
+MAPLOWER <0xe0 - 0xff : 0xe0>
+
+MAPUPPER <'A' - 'Z' : 'A'>
+MAPUPPER <'a' - 'z' : 'A'>
+MAPUPPER <0x80 - 0x81 : 0x80>
+MAPUPPER <0x83 0x81>
+MAPUPPER <0x8a 0x8a>
+MAPUPPER <0x8c - 0x8f : 0x8c>
+MAPUPPER <0x90 0x80>
+MAPUPPER <0x9a 0x8a>
+MAPUPPER <0x9c - 0x9f : 0x8c>
+MAPUPPER <0xa1 0xa1>
+MAPUPPER <0xa2 0xa1>
+MAPUPPER <0xa3 0xa3>
+MAPUPPER <0xa5 0xa5>
+MAPUPPER <0xa8 0xa8>
+MAPUPPER <0xaa 0xaa>
+MAPUPPER <0xaf 0xaf>
+MAPUPPER <0xb2 0xb2>
+MAPUPPER <0xb3 0xb2>
+MAPUPPER <0xb4 0xa5>
+MAPUPPER <0xb8 0xa8>
+MAPUPPER <0xba 0xaa>
+MAPUPPER <0xbc 0xa3>
+MAPUPPER <0xbd 0xbd>
+MAPUPPER <0xbe 0xbd>
+MAPUPPER <0xbf 0xaf>
+MAPUPPER <0xc0 - 0xdf : 0xc0>
+MAPUPPER <0xe0 - 0xff : 0xc0>
+
+TODIGIT <'0' - '9' : 0>
+TODIGIT <'A' - 'F' : 10>
+TODIGIT <'a' - 'f' : 10>
diff -urN ctype.orig/ru_BY.CP1251.src ctype/ru_BY.CP1251.src
--- ctype.orig/ru_BY.CP1251.src 1970-01-01 00:00:00.000000000 +0000
+++ ctype/ru_BY.CP1251.src 2006-03-14 00:15:51.000000000 +0000
@@ -0,0 +1,11 @@
+/*
+ * LOCALE_CTYPE for Russian Cyrillic character set (CP1251)
+ */
+
+ENCODING "NONE"
+VARIABLE Russian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"
diff -urN ctype.orig/ru_RU.CP1251.src ctype/ru_RU.CP1251.src
--- ctype.orig/ru_RU.CP1251.src 1970-01-01 00:00:00.000000000 +0000
+++ ctype/ru_RU.CP1251.src 2006-03-14 00:14:50.000000000 +0000
@@ -0,0 +1,10 @@
+/*
+ * LOCALE_CTYPE for Russian Cyrillic character set (CP1251), based on bg_BG.CP1251
+ */
+ENCODING "NONE"
+VARIABLE Russian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"
--
Best regards, Aleksey Cheusov.
>Unformatted: