tech-userlevel archive
Re: Proposal: _ctype_ table bitwidth change
hi,
> Changing the table to 16 bits wide breaks binary compatibility.
i don't break any binary compatibility. why do you think so?
i keep the old 8-bit _ctype_ table for libc12.
(it may be removed at the libc13 major bump, guarded by the __LIBC12_SOURCE__ macro)
> To get extra bits you really need to add a second 257 byte table.
> This might mean replicating some bits in both tables and/or looking
> in both tables.
your idea is:
extern unsigned char *_ctype_;
extern unsigned char *_ctype_extra_bit_;
#define isalpha(c) (_ctype_[c] & _ALPHA)
...
#define isblank(c) (_ctype_extra_bit_[c] & _BLANK)
isn't it?
but the current _ctype_ bitmask pattern (include/sys/ctype_bits.h) is
quite strange.
on this point joerg and i have the same opinion (i think): we would like to
replace it with a saner bitmask pattern (such as _RUNETYPE_*).
so with your approach we would need one more 8-bit table:
#ifdef __LIBC12_SOURCE__
extern unsigned char *_ctype_; /* backward compatibility */
#endif
extern unsigned char *_ctype_new_abi1_;
extern unsigned char *_ctype_new_abi2_;
#define isalpha(c) (_ctype_new_abi1_[c] & _ALPHA)
...
#define isblank(c) (_ctype_new_abi2_[c] & _BLANK)
how confusing.
my idea is simple:
#ifdef __LIBC12_SOURCE__
extern unsigned char *_ctype_; /* backward compatibility */
#endif
extern unsigned short *_ctype_new_abi_;
#define isalpha(c) (_ctype_new_abi_[c] & _ALPHA)
...
#define isblank(c) (_ctype_new_abi_[c] & _BLANK)
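to make the single-table idea concrete, here is a minimal sketch, not the actual libc code. the x_ prefix, the bit values and the x_init()/x_class() helpers are all hypothetical names invented for illustration, and the sketch assumes EOF == -1 so that the table can be indexed with c + 1. it only shows that a 16-bit entry has room for a class bit (here X_BLANK) that cannot fit in an 8-bit entry.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>   /* EOF */
#include <string.h>

/* hypothetical class bits: names and values are illustrative only */
#define X_ALPHA 0x0001
#define X_DIGIT 0x0002
#define X_BLANK 0x0100   /* needs more than 8 bits */

/* the table is indexed with c + 1, so this sketch requires EOF == -1 */
_Static_assert(EOF == -1, "sketch assumes EOF == -1");

/* one slot for EOF, then one slot per unsigned char value */
static unsigned short x_table[1 + UCHAR_MAX + 1];
static const unsigned short *x_ctype_new_abi = x_table;

/* fill the table for a few character classes of the "C" locale */
static void x_init(void)
{
    const char *alpha = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *digit = "0123456789";
    const char *p;

    memset(x_table, 0, sizeof(x_table));
    for (p = alpha; *p != '\0'; p++)
        x_table[1 + (unsigned char)*p] |= X_ALPHA;
    for (p = digit; *p != '\0'; p++)
        x_table[1 + (unsigned char)*p] |= X_DIGIT;
    x_table[1 + ' '] |= X_BLANK;
    x_table[1 + '\t'] |= X_BLANK;
}

/* look up the class bits; c must be EOF or an unsigned char value */
static unsigned short x_class(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return x_ctype_new_abi[c + 1];
}

#define x_isalpha(c) (x_class(c) & X_ALPHA)
#define x_isdigit(c) (x_class(c) & X_DIGIT)
#define x_isblank(c) (x_class(c) & X_BLANK)
```

after calling x_init(), every x_is*() test goes through the same table, so there is no need to remember which class lives in which table.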
> As an aside, a lot of code uses isxxx() and then makes the assumption
> that it has checked the 'standard' character set, not some random
> dataset that depends on the locale (or any other system/program state).
this assumption is completely wrong; the is* functions are affected by the current locale.
The isalpha() function shall test whether c is a character of class alpha
in the program's current locale
http://pubs.opengroup.org/onlinepubs/009695399/functions/isalpha.html
but posix defines the Portable Character Set (similar to ISO 646).
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap06.html
every encoding supported by a locale must be a superset of the Portable
Character Set,
so in many cases you don't need to worry about this.
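for code that really wants a locale-independent check, a minimal sketch could look like the following. pcs_isalpha() and count_disagreements() are hypothetical helpers written for this mail, not part of any libc. the first tests only the 52 letters of the Portable Character Set, the second counts how many unsigned char values isalpha() classifies differently in a given locale.

```c
#include <ctype.h>
#include <limits.h>
#include <locale.h>
#include <string.h>

/* hypothetical helper: locale-independent test for the 52 letters
 * of the Portable Character Set */
static int pcs_isalpha(int c)
{
    static const char letters[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /* c > 0 also rejects EOF and the terminating NUL */
    return c > 0 && c <= UCHAR_MAX && strchr(letters, c) != NULL;
}

/* hypothetical helper: count the unsigned char values for which
 * isalpha() under the given LC_CTYPE locale disagrees with
 * pcs_isalpha(); returns -1 if the locale cannot be set */
static int count_disagreements(const char *locale)
{
    int c, n = 0;

    if (setlocale(LC_CTYPE, locale) == NULL)
        return -1;
    for (c = 0; c <= UCHAR_MAX; c++)
        if ((isalpha(c) != 0) != pcs_isalpha(c))
            n++;
    return n;
}
```

in the "C" locale the two tests agree for every value. in a locale with a single-byte encoding that has additional letters, count_disagreements() may report a nonzero count, and those are exactly the characters outside the Portable Character Set.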
very truly yours.
--
Takehiko NOZAKI<takehiko.nozaki%gmail.com@localhost>