Re: Proposal: _ctype_ table bitwidth change

To: tech-userlevel%NetBSD.org@localhost
Subject: Re: Proposal: _ctype_ table bitwidth change
From: "T.SHIOZAKI" <tshiozak%bsdclub.org@localhost>
Date: Thu, 24 Mar 2011 03:28:46 +0900 (JST)

> > The most important point is that is* functions accept an octet, not a
> > code point.
> 
> They do?  Where is this defined?
> 
> Historically, it has been false: is*() has been documented to accept
> "characters", which I can't read as anything but codepoints.
> 
> That some charsets have some codepoints that can't fit in unsigned char
> (at least when, as on NetBSD, unsigned char is just one octet) just
> means that is*() aren't useful for more than just 256 of their possible
> codepoints, not that they somehow get retconned to take just one octet
> of a storage encoding of a codepoint.
> 
> At least, that's how I read it.  Is there a spec somewhere which spells
> this out precisely?

As far as I know, there is no explicit description.

However, to begin with, ISO C doesn't define any concept like "codepoint."
It defines only two representations: "(single-byte/multibyte) character" and
"wide character".
I wonder how the is* functions could be affected by a concept that the
standard does not define.

In addition, ISO C contains a passage implying that the is* functions accept
an "octet".

7.25.2.1 Wide character classification functions:

  Each of the following functions (note: isw* functions) returns true
  for each wide character that corresponds (as if by a call to the wctob
  function) to a single-byte character for which the corresponding
  character classification function (note: is* functions) from 7.4.1
  returns true, except that the iswgraph and iswpunct functions may
  differ with respect to wide characters other than L' ' that are both
  printing and white-space wide characters.

  (The 'note' annotations are mine.)

Note that this passage was added in the 1995 revision (C95).
ISO C seems to contain some ambiguity about "character,"
especially in the parts that have existed since 1989 (C89).
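
To make the quoted clause concrete, here is a minimal sketch of the check
it implies, for the isalpha/iswalpha pair only.  The function name
alpha_consistent is hypothetical (it is not part of any library), and the
result depends on the current locale.  The point is that the wide
character is first mapped to a single byte by wctob(), and only that
byte value is handed to isalpha().

```c
#include <ctype.h>
#include <stdio.h>	/* EOF */
#include <wchar.h>	/* wctob, wint_t */
#include <wctype.h>	/* iswalpha */

/*
 * Hypothetical helper: returns 1 if wc is consistent with the
 * 7.25.2.1 clause for the isalpha/iswalpha pair, 0 if it violates it.
 */
int
alpha_consistent(wint_t wc)
{
	/* wctob yields the single-byte form as an unsigned char value, or EOF */
	int c = wctob(wc);

	if (c == EOF)
		return 1;	/* no single-byte form: the clause says nothing */

	/* is*(byte) true must imply isw*(wide character) true */
	if (isalpha(c) && !iswalpha(wc))
		return 0;

	return 1;
}
```

Note that the clause only runs in one direction: iswalpha() may still
return true for wide characters that have no single-byte form at all.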


---
Takuya SHIOZAKI
