NetBSD-Bugs archive


Re: standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb



The following reply was made to PR standards/58601; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: matthew green <mrg%eterna23.net@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
Date: Fri, 16 Aug 2024 10:23:53 +0000

 > Date: Fri, 16 Aug 2024 18:01:14 +1000
 > from: matthew green <mrg%eterna23.net@localhost>
 > 
 > > +typedef unsigned char		char8_t;
 > 
 > could / should this check CHAR_BIT == 8 before defining?
 
 C23, Sec. 7.30 `Unicode utilities <uchar.h>', clause 3:
 
    The types declared are ...
 
 	char8_t
 
    which is an unsigned integer type used for 8-bit characters and is
    the same type as unsigned char; ...
 
 https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf#page=426
 
 So char8_t is unsigned char whether CHAR_BIT is exactly 8 or larger,
 and no CHAR_BIT check is needed before defining it.
 
 Note that char16_t and char32_t may be wider than 16 or 32 bits,
 respectively -- they are specified to be uint_least16_t and
 uint_least32_t, not uint16_t and uint32_t.
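 
 As a rough illustration of the point above, the guarantees that do
 hold regardless of CHAR_BIT can be written as compile-time checks.
 This is only a sketch: example_char8_t is a hypothetical stand-in that
 mirrors the typedef in the patch (a C23 <uchar.h> would supply char8_t
 itself), and the two helper functions are invented for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the patch's typedef for char8_t. */
typedef unsigned char example_char8_t;

/* These hold on every conforming implementation, whatever CHAR_BIT is. */
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
_Static_assert(sizeof(example_char8_t) == 1, "same size as unsigned char");
_Static_assert(UINT_LEAST16_MAX >= 0xFFFFu, "at least 16 value bits");
_Static_assert(UINT_LEAST32_MAX >= 0xFFFFFFFFu, "at least 32 value bits");

/*
 * Largest value the stand-in type can hold.  This is UCHAR_MAX, which
 * equals 255 only when CHAR_BIT == 8 and is larger otherwise.
 */
static unsigned int
example_char8_max(void)
{
	return (example_char8_t)-1;
}

/* Every UTF-8 code unit (0x00..0xFF) fits, even if the type is wider. */
static void
check_char8_holds_utf8_code_units(void)
{
	assert(example_char8_max() >= 0xFFu);
}
```

 Nothing here asserts CHAR_BIT == 8 or an exact width for the
 least-width types, which matches the wording of the standard quoted
 above.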
 
 > i like how there's a lot of tests that someone else wrote :)
 
 It would be worth adding some more c8rtomb and mbrtoc8 tests -- I just
 adapted the ones that I found in FreeBSD for c16rtomb and mbrtoc16, but
 they don't exercise the full range of possible invalid UTF-8 byte
 shapes, UTF-8 byte sequence lengths, or forbidden redundant encodings.
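 
 To make the missing coverage concrete, here is a sketch of the kinds
 of byte sequences such tests could include.  It is not the test code
 from the patch or from FreeBSD: utf8_valid_len is a hypothetical
 stand-alone reference checker, and the vector table is illustrative
 rather than exhaustive.  Real tests would presumably feed the same
 bytes to mbrtoc8 in a UTF-8 locale; note that a truncated sequence
 would normally be reported there as incomplete rather than invalid.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reference checker, not part of any libc: returns the
 * length (1-4) of the well-formed UTF-8 sequence at the start of s,
 * or 0 if the first n bytes do not begin with a complete one.
 */
static size_t
utf8_valid_len(const unsigned char *s, size_t n)
{
	unsigned char lo = 0x80, hi = 0xBF;	/* allowed range of byte 2 */
	size_t len, i;

	if (n == 0)
		return 0;
	if (s[0] <= 0x7F)
		return 1;
	if (s[0] >= 0xC2 && s[0] <= 0xDF) {
		len = 2;
	} else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
		len = 3;
		if (s[0] == 0xE0)
			lo = 0xA0;	/* reject redundant 3-byte forms */
		if (s[0] == 0xED)
			hi = 0x9F;	/* reject surrogates D800..DFFF */
	} else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
		len = 4;
		if (s[0] == 0xF0)
			lo = 0x90;	/* reject redundant 4-byte forms */
		if (s[0] == 0xF4)
			hi = 0x8F;	/* reject values above U+10FFFF */
	} else {
		return 0;	/* 80..C1 and F5..FF never start a sequence */
	}
	if (n < len)
		return 0;	/* truncated */
	if (s[1] < lo || s[1] > hi)
		return 0;
	for (i = 2; i < len; i++)
		if (s[i] < 0x80 || s[i] > 0xBF)
			return 0;
	return len;
}

/* Illustrative vectors: expect is the valid length, 0 if rejected. */
static const struct {
	unsigned char bytes[5];
	size_t n, expect;
} vectors[] = {
	{ { 0x41 }, 1, 1 },				/* U+0041 */
	{ { 0xC2, 0xA2 }, 2, 2 },			/* U+00A2 */
	{ { 0xE2, 0x82, 0xAC }, 3, 3 },			/* U+20AC */
	{ { 0xF0, 0x9F, 0x98, 0x80 }, 4, 4 },		/* U+1F600 */
	{ { 0x80 }, 1, 0 },				/* lone continuation */
	{ { 0xC2, 0x41 }, 2, 0 },			/* bad continuation */
	{ { 0xE2, 0x82 }, 2, 0 },			/* truncated */
	{ { 0xC0, 0x80 }, 2, 0 },			/* redundant U+0000 */
	{ { 0xE0, 0x80, 0x80 }, 3, 0 },			/* redundant, 3 bytes */
	{ { 0xF0, 0x80, 0x80, 0x80 }, 4, 0 },		/* redundant, 4 bytes */
	{ { 0xED, 0xA0, 0x80 }, 3, 0 },			/* surrogate U+D800 */
	{ { 0xF4, 0x90, 0x80, 0x80 }, 4, 0 },		/* above U+10FFFF */
	{ { 0xF8, 0x88, 0x80, 0x80, 0x80 }, 5, 0 },	/* 5-byte form */
};

/* Counts vectors whose result differs from the expected length. */
static size_t
count_vector_mismatches(void)
{
	size_t i, bad = 0;

	for (i = 0; i < sizeof(vectors) / sizeof(vectors[0]); i++)
		if (utf8_valid_len(vectors[i].bytes, vectors[i].n) !=
		    vectors[i].expect)
			bad++;
	return bad;
}

/* Aborts if any vector disagrees with the reference checker. */
static void
run_vector_checks(void)
{
	assert(count_vector_mismatches() == 0);
}
```

 The table covers each sequence length from one to four bytes, the
 invalid lead and continuation byte shapes, and the redundant
 encodings, which are the three gaps named above.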
 

