NetBSD-Bugs archive
Re: standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
The following reply was made to PR standards/58601; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: matthew green <mrg%eterna23.net@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
Date: Fri, 16 Aug 2024 10:23:53 +0000
> Date: Fri, 16 Aug 2024 18:01:14 +1000
> From: matthew green <mrg%eterna23.net@localhost>
>
> > +typedef unsigned char char8_t;
>
> could / should this check CHAR_BIT == 8 before defining?
C23, Sec. 7.30 `Unicode utilities <uchar.h>', clause 3:

    The types declared are ...
        char8_t
    which is an unsigned integer type used for 8-bit characters and is
    the same type as unsigned char; ...

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf#page=426

This is independent of whether CHAR_BIT is exactly 8 or larger.
Note that char16_t and char32_t may be wider than 16 or 32 bits,
respectively -- they are specified to be uint_least16_t and
uint_least32_t, not uint16_t and uint32_t.
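
As a rough illustration of the point above, the following sketch spells
out what the standard does and does not guarantee as compile-time
checks. The names example_char8_t and example_is_octet are hypothetical
and are not part of the patch; the local typedef is used only to avoid
clashing with a libc that already provides char8_t.

```c
#include <limits.h>
#include <stdint.h>
#include <uchar.h>

/* Hypothetical stand-in for char8_t: the same type as unsigned char. */
typedef unsigned char example_char8_t;

/* The standard only promises CHAR_BIT >= 8, not CHAR_BIT == 8. */
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

/* The stand-in has the full range of unsigned char, whatever that is. */
_Static_assert((example_char8_t)-1 == UCHAR_MAX,
    "example_char8_t has the range of unsigned char");

/* char16_t and char32_t are "least" types: at least 16/32 bits wide. */
_Static_assert(sizeof(char16_t) == sizeof(uint_least16_t),
    "char16_t is as wide as uint_least16_t");
_Static_assert(sizeof(char32_t) == sizeof(uint_least32_t),
    "char32_t is as wide as uint_least32_t");
_Static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "at least 16 bits");
_Static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "at least 32 bits");

/*
 * On a platform with CHAR_BIT > 8, an example_char8_t object could hold
 * a value above 0xFF, so code that needs a true octet has to check.
 * Returns nonzero if c fits in 8 bits.
 */
static int
example_is_octet(example_char8_t c)
{
	return c <= 0xFF;
}
```

On a platform where CHAR_BIT is 8 the range check is always true; it
only matters on the wider-byte platforms the question was about.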
> i like how there's a lot of tests that someone else wrote :)
We should maybe add some more c8rtomb and mbrtoc8 tests -- I just adapted
the ones that I found in FreeBSD for c16rtomb and mbrtoc16, but they
don't exercise the full range of possible invalid UTF-8 byte shapes,
UTF-8 byte sequence lengths, or forbidden redundant encodings.
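
To make those missing categories concrete, here is a minimal sketch of
the kind of input table such tests might cover. It is not the NetBSD or
FreeBSD test code, and it deliberately does not call mbrtoc8 or c8rtomb,
since their availability depends on the libc. The checker
example_utf8_len, the vector table, and example_count_mismatches are all
hypothetical names used only as a standalone reference.

```c
#include <stddef.h>

/*
 * Hypothetical reference checker: returns the length in bytes of the
 * well-formed UTF-8 sequence at the start of s[0..n), or 0 if there is
 * none.
 */
static size_t
example_utf8_len(const unsigned char *s, size_t n)
{
	unsigned long cp, min;
	size_t len, i;

	if (n == 0)
		return 0;
	if (s[0] < 0x80)
		return 1;
	if ((s[0] & 0xE0) == 0xC0) {
		len = 2; cp = s[0] & 0x1F; min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3; cp = s[0] & 0x0F; min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4; cp = s[0] & 0x07; min = 0x10000;
	} else {
		return 0;	/* stray continuation byte, or 0xF8..0xFF */
	}
	if (n < len)
		return 0;	/* truncated sequence */
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;	/* not a continuation byte */
		cp = (cp << 6) | (s[i] & 0x3F);
	}
	if (cp < min)
		return 0;	/* redundant (overlong) encoding */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;	/* UTF-16 surrogate */
	if (cp > 0x10FFFF)
		return 0;	/* beyond the Unicode range */
	return len;
}

/* One vector per category; want is the expected sequence length. */
static const struct {
	const char *bytes;
	size_t n, want;
} example_vectors[] = {
	{ "A",                1, 1 },	/* 1-byte sequence */
	{ "\xC3\xA9",         2, 2 },	/* 2-byte sequence, U+00E9 */
	{ "\xE2\x82\xAC",     3, 3 },	/* 3-byte sequence, U+20AC */
	{ "\xF0\x9F\x98\x80", 4, 4 },	/* 4-byte sequence, U+1F600 */
	{ "\x80",             1, 0 },	/* stray continuation byte */
	{ "\xFF",             1, 0 },	/* byte never valid in UTF-8 */
	{ "\xC3",             1, 0 },	/* truncated 2-byte sequence */
	{ "\xC3(",            2, 0 },	/* bad continuation byte */
	{ "\xC0\x80",         2, 0 },	/* overlong encoding of U+0000 */
	{ "\xE0\x80\x80",     3, 0 },	/* overlong 3-byte encoding */
	{ "\xF0\x80\x80\x80", 4, 0 },	/* overlong 4-byte encoding */
	{ "\xED\xA0\x80",     3, 0 },	/* surrogate U+D800 */
	{ "\xF4\x90\x80\x80", 4, 0 },	/* U+110000, out of range */
};

/* Returns how many vectors disagree with the reference checker. */
static size_t
example_count_mismatches(void)
{
	size_t i, bad = 0;

	for (i = 0; i < sizeof(example_vectors) / sizeof(example_vectors[0]); i++) {
		const unsigned char *p =
		    (const unsigned char *)example_vectors[i].bytes;
		if (example_utf8_len(p, example_vectors[i].n) !=
		    example_vectors[i].want)
			bad++;
	}
	return bad;
}
```

A real test would presumably feed the same byte strings to mbrtoc8 in a
UTF-8 locale and compare its return value against the expected outcome
for each category.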