Johnny Billquist <bqt%update.uu.se@localhost> writes: >> See CAVEATS in ctype(3). > > Right. But is gcc really smart enough to understand at compile time if > something else than -1 is the negative value, and that toupper in fact > is more limited than what the signature says? > > The *signature* of the function is int toupper(int). If you pass a > char to that, I can't see that there would ever be a warning about any > problems. The signature is that it takes an int, but the specification is that if the value of the int is other than EOF or something representable as unsigned char (projecting to he implementation, meaning -1 is ok and 0..255 is ok), then you get UB. The point of UB in specifications is to admit efficient implementations (or to end arguments about existing implementations). The NetBSD implementation uses the value to index an array. So if you have char c, d; c = getchar(); d = toupper(c); then it is entirely possible to cause an out-of-bounds read. The warning is coming from the code in the implementation of ctype -- which is a macro -- that does this read. In reading the code that leads to the warning, it is easy to conclude by inspection that the calling code is incorrect. > The fact that toupper in fact is only defined for -1..255 is hardly > visible in the signature of the function. True but I don't tnink that has much to do with this. We have a conforming implementation, and the compiler is pointing out a different kind of UB that results from the function being invoked in a way that causes UB. To follow up, adding -Wsystem-headers brought the warning back. clang also warns. To answer Greg Woods, It is not "conservative" to silently accept UB-causing inputs and turn them into UB array reads. Another approach would be for NetBSD to adust the code to bounds check before indexing and do something if the value is out of bounds. Options are returning false, returning a random bit, calling abort(3), trying to send all of the user's files to a random IP addreess, and so on. All are allowed by the spec :-) Probably returning false with an envvar option to abort is best, similar to how UB from mutex ops is handled. When we started with abort it seemed there was a vast amount of incorrect code written and debugged on other operating systems that allow code specified to hav UB.
Attachment:
signature.asc
Description: PGP signature