tech-userlevel archive
Re: using the interfaces in ctype.h
On 21-Apr-08, at 2:07 PM, der Mouse wrote:
That makes about as much sense as saying that getc assuming its FILE *
parameter is non-nil is not safe, or that mktemp is not safe when
passed (void *)&main. Large swaths of libc will do odd things when
passed arguments beyond their interface specifications; it is no part
of libc's mandate to detect such calls. I'm not convinced "not safe"
is really a fair term to apply to such behaviour. I certainly don't
see it as worth significant effort to do anything in particular in
such cases.
The problem here, I think, is that it's an array access that's for all
intents and purposes done within the application, not within libc.
This makes compiler warnings about it, and perhaps crashes caused by
it, somewhat more confusing for at least some applications programmers.
Witness the start of this thread where it was suggested that the
caller of the "ctype" APIs kill the ability of the compiler to detect
incorrect usage while at the same time preventing the ability of the
implementation to distinguish between EOF and 0xFF.
I don't think anyone has suggested application-level casting for cases
where the argument might be EOF; that would be severely broken - which
is why the implementation, which must be prepared to handle EOF
arguments, should not do it (nor anything equivalent).
I take it that such a suggestion was exactly the case: "cast ctype
arguments to unsigned char and not int"
Either way you look at it (and no matter where it is), the cast will
make EOF == 0xFF.
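
To make the collision concrete, here is a minimal sketch. The helper
names after_cast and eof_collides are made up for illustration, and the
collision assumes EOF is -1, which is common but not something the
standard promises.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: the value a ctype function sees after the cast. */
static int after_cast(int c)
{
    return (unsigned char)c;    /* reduces c modulo UCHAR_MAX + 1 */
}

/*
 * Nonzero if EOF and the highest character value become the same
 * value once cast.  True wherever EOF is -1 (an assumption here).
 */
static int eof_collides(void)
{
    return after_cast(EOF) == after_cast(UCHAR_MAX);
}
```

The result is the same whether the cast sits in the application or
inside the macro; only the place where it happens differs.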
I'm saying that it doesn't matter and that it's better to have the
implementation provide an implementation-specific and correct mask
instead of relying on the application programmer to get it right. The
compiler warning everyone wants doesn't really help all that much (50%
or less of cases I'd say), and forcing application programmers to do
the cast means it's just as likely to be wrong and/or useless.
Application programmers should be strongly encouraged to use the API
as it is defined and they should not be encouraged to do funny things
just to shut up the compiler. If the cast is done within the
implementation then the macro looks and behaves much more like a full-
fledged function implementation would (or should), even if it still
gets it wrong when the programmer does end up passing EOF by mistake.
Besides, the standards don't, so far as I can tell, require
implementations to always return zero for all the is*() APIs when EOF
is passed to them. This whole "the mask prevents the implementation
from distinguishing between 0xFF and EOF" claim is completely bogus.
It just doesn't matter what these functions return when passed EOF --
their result in that case is undefined anyway. They are only required
to accept EOF because it is and was common practice to directly pass
the un-modified result of something like getc() to them. The program
is going to spin in a loop if it doesn't detect EOF anyway, regardless
of whether the programmer casts the value to (unsigned char) or not
and regardless of whether the implementation masks it to prevent an
out-of-bounds array access.
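
A sketch of the usual pattern, for reference. The function name
count_alpha is hypothetical; the point is only that the loop's own
test against EOF is what ends it, so the is*() call never has to tell
EOF apart from a character.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count the alphabetic characters on a stream. */
static long count_alpha(FILE *fp)
{
    long n = 0;
    int c;      /* int, not char, so EOF stays distinct from 0xFF */

    while ((c = getc(fp)) != EOF) {     /* the loop's own EOF test */
        if (isalpha(c))                 /* c is in 0..UCHAR_MAX here */
            n++;
    }
    return n;
}
```

Inside the loop body c is always a valid unsigned char value, so
neither a cast nor a mask changes anything for this caller.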
I guess the one additional point I should make is that the use of
"(_ctype_ + 1)" as the start of the array can be done away with if the
index is masked by ~(~0 << CHAR_BIT). That might even speed up these
macros, at least on machines which can do the mask faster than they
can add one to a pointer. :-)
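
Roughly what I have in mind, as a sketch only. Every name here
(MY_D, MY_MASK, my_ctype_tab, MY_ISDIGIT) is made up and this is not
the real <ctype.h>. It uses ~0u rather than ~0 because left-shifting a
negative signed value is undefined in C99 and later, and it assumes
CHAR_BIT is smaller than the width of unsigned int.

```c
#include <limits.h>

/* Hypothetical names throughout; not any real libc's table. */
#define MY_D 0x01       /* "is a decimal digit" flag */

/* Unsigned form of ~(~0 << CHAR_BIT): CHAR_BIT low-order one bits. */
#define MY_MASK (~(~0u << CHAR_BIT))

/* One slot per unsigned char value; no extra leading slot for EOF. */
static const unsigned char my_ctype_tab[UCHAR_MAX + 1] = {
    ['0'] = MY_D, ['1'] = MY_D, ['2'] = MY_D, ['3'] = MY_D, ['4'] = MY_D,
    ['5'] = MY_D, ['6'] = MY_D, ['7'] = MY_D, ['8'] = MY_D, ['9'] = MY_D,
};

/* The mask keeps the index in 0..UCHAR_MAX whatever int is passed. */
#define MY_ISDIGIT(c) (my_ctype_tab[(unsigned)(c) & MY_MASK] & MY_D)
```

With this, a negative argument such as -1 or a sign-extended char
lands on a slot inside the table instead of in front of it, and the
table can start at index 0 with no "+ 1" offset.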
--
Greg A. Woods; Planix, Inc.
<woods%planix.ca@localhost>