tech-userlevel archive
Re: using the interfaces in ctype.h
At 10:18 PM 4/20/2008 -0400, der Mouse wrote:
> There are two cases one has to consider for use of isspace() etc.
> 1) if x is an int (wider than a char), and is the result of getc(),
> then x will be in the range [-1, UCHAR_MAX].
No; x will be in the range [0..UCHAR_MAX] or be EOF, which latter
happens to be -1 in our implementation but may be different. (Unless
you were speaking from a strictly NetBSD perspective, rather than a
correct-use-of-ctype perspective. Your mention of 1's-complement
machines makes me think not.)
Thank you for pointing that out. I apologize. I had overlooked that
EOF is a negative integer, but is not required to be -1.
However, I think that it's still true to say that if x is EOF,
isprint(x & UCHAR_MAX) will not (generally) be the same as
isprint(x), even though isprint(x & UCHAR_MAX) is always valid. This
was my point. (My point was not that (x & UCHAR_MAX) has any
particular value.)
I am (per C99) assuming that UCHAR_MAX is one less than a power of
two, so that x & UCHAR_MAX is valid and, for non-negative x,
equivalent to (x % (UCHAR_MAX+1)).
So both cases still apply, I think.
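
To make the first case concrete, here is a minimal sketch in C. The
function names printable_from_getc() and printable_masked() are
hypothetical, chosen only for illustration; the sketch assumes a
hosted C99 implementation and the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Case 1: c is the int returned by getc(), so it is either EOF or a
 * value in [0, UCHAR_MAX] and may be passed to isprint() unchanged. */
static int printable_from_getc(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return isprint(c) != 0;
}

/* Masking always produces a valid argument in [0, UCHAR_MAX], but it
 * turns EOF into some ordinary character code, so the result need not
 * match isprint(EOF). */
static int printable_masked(int c)
{
    return isprint(c & UCHAR_MAX) != 0;
}
```

With these, printable_from_getc(EOF) answers the question for EOF
itself, while printable_masked(EOF) answers it for whichever character
code EOF & UCHAR_MAX happens to be.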
> The phrase
> .. if (isspace((unsigned char) buf[0])) ...
> won't work if isspace() is in-line and there's not enough casting in
> the macro.
I can't see how it could fail. Could you give an example?
It will fail by generating the warning which prevents compilation
with -Werror on some machines. See Greg's other messages -- that's
what started this discussion. (Apparently some compilers complain
about indexing with an unsigned char -- probably on those machines
where char is identical to unsigned char, but I'm guessing.)
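
For reference, the idiom under discussion looks like this when applied
to a plain char buffer. This is only a sketch: leading_spaces() is a
hypothetical helper, and whether the call draws a warning depends on
the compiler and on the header's macro definition, which the sketch
does not settle.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count leading whitespace in a string of plain char.  The cast to
 * unsigned char keeps the argument in [0, UCHAR_MAX] even where plain
 * char is signed and the byte has its high bit set. */
static size_t leading_spaces(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (isspace((unsigned char)s[n]))
        n++;
    return n;
}
```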
> I'm running 3.1, so I may have the wrong header files; but this would
> imply that (for example) isspace() should change from
> ((int)((_ctype_ + 1)[(c)] & _S))
> to
> ((int)((_ctype_ + 1)[(int)(c)] & _S))
I think this would be a very bad idea. The existing code draws
warnings from some compiler versions about "array subscript has type
char", which let a coder catch such sloppy code; while this doesn't
apply to 3.1's compiler in my experience, doing it for 3.1 leads to the
idea of doing it for later versions, for which it *does* matter.
It happens for 3.1 and gcc for x86. My point was that I don't have
the 4.0 or more recent header files to hand. My point also was that
this makes the <ctype.h> is...() macros at least formally
inconsistent with the C99 definitions (which require int, and for
which a (char) argument will silently be widened).
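
The two macro forms can be compared in a self-contained sketch. The
names my_table, MY_S, MY_ISSPACE, MY_ISSPACE_CAST and my_table_init()
are hypothetical stand-ins for _ctype_, _S and isspace(); this is not
NetBSD's header, and it assumes EOF == -1 so that the extra leading
slot covers EOF.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define MY_S 0x08                       /* hypothetical "space" flag */

/* Slot 0 is for EOF (assumed to be -1); slots 1.. hold the characters. */
static unsigned char my_table[1 + UCHAR_MAX + 1];

/* Subscript keeps the argument's own type, so a compiler can warn
 * when a plain char is used as the index. */
#define MY_ISSPACE(c)      ((int)((my_table + 1)[(c)] & MY_S))

/* Subscript is converted to int first, which hides that warning. */
#define MY_ISSPACE_CAST(c) ((int)((my_table + 1)[(int)(c)] & MY_S))

static void my_table_init(void)
{
    static const char spaces[] = " \t\n\v\f\r";
    size_t i;

    assert(EOF == -1);                  /* assumption of this sketch */
    for (i = 0; spaces[i] != '\0'; i++)
        my_table[1 + (unsigned char)spaces[i]] |= MY_S;
}
```

For arguments that are already valid (EOF or 0..UCHAR_MAX) both forms
give the same answer; they differ only in what the compiler can say
about a char argument.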
I think we can agree (by looking at C99) that the standard definition
of isspace() is 'int isspace(int)'. NetBSD's definition of macros is
convenient, but is not mandated by the standard (in fact, the
standard does not give special discussion to any of the <ctype.h>
functions if implemented as macros).
I can't find a place where C99 requires that any implementation of a
function-like macro for a library function be "warning-equivalent" to
calling the library function. In other words, C99 does not require
that isspace(x) be "warning-equivalent" to (isspace)(x). But I
happen to think that such equivalence is in the spirit of the
specification for isspace(x), even though I agree that providing it
may be inconvenient. It is also more portable, because (isspace)(x)
is not likely to give a warning -- and if it does, the warning will be
much more like what Coverity might give, e.g. "x is not in {EOF,
0..UCHAR_MAX}", than the rather inscrutable gcc message.
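
The distinction between isspace(x) and (isspace)(x) can be shown in a
short sketch. The wrapper names space_via_function() and
space_via_macro() are hypothetical; the sketch relies only on the C
rule that parenthesizing a library function's name suppresses any
function-like macro of that name.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* (isspace)(c) cannot expand as a function-like macro, so this calls
 * the library function declared as int isspace(int). */
static int space_via_function(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return (isspace)(c) != 0;
}

/* isspace(c) may expand to whatever macro <ctype.h> provides. */
static int space_via_macro(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return isspace(c) != 0;
}
```

For valid arguments the two must agree in result; any difference is
confined to the diagnostics the compiler produces.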
As far as I can tell, ultimately it comes down to an implementation
choice, as C99 does not give clear guidance.
--Terry