tech-userlevel archive
Re: using the interfaces in ctype.h
On 21-Apr-08, at 5:56 AM, Alan Barrett wrote:
On Sun, 20 Apr 2008, Greg A. Woods; Planix, Inc. wrote:
Indeed! Some implementations were so lame they didn't include the
mask in the implementation of the macro!
If the implementation masked the value before using it, then it
would be unable to distinguish EOF from UCHAR_MAX (typically '\377').
Indeed; however, the current implementation doesn't even try to
"detect" or "distinguish" EOF, and passing EOF without casting
it properly and/or masking will result in an out-of-bounds array
access in the current implementation.
Anyway,
the onus falls on the caller to ensure that they don't pass invalid
values; otherwise the implementation is allowed to do anything at all.
I certainly agree as a matter of principle and for the most minimal
correctness.
However I would hope that it is in the best interests of NetBSD to
provide a _safe_ implementation, not just a minimally correct one.
I'm not sure what the implications of an out-of-bounds array access
are in most of these cases, though apparently it can sometimes cause
at least an unexpected abort, and in general I would call that "unsafe".
Simple testing (example code upon request) shows that proper masking
(or proper casting) of the macro parameter within its use as an array
index will prevent any out-of-bounds access.
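The test code itself is not included in this message. What follows is only a minimal sketch of such a check, not the original: the names TABLE_SLOTS, index_by_mask, index_by_cast and count_out_of_range are hypothetical, and the table is assumed to have UCHAR_MAX + 1 slots so that every subscript from 0 to UCHAR_MAX is valid.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of slots in a hypothetical table indexed by unsigned char values. */
#define TABLE_SLOTS ((size_t)UCHAR_MAX + 1)

/* Subscript obtained by keeping only the low bits of the argument. */
size_t index_by_mask(int c)
{
    return (size_t)((unsigned int)c & UCHAR_MAX);
}

/* Subscript obtained by converting the argument to unsigned char. */
size_t index_by_cast(int c)
{
    return (size_t)(unsigned char)c;
}

/* Count arguments in [lo, hi] whose subscript would fall outside the table. */
int count_out_of_range(int lo, int hi)
{
    int bad = 0;

    assert(lo <= hi && hi < INT_MAX);   /* keep the loop counter from overflowing */
    for (int c = lo; c <= hi; c++) {
        if (index_by_mask(c) >= TABLE_SLOTS || index_by_cast(c) >= TABLE_SLOTS)
            bad++;
    }
    return bad;
}
```

Calling count_out_of_range(-1000, 1000) is expected to return 0, since both forms reduce any int to the range 0..UCHAR_MAX.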
Meanwhile the compile-time warnings introduced by the current "do
nothing special" implementation are useless (i.e. not triggered)
whenever the parameter is a signed integer (even "signed char") which
could have the value "-1" and thus trigger an out-of-bounds array
access.
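A minimal sketch of why such arguments slip through, assuming the warning in question is GCC's -Wchar-subscripts, which appears to be keyed to subscripts of plain "char" type only. The helper names promoted_subscript and converted_subscript are hypothetical and stand in for what an unmasked and a cast-protected macro would use as the subscript.

```c
#include <assert.h>
#include <limits.h>

/* Subscript an unmasked macro would see: integer promotion keeps the sign. */
int promoted_subscript(signed char ch)
{
    return ch;
}

/* Subscript after converting to unsigned char first: always 0..UCHAR_MAX. */
int converted_subscript(signed char ch)
{
    int idx = (unsigned char)ch;

    assert(idx >= 0 && idx <= UCHAR_MAX);
    return idx;
}
```

A negative signed char stays negative after promotion, while the converted form wraps it into the valid range.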
Oh oh, oops, the NetBSD implementations don't seem to include the mask
either! I didn't realize that! So sad. (Which may even mean they
violate the standard's implication that they be able to safely accept
the value of EOF.
Huh? The NetBSD implementations accept EOF.
Ok, yes, but only with "undefined" behaviour.
Since masking inside the
implementation would violate the requirement to distinguish EOF from
UCHAR_MAX, it's good that NetBSD doesn't do that.
Huh? That makes no sense whatsoever.
What do you think the current NetBSD implementation does when given
EOF anyway? How about when it's given a signed integer variable that
has been assigned the value of EOF? How about any other negative
number which some user might have thought to be a useful way of
extending the error reporting possible in such a situation?
FreeBSD, OpenBSD, and Darwin all seem to have much better
implementations, though they are all using proper (inline) functions
which makes it easier in some ways to do it right.)
I am mildly curious. In what way are they "better"?
Well they can't as easily be responsible for causing a program to
crash, for example.
With GCC on any NetBSD architecture there are only two correct ways to
access an array of fewer than UINT_MAX elements (e.g. one of
UCHAR_MAX + 1 elements) using a macro:
Given:
int ctype[UCHAR_MAX + 1] = { 0 };
Either:
#define _ctype(i) ctype[((i) & 0xFF)]
Or:
#define _ctype(i) ctype[(unsigned char)(i)]
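One detail worth illustrating for the cast form: a cast binds more tightly than binary operators, so the macro parameter needs its own parentheses if the macro may be handed an expression. A minimal sketch, using hypothetical macro names IDX_CAST_BARE and IDX_CAST_PAREN that are not part of any real ctype.h:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical macros, for illustration only. */
#define IDX_CAST_BARE(i)  ((unsigned char) i)    /* cast applies to the first operand only */
#define IDX_CAST_PAREN(i) ((unsigned char)(i))   /* cast applies to the whole argument */

/* Subscript produced by the unparenthesized form for base + delta. */
int bare_index(int base, int delta)
{
    /* Expands to ((unsigned char) base + delta), which can be negative. */
    return IDX_CAST_BARE(base + delta);
}

/* Subscript produced by the parenthesized form for base + delta. */
int paren_index(int base, int delta)
{
    int idx = IDX_CAST_PAREN(base + delta);

    assert(idx >= 0 && idx <= UCHAR_MAX);
    return idx;
}
```

With base 0 and delta -1 the bare form yields -1, while the parenthesized form yields UCHAR_MAX.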
I recommend the following slightly more portable technique for ctype.h:
#define _CTYPE_MASK (~(UINT_MAX << CHAR_BIT))
#define isdigit(c) ((int)((_ctype_ + 1)[(c) & _CTYPE_MASK] & _N))
The only problem here is the slightly confusing warning (if warnings
are enabled) given when a negative integer constant (such as EOF) is
explicitly passed to one of these macros, though it is not much more
confusing than the current one given for parameters of type "char".
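For reference, a small sketch of what the mask expression evaluates to, assuming unsigned int is wider than CHAR_BIT bits. The names CTYPE_MASK_SKETCH and masked_subscript are hypothetical and are not taken from any real ctype.h.

```c
#include <assert.h>
#include <limits.h>

/* The mask expression from above, under a hypothetical name. */
#define CTYPE_MASK_SKETCH (~(UINT_MAX << CHAR_BIT))

/* Subscript in 0..UCHAR_MAX for any int argument, including negative ones. */
unsigned int masked_subscript(int c)
{
    /* c is converted to unsigned int before the AND, so the result is well defined. */
    unsigned int idx = (unsigned int)c & CTYPE_MASK_SKETCH;

    assert(idx <= UCHAR_MAX);
    return idx;
}
```

The mask works out to UCHAR_MAX without hard-coding 0xFF, and with it an argument of -1 yields the subscript UCHAR_MAX.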
On the other hand there may be some merit in adopting the OpenBSD or
FreeBSD implementations, though I don't yet understand their
implications in face of the NetBSD way of doing wchar_t et al.
--
Greg A. Woods; Planix, Inc.
<woods%planix.ca@localhost>