Current-Users archive


Re: isspace() behaviour



    Date:        Mon, 17 Feb 2025 23:20:31 -0500
    From:        Tom Lane <tgl%sss.pgh.pa.us@localhost>
    Message-ID:  <1164928.1739852431%sss.pgh.pa.us@localhost>

  | Well, if we're digging up ancient history ...

Indeed.   And ditto.

  | It's difficult to be sure
  | whether they were expected to behave sanely for EOF,

K&R might not have been clear, but the 7th edition ctype.3 man page
certainly is.   It states:

	These macros classify ASCII-coded integer values
	by table lookup.
	Each is a predicate returning nonzero for true,
	zero for false.
	.I Isascii
	is defined on all integer values; the rest
	are defined only where 
	.I isascii
	is true and on the single non-ASCII value
	EOF (see
	.IR stdio (3)).

To put that into context: 6th edition and earlier had nothing
equivalent; as far as the world outside Bell Labs was concerned,
<ctype.h> and everything associated with it first appeared in
the 7th edition.   First edition K&R (slightly) predates the
release of the 7th edition (they were close, in historical terms anyway).

For what it is worth, the definition of isascii() was

	#define isascii(c)	((unsigned)(c) <= 0177)

All the rest of the is*() macros used:

	((_ctype + 1)[c] & _WHATEVER)

where _ctype was a 129 byte array whose [0] entry was 0 (that one,
once the +1 is applied, is what EOF, being -1, maps to) and all the
rest contained whichever class bits applied to the 7 bit value in
question.
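
A minimal sketch of that table lookup, for illustration only.   Every
my_/MY_ name, the bit values, and the run-time initialisation are
assumptions made for this example (they are not the 7th edition
source), and the loops assume an ASCII execution character set:

```c
/* Hypothetical names throughout, chosen to stay clear of <ctype.h>. */
#define MY_EOF	(-1)
#define MY_U	01	/* upper case letter */
#define MY_N	04	/* digit */
#define MY_S	010	/* white space */

/* 129 entries: [0] is the EOF slot, [1..128] describe ASCII 0..127. */
static unsigned char my_ctype[129];

/* Filled at run time here purely to keep the example short. */
static void
my_ctype_init(void)
{
	int c;

	for (c = 'A'; c <= 'Z'; c++)
		my_ctype[c + 1] |= MY_U;
	for (c = '0'; c <= '9'; c++)
		my_ctype[c + 1] |= MY_N;
	my_ctype[' ' + 1] |= MY_S;
	my_ctype['\t' + 1] |= MY_S;
	my_ctype['\n' + 1] |= MY_S;
	/* my_ctype[0] is never touched, so every class test on EOF is 0. */
}

#define my_isascii(c)	((unsigned)(c) <= 0177)
#define my_isupper(c)	((my_ctype + 1)[c] & MY_U)
#define my_isdigit(c)	((my_ctype + 1)[c] & MY_N)
#define my_isspace(c)	((my_ctype + 1)[c] & MY_S)
```

After my_ctype_init(), my_isupper(MY_EOF) indexes my_ctype[0] and so
is false, while any argument outside -1..127 would index outside the
array, which is why the man page leaves those cases undefined.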

And for completeness, toupper() and tolower() did simple arithmetic
(with no tests involved) upon the value of the parameter.
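
To illustrate what "no tests involved" implies, here is a sketch that
assumes the arithmetic was a fixed offset between the two cases.   The
my_ names are hypothetical, and the values in the comments assume
ASCII:

```c
/* Hypothetical stand-ins: pure arithmetic, no check of the argument. */
#define my_toupper(c)	((c) - 'a' + 'A')
#define my_tolower(c)	((c) - 'A' + 'a')

/*
 * my_tolower('Q') is 'q', as intended.
 * my_tolower('1') is 'Q' (49 - 65 + 97 = 81): the offset is applied blindly.
 * my_tolower('q') is 145, which is not an ASCII value at all.
 */
```

Macros like these are only meaningful when the caller has already
established that the argument is a letter of the opposite case.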

Correct code required

	if (isascii(c) && isupper(c))
		c = tolower(c);

or logic which produced the equivalent effect.   The param 'c' could
be any integer type (the isascii() call guarantees that it is a
non-negative 7 bit value, hence not EOF).   There is no need to test
for EOF in the above, as isupper(EOF) is false (isanything(EOF) is
false), but in cases where only EOF or ASCII chars were possible, the
isascii() test could be dropped, and things still worked.
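
The guarded idiom can be wrapped up as follows.   This is a sketch
under the same assumptions as the description above: the my_ names
and the function my_fold() are hypothetical, the table carries only
an upper case bit, and ASCII is assumed:

```c
#define MY_EOF	(-1)
#define MY_U	01	/* upper case letter */

/* [0] is the EOF slot and stays 0; [1..128] describe ASCII 0..127. */
static unsigned char my_ctype[129];

static void
my_ctype_init(void)
{
	int c;

	for (c = 'A'; c <= 'Z'; c++)
		my_ctype[c + 1] |= MY_U;
}

#define my_isascii(c)	((unsigned)(c) <= 0177)
#define my_isupper(c)	((my_ctype + 1)[c] & MY_U)
#define my_tolower(c)	((c) - 'A' + 'a')

/* Lower-case c if it is an ASCII upper case letter, else return it as is. */
static int
my_fold(int c)
{
	/* my_isascii() runs first, so the table is never indexed out of range. */
	if (my_isascii(c) && my_isupper(c))
		c = my_tolower(c);
	return c;
}
```

With the table initialised, my_fold('A') yields 'a', and EOF, 0351, or
1000 all come back unchanged because the isascii() test rejects them
before the table is consulted.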

kre

ps: I would not attempt to apply those rules to modern programming,
but it certainly explains their history, including how we got to what
we have now, and various other folklore that has been attached over
the ages.



