tech-misc: Re: a new KNF (and some comments)

Subject: Re: a new KNF (and some comments)
To: None <lukem@cs.rmit.edu.au>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: tech-misc
Date: 01/20/2000 17:26:28
On Fri, 21 Jan 2000 11:50:29 +1100 
 Luke Mewburn <lukem@cs.rmit.edu.au> wrote:

 > Well, there seems to be considerable debate on what the KNF should be.
 > Some people seem to believe that we should totally revise the coding
 > style, some want it updated to ANSI, some don't want to change at all.
 > 
 > Here's what I propose:

[snip]

Luke, I really appreciate your taking the time to do that.  I understand
that it's a pain when dealing with coding style, because it really is
like the Crusades for some people (including myself; I'm glad I agree
with most of this stuff :-)

I do have a few change requests, tho...

 > #include <sys/cdefs.h>
 > #ifndef lint
 > __COPYRIGHT("@(#) Copyright (c) 1999\n\
 > 	The NetBSD Foundation, inc. All rights reserved.\n");
 > __RCSID("$NetBSD$");
 > #endif /* not lint */

Actually, I think the __COPYRIGHT() and __RCSID() should be changed in the
header files to provide their own ;'s, and the source and guide updated
appropriately.  This way if you conditionally compile out the RCS IDs, the
compiler won't choke on you (nor will lint(1)).

 > /*
 >  * ANSI function declarations for private functions (i.e. functions not used
 >  * elsewhere) go at the top of the source module.  Only the kernel has a name
 >  * associated with the types.  I.e. in the kernel use:
 >  *	void function(int a);
 >  * in user-land use:
 >  *	void function(int);

Please nuke the "kernel version" of that.  As I recently discovered, this
can have somewhat annoying side-effects if you're scanning a tree with
e.g. id-utils for an instance of a global variable, and function prototypes
have an argument in them of the same name.

 > /*
 >  * Macros are capitalized, parenthesized, and should avoid side-effects.
 >  * If they are an inline expansion of a function, the function is defined
 >  * all in lowercase, the macro has the same name all in uppercase.  If the
 >  * macro needs more than a single statement, use do { ... } while (0), so
 >  * that * a trailing semicolon works.  Right-justify the backslashes; it
 >  * makes it easier to read.
 >  */
 > #define	MACRO(x, y)	do {						\
 > 	variable = (x) + (y);						\
 > 	(y) += 2;							\
 > } while (0)

Could we change that to read:

#define	MACRO(x, y)							\
do {									\
	variable = (x) + (y);						\
	(y) += 2;							\
} while (0)

I find it a little easier to quickly find beginnings and ends of macro
bodies that way... and it makes it a bit more consistent in the presence
of widely-varying macro name lengths.

 > /* Enum types are capitalized. */
 > enum enumtype {
 > 	ONE,
 > 	TWO
 > } et;
 > 
 > /*
 >  * When declaring variables in structures, declare them sorted by use, then
 >  * by size, and then by alphabetical order.  The first category normally
 >  * doesn't apply, but there are exceptions.
 >  *	XXX: change the above
 >  * Each variable gets its own line.
 >  * Attempt to line-up the entries, using appropriate tabs.

I think we should also put a blurb in here about bitfields.  I find it
hard to keep track of the following:

struct foo {
	u_int8_t	bar:1;
	u_int8_t	baz:5;
	u_int8_t	zap:2;
};

I think that should be written as:

struct foo {
	u_int8_t	bar:1,
			baz:5,
			zap:2;
};

...to drive home that they're all part of the same byte.

 > 	while ((ch = getopt(argc, argv, "abn")) != -1)

...in the case of large while(), if(), etc. bodies which technically don't
require { }'s, I personally like to add them for clarity.  I generally define
"large" as "more than a few lines".

 > 		switch (ch) {		/* Indent the switch. */
 > 		case 'a':		/* Don't indent the case. */
 > 			aflag = 1;
 > 			/* FALLTHROUGH */
 > 		case 'b':
 > 			bflag = 1;
 > 			break;
 > 		case 'n':
 > 			num = strtol(optarg, &ep, 10);
 > 			if (num <= 0 || *ep != '\0')
 > 				errx(1, "illegal number -- %s", optarg);
 > 			break;
 > 		case '?':
 > 		default:
 > 			usage();
 > 			/* NOTREACHED */
 > 		}
 > 	argc -= optind;
 > 	argv += optind;
 > 
 > 	/*
 > 	 * Space after keywords (while, for, return, switch).  No braces are
 > 	 * used for control statements with zero or only a single statement.
 > 	 *
 > 	 * Forever loops are done with for's, not while's.
 > 	 */
 > 	for (p = buf; *p != '\0'; ++p);

I've found bugs related to extra ;'s at the end of statements like this.
How about:

	for (p = buf; *p != '\0'; ++p)
		/* nothing */ ;

?  For clarity.

 > 	/* No spaces after function names. */
 > 	if (error = function(a1, a2))
 > 		exit(error);

That should read:

	if ((error = function(a1, a2)))

but I personally prefer:

	if ((error = function(a1, a2)) != 0)

...for additional clarity.

Unfortunately, not all of these suggestions can be implemented with
indent(1).  However, for new code, I think these few changes will help
code clarity quite a bit; I certainly find them helpful in my own code.

        -- Jason R. Thorpe <thorpej@nas.nasa.gov>