Subject: DELAY on i386 change?
To: None <perry@piermont.com>
From: Gordon W. Ross <gwr@netbsd.org>
List: tech-kern
Date: 03/28/1999 22:02:35
Another useful trick takes advantage of the fact that delay(n) is
never called with n larger than about 5 million, so delay() can be
a macro that passes a fixed multiple of the microsecond count (here
usec * 256) to the actual delay function.  For example:

	/* param.h */
	#define	DELAY(n)	delay(n)
	#define delay(us)	_delay((us)<<8)
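	extern void _delay __P((int));

	/* machdep.c? */
	int delay_divisor = 25;	/* tuned at boot */

	void
	_delay(usecX256)
		volatile int usecX256;	/* signed, so the loop ends at <= 0 */
	{
	  /* volatile keeps the compiler from optimizing the loop away */
	  while (usecX256 > 0)
	    usecX256 = usecX256 - delay_divisor;
	}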

With something like the above, the shortest possible delay is
nearly the same as the time for a function call/return.
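
To make the arithmetic concrete, here is a small sketch that counts
how many times the loop would spin for a given request, and works out
the largest request that survives the <<8 scaling.  The helper names
(delay_iterations, delay_max_usec) are made up for illustration, and
it assumes the scaled count is held in a signed int.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: number of passes the _delay() loop makes
 * for a request of `us' microseconds with the given divisor.
 */
static long
delay_iterations(int us, int divisor)
{
	int scaled;
	long n = 0;

	assert(us >= 0 && divisor > 0);
	scaled = us << 8;		/* same scaling as the delay() macro */
	while (scaled > 0) {
		scaled -= divisor;
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: largest microsecond count whose scaled
 * value (us << 8) still fits in a signed int.
 */
static int
delay_max_usec(void)
{
	return INT_MAX >> 8;
}
```

With a divisor of 25, a 1 usec request makes 11 passes and a 100 usec
request makes 1024.  With a 32-bit int, delay_max_usec() is 8388607,
which is why a ceiling of about 5 million is comfortable.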

(The above is how I dealt with this on the sun3.  I didn't bother
to tune delay_divisor dynamically, since on the sun3 the clock
rates and the matching delay_divisor values are all well known.)
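
If one did want to tune delay_divisor at boot, it might look roughly
like the sketch below.  This is a userland approximation, not the
sun3 code: clock_gettime() stands in for whatever reference timer
the kernel would really use, and the function names are hypothetical.
The idea is to time a known number of loop passes and pick the
divisor so that 256 units are consumed per microsecond.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L		/* for clock_gettime() */
#endif
#include <assert.h>
#include <time.h>

/*
 * Hypothetical: given that `iterations' loop passes took
 * `elapsed_ns' nanoseconds, return the divisor that makes one
 * microsecond (256 scaled units) last the right number of passes.
 */
static int
divisor_from_rate(long long iterations, long long elapsed_ns)
{
	long long d;

	assert(elapsed_ns >= 0);
	if (iterations <= 0)
		return 1;
	d = (256LL * elapsed_ns) / (1000LL * iterations);
	return d < 1 ? 1 : (int)d;	/* never let the divisor reach 0 */
}

/* Hypothetical boot-time hook: time the spin loop once. */
static int
calibrate_delay_divisor(void)
{
	const long long iterations = 1000000;
	volatile long long n = iterations;
	struct timespec t0, t1;
	long long ns;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (n > 0)
		n = n - 1;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	    (t1.tv_nsec - t0.tv_nsec);
	return divisor_from_rate(iterations, ns);
}
```

A machine that manages 10 passes per microsecond comes out at 25
(25.6 truncated).  Note that a loop faster than 256 passes per
microsecond would want a divisor below 1, so the sketch clamps to 1;
a larger scale factor than <<8 would be needed there.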

Perry E. Metzger writes:
 > 
 > Right now, DELAY() on port-i386 loses badly on delays less than
 > 5usec. (To see why, look at the code -- 5 is a magic constant.)
 > 
 > I was bitten badly by this when trying to fix a bug in the line
 > printer driver a couple of days ago.
 > 
 > I'm proposing (no, not for 1.4 -- for post 1.4) that we
 > 
 > 1) Calibrate a loop timer on bootup, and
 > 2) use it (thusly) in DELAY() for low values
 > 
 > #define DELAY(x) do {							   \
 > 			volatile int _i;				   \
 > 			if ((x) < 10)					   \
 > 				for (_i = 0; _i < delay_table[(x)]; _i++)  \
 > 					continue;			   \
 > 			else						   \
 > 				delay(x);				   \
 > } while (0)
 > 
 > 
 > (Note that delay_table[x] could just be loops_per_usec*x or some such
 > -- I just thought of a table lookup because it would be faster on
 > genuine ancient i386es and such, but it is probably silly. Naturally,
 > either the table or the loops_per_usec variables are calibrated
 > during boot.)
 > 
 > Note that if x is a constant, the optimizer will get rid of the if for 
 > you. Most of the time, the x IS a constant, so this is no greater
 > overhead for the majority of DELAY() calls.
 > 
 > Why, you may ask, call delay() instead of always using the loop?
 > Because the processors in many laptops change speed depending on
 > power conditions. Right now, very short delays are completely broken,
 > so this causes no additional harm on them and actually helps, but
 > delay() works regardless of clock speed and I'd rather not break
 > longer delays.
 > 
 > BTW, this code has the advantage that it actually *can*, in most
 > cases, give you accurate delays of between 1 and 9 usec if the clock
 > stays constant. You avoid the procedure call overhead that might cause 
 > you pain on very slow machines, and it should still be reasonably
 > accurate on fast ones.
 > 
 > Comments, anyone?
 > 
 > Perry
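
For reference, filling the quoted delay_table from a calibrated
loops_per_usec could be sketched as below.  This is only an
illustration of the table idea: the init function, the table size
constant, and short_delay() are hypothetical names, and
loops_per_usec is assumed to come from some boot-time calibration.

```c
#include <assert.h>

#define DELAY_TABLE_SIZE 10	/* covers requests of 0..9 usec */

static int delay_table[DELAY_TABLE_SIZE];

/* Hypothetical boot-time hook: precompute loop counts per usec. */
static void
delay_table_init(int loops_per_usec)
{
	int us;

	assert(loops_per_usec > 0);
	for (us = 0; us < DELAY_TABLE_SIZE; us++)
		delay_table[us] = loops_per_usec * us;
}

/*
 * Spin for a short delay using the table.  Returns the loop
 * count used, purely so the result can be inspected.
 */
static int
short_delay(int us)
{
	volatile int i;		/* volatile so the loop is not removed */

	assert(us >= 0 && us < DELAY_TABLE_SIZE);
	for (i = 0; i < delay_table[us]; i++)
		continue;
	return delay_table[us];
}
```

The table only saves a multiply per call, so computing
loops_per_usec * x directly would likely do just as well.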