On Wed, 13 Dec 2023, Johnny Billquist wrote:
> > When running 1.4T on my VAX emulator, I've noticed clock drift. One of
> > my "I want to do someday" things is to figure out what's going on with
> > that. (And that's 1.4T, too, not "these days".)
> Do you know if that was drift affected by load, or just simple drift?
> Because if you run a VAX without something like ntp, you will *always* have
> some drift. The clock interrupt on the VAX isn't very precise. It's running at
> 100 Hz, but the error is commonly a couple of percent, meaning over a day you
> can easily have a drift of a minute or two.
Ntpd is able to compensate for systematic clock drift, but not for random
drift, which eventually makes ntpd quit. With lizzie, which is a KA46, I
have always observed ntpd giving up and the clock then drifting away. This
is with:
lizzie$ uname -a
NetBSD lizzie 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC 2020 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/vax/compile/GENERIC vax
lizzie$
and I now have:
lizzie$ uptime
10:53PM up 107 days, 10:17, 2 users, load averages: 0.02, 0.12, 0.07
lizzie$ date
Wed Dec 13 22:53:45 UTC 2023
lizzie$
while the correct time is:
angie$ date
Wed 13 Dec 21:48:49 UTC 2023
angie$
so the clock has been running fast (by about 1 hour 5 minutes), which is
not what lost interrupts would produce, as those make a clock run slow.
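
As a rough sanity check on the two `date' readings above, here is a minimal
Python sketch that turns them into an average drift rate. It assumes the two
commands were run at about the same moment and that the offset built up
evenly over the whole reported uptime. Neither is established here (ntpd was
running for part of that time), so treat the result as an order-of-magnitude
figure only. The function name is made up for illustration.

```python
from datetime import datetime, timedelta

# Readings copied from the transcripts above (both UTC).
LIZZIE_NOW = datetime(2023, 12, 13, 22, 53, 45)
REFERENCE_NOW = datetime(2023, 12, 13, 21, 48, 49)

# Uptime reported by lizzie: 107 days, 10:17.
UPTIME = timedelta(days=107, hours=10, minutes=17)


def average_drift(local, reference, elapsed):
    """Return (offset, rate); a positive offset means `local` is ahead.

    Assumption: the offset accumulated evenly over `elapsed`.
    """
    offset = local - reference
    rate = offset.total_seconds() / elapsed.total_seconds()
    return offset, rate


offset, rate = average_drift(LIZZIE_NOW, REFERENCE_NOW, UPTIME)
print(f"offset:       {offset}")
print(f"average rate: {rate * 1e6:.0f} ppm")
print(f"per day:      {rate * 86400:.1f} s")
```

Under those assumptions this comes out at roughly 420 ppm, or a little over
half a minute gained per day.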
Now lizzie has been sitting mostly idle since I last ran GCC regression
testing in mid-October, except for occasional malicious network connection
attempts, which do take some processing power from the venerable machine.
I can resynchronise the clock, rerun ntpd and see if the daemon survives
while the machine is idle, say, until tomorrow.
What is notable, however, is the particularly high clock jitter reported
for lizzie, unlike my various other systems, including similarly old and
slow ones such as a KN03 DECstation MIPS machine, a 486DX2 PC machine or a
dual P5-MMX PC machine.
For lizzie the jitter jumps between ~50 and ~150, while for the P5-MMX it
is below 10, for the 486DX2 it is below 0.5 and for the KN03 it is even
below 0.1.
High jitter may legitimately happen with systems that rely solely on the
timer interrupt for timekeeping and consequently have a very coarse time
resolution, because the accuracy of the time returned by system facilities
such as gettimeofday(2) or ntp_gettime(2) will then depend on how much
time has passed since the last timer interrupt tick when the call is made.
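
To illustrate the effect, here is a small Python simulation of a clock that
only advances on timer ticks. It is a sketch rather than a model of any real
kernel: the 100 Hz rate is the figure quoted earlier in the thread, the
function names are made up, and it assumes callers ask for the time at
uniformly random moments.

```python
import random

HZ = 100                      # timer interrupt rate quoted in the thread
TICK_US = 1_000_000 // HZ     # 10000 us between ticks


def tick_only_clock_us(true_us):
    """Time in microseconds as seen by a clock that only moves on ticks."""
    return (true_us // TICK_US) * TICK_US


def read_errors_us(n, seed=0):
    """Simulate n clock reads at random moments and return their errors.

    Assumption: reads arrive at uniformly random times within one minute.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        true_us = rng.randrange(0, 60_000_000)
        errors.append(true_us - tick_only_clock_us(true_us))
    return errors


errors = read_errors_us(10_000)
print(f"max error:  {max(errors)} us")
print(f"mean error: {sum(errors) / len(errors):.0f} us")
```

Every read is stale by anything from 0 up to just under one tick (10 ms at
100 Hz), and which value you get depends only on when the call happens to
land, which is what shows up as jitter.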
As I recall the KA46 does have a high-precision timer though, so it seems
like there is something fishy going on here: either the calculation of the
fractional part of timer interrupt ticks isn't right or the latency of the
clock retrieval system facilities is highly variable for some reason.
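
One way to start telling the two apart from userland might be to read the
clock back to back and look at how far it moves between reads. The Python
sketch below does that; it assumes Python 3.7 or later is usable on the
machine in question, and the helper names are made up. It only shows what a
userland caller sees and cannot by itself say where in the kernel the
problem lies.

```python
import time


def clock_steps(clock=time.time_ns, samples=10_000):
    """Return the nonzero differences between back-to-back clock reads."""
    steps = []
    prev = clock()
    for _ in range(samples):
        now = clock()
        if now != prev:
            steps.append(now - prev)
            prev = now
    return steps


def summarize(steps):
    """Return the smallest, median and largest step, or None if no steps."""
    if not steps:
        return None
    ordered = sorted(steps)
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Steps are in nanoseconds with the default time.time_ns clock.
    print(summarize(clock_steps()))
```

If the smallest step is about one tick, the sub-tick part is not being used
at all. If the typical step is small but the largest ones are far bigger, the
reads themselves are being delayed, though ordinary preemption of the test
process would look the same, so this is a hint rather than proof.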
Maciej