Port-xen archive

Re: ntpd looses sync on domU



At Sun, 09 Feb 2025 20:30:13 -0500, Brad Spencer <brad%anduin.eldar.org@localhost> wrote:
> 
> Emmanuel Dreyfus <manu%netbsd.org@localhost> writes:
> 
> > The setup is Xen 4.18 (xenkernel418-20240909nb1 from pkgsrc), with
> > NetBSD-10.0/amd64 dom0 and domU.
> >
> > From time to time, I experience time keeping trouble on a small set
> > of domUs. The clock drifts a lot, losing hours in a few hours. ntpd
> > is unable to cope, and the only remediation is to reboot the domU.
> > Oddly, the dom0 and the other domUs running on the same dom0 have no
> > clock trouble.

> > I recently discovered that switching kern.timecounter.hardware from 
> > xen_system_time to clockinterrupt helped a lot. The drift remains
> > but is much smaller, and ntpd is able to keep an almost correct time.
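
For scale: the reference ntpd corrects the clock frequency by at most
500 ppm, so it can follow a small drift but not one measured in hours
per hour. The sketch below only does that arithmetic. The helper names
and the sample figures (one hour lost over three hours, two seconds lost
per day) are made up for illustration and are not measurements from this
thread.

```python
# Rough drift arithmetic; the sample figures below are hypothetical.

NTPD_MAX_SLEW_PPM = 500.0  # frequency correction limit of the reference ntpd


def drift_ppm(clock_error_s, elapsed_s):
    """Clock error accumulated over elapsed_s, in parts per million."""
    return clock_error_s / elapsed_s * 1e6


def ntpd_can_slew(ppm):
    """True if a steady drift of this size is within ntpd's frequency limit."""
    return abs(ppm) <= NTPD_MAX_SLEW_PPM


# Hypothetical "hours in a few hours": one hour lost over three hours.
bad = drift_ppm(-3600.0, 3 * 3600.0)

# Hypothetical residual drift: two seconds lost per day.
mild = drift_ppm(-2.0, 24 * 3600.0)

for label, ppm in (("bad", bad), ("mild", mild)):
    print(f"{label}: {ppm:.1f} ppm, ntpd can follow: {ntpd_can_slew(ppm)}")
```

A drift of the first kind is hundreds of times past what ntpd can slew
out, which would fit the report that only a reboot helps.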

xen_system_time is supposed to come from a reliable emulation of the TSC
register running at a well known and fixed 1 GHz.

I would guess clockinterrupt comes from the Xen emulation of the i8254
clock, and that it may work more reliably?  Maybe that's because it
uses the Xen Platform Timer, which on systems with older CPUs and a less
reliable TSC means using HPET.

I should try this too, but it's not feasible for dom0, and I see the
same timekeeping problems in dom0 as well.

On the other hand a domU using clockinterrupt that kept good time while
the dom0 drifted back in time would be an interesting data point!

> > Anyone already experienced that? 

Yes, but only on some hardware, and so far only with Xen 4.18 and newer.

My reports are scattered in the list archives under various subjects
including most recently "now running Xen 4.20-rc in test" and various
older threads with either "timekeeping" or "timecounter" in the subject.

This past week I've been confirming that the problem does not exist in
Xen 4.11 or older, and I don't think it exists in Xen 4.13 either.  I've
not (yet) tried 4.15 or anything else in between.

So far as I can tell there's no reason to believe what I'm seeing is
really any different from what anyone else is reporting.

From what I can tell now, it seems the problem is that the Xen kernel
loses track of the adjustment factor needed to attain 1 GHz.
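
To illustrate what that adjustment factor does, here is a minimal sketch
of the TSC-to-nanoseconds scaling described by the Xen public
vcpu_time_info interface (fields tsc_timestamp, system_time,
tsc_to_system_mul and tsc_shift). It is not the NetBSD or Xen source.
The helper mul_for_hz() and the 2.0 GHz and 2.4 GHz frequencies are
assumptions chosen for the example, and the "stale factor" case only
models the guess above; it is not a confirmed diagnosis.

```python
# Sketch of the Xen PV clock scaling; frequencies below are hypothetical.

def tsc_to_ns(delta_tsc, tsc_to_system_mul, tsc_shift):
    """Scale a TSC delta to nanoseconds (32.32 fixed-point multiply)."""
    if tsc_shift >= 0:
        delta_tsc <<= tsc_shift
    else:
        delta_tsc >>= -tsc_shift
    return (delta_tsc * tsc_to_system_mul) >> 32


def system_time_ns(tsc_now, tsc_timestamp, system_time, mul, shift):
    """System time in ns: last snapshot plus scaled TSC ticks since then."""
    return system_time + tsc_to_ns(tsc_now - tsc_timestamp, mul, shift)


def mul_for_hz(tsc_hz):
    """Hypothetical helper: multiplier for tsc_shift == 0.

    Only valid for tsc_hz above 1 GHz, so the result fits in 32 bits.
    """
    return (10**9 << 32) // tsc_hz


REAL_TSC_HZ = 2_000_000_000      # assumed actual TSC rate
one_second_of_ticks = REAL_TSC_HZ

# Correct factor: one second of ticks maps to 10**9 ns.
good = system_time_ns(one_second_of_ticks, 0, 0, mul_for_hz(REAL_TSC_HZ), 0)

# Stale factor computed for 2.4 GHz: the same ticks map to about 0.83 s,
# so the guest clock falls behind real time.
stale = system_time_ns(one_second_of_ticks, 0, 0, mul_for_hz(2_400_000_000), 0)

print(good, stale)
```

With a wrong multiplier the error grows in proportion to elapsed time,
which is the kind of steady, large drift ntpd cannot slew away.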

> When I moved to PVH DOMUs the problem with ntpd for me went away.

There could be some clues there, but many more details are needed!

-- 
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>
