At Sun, 09 Feb 2025 20:30:13 -0500, Brad Spencer <brad%anduin.eldar.org@localhost> wrote:
Subject: Re: ntpd looses sync on domU
>
> Emmanuel Dreyfus <manu%netbsd.org@localhost> writes:
>
> > The setup is Xen 4.1.8 (xenkernel418-20240909nb1 from pkgsrc), with
> > NetBSD-10.0/amd64 dom0 and domU.
> >
> > From time to time, I experience time keeping trouble on a small set
> > of domUs. The clock drifts a lot, loosing hours in a few hours. ntpd
> > is unable to cope, and the only remeidiation is to reboot the domU.
> > Oddly, the dom0 and other domU running on the same dom0 have no clock
> > trouble.
> > I recently discovered that switching kern.timecounter.hardware from
> > xen_system_time to clockinterrupt helped a lot. The drift remain
> > but is much smaller, and ntpd is able to keep an almost correct time.

xen_system_time is supposed to come from a reliable emulation of the TSC
register running at a well known and fixed 1 GHz.

I would guess clockinterrupt comes from the Xen emulation of the i8254
clock, and that it may work more reliably?  Maybe that's because it
uses the Xen Platform Timer, which on systems with older CPUs and a less
reliable TSC means using HPET.

I should try this too, but it's not feasible for dom0, and I see the
same timekeeping problems in dom0 as well.  On the other hand, a domU
using clockinterrupt that kept good time while the dom0 drifted back in
time would be an interesting data point!

> > Anyone already experienced that?

Yes, but only on some hardware, and so far only with Xen 4.18 and newer.

My reports are scattered in the list archives under various subjects
including most recently "now running Xen 4.20-rc in test" and various
older threads with either "timkeeping" or "timecounter" in the subject.

This past week I've been confirming that the problem does not exist in
Xen 4.11 or older, and I don't think it exists in Xen 4.13 either.  I've
not (yet) tried 4.15 or anything else in between.
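
To illustrate why the fixed 1 GHz rate of xen_system_time mentioned above
depends on a correct scale factor, here is a rough Python sketch of the
shift-and-multiply scaling that, as far as I understand it, Xen publishes
to guests in vcpu_time_info (the tsc_to_system_mul and tsc_shift fields).
The helper names and the two TSC frequencies are made up for
illustration; this is not code from Xen or NetBSD.

```python
NS_PER_SEC = 1_000_000_000


def scale_for(tsc_hz: int) -> tuple[int, int]:
    """Return a (mul, shift) pair mapping tsc_hz ticks to 1e9 ns.

    Hypothetical helper: it only mirrors the idea of a 32-bit fractional
    multiplier plus a binary shift.
    """
    shift = 0
    hz = tsc_hz
    # Normalise hz into (1e9, 2e9] so the multiplier fits in 32 bits
    # while keeping as much precision as possible.
    while hz > 2 * NS_PER_SEC:
        hz >>= 1
        shift -= 1
    while hz <= NS_PER_SEC:
        hz <<= 1
        shift += 1
    mul = (NS_PER_SEC << 32) // hz
    return mul, shift


def tsc_delta_to_ns(delta: int, mul: int, shift: int) -> int:
    """Scale a TSC tick delta to nanoseconds: shift first, then multiply."""
    if shift >= 0:
        delta <<= shift
    else:
        delta >>= -shift
    return (delta * mul) >> 32


# Assumed numbers: the scale factor was computed for 2.4 GHz, but the
# counter really advances 0.1% faster.
nominal_hz = 2_400_000_000
actual_hz = 2_402_400_000

mul, shift = scale_for(nominal_hz)
ticks_in_hour = actual_hz * 3600
reported_ns = tsc_delta_to_ns(ticks_in_hour, mul, shift)
error_s = reported_ns / NS_PER_SEC - 3600

print(f"clock error after one real hour: {error_s:+.3f} s")
```

With these assumed numbers a 0.1% mismatch already amounts to about 3.6
seconds per hour, so a factor that is badly wrong could plausibly account
for drift of the size reported in this thread.
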
So far as I can tell there's no reason to believe what I'm seeing is
really any different from what anyone else is reporting.

From what I can tell now it seems the problem is the Xen kernel loses
track of the adjustment factor needed to attain 1 GHz.

> When I moved to PVH DOMUs the problem with ntpd for me went away.

There could be some clues there, but many more details are needed!

-- 
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>