At Wed, 06 Mar 2024 04:25:45 +0000, "Mathew, Cherry G.*" <c%bow.st@localhost> wrote:
Subject: Re: timekeeping regression?
>
> >>>>> Greg A Woods <woods%planix.ca@localhost> writes:
>
> > At Sun, 18 Feb 2024 01:40:52 -0800, "Greg A. Woods" <woods%planix.ca@localhost> wrote:
> > Subject: Re: timekeeping regression?
> >>
> >> Still looking for a Round Tuit to do the investigation into why
> >> "clocksource=tsc" isn't taking effect, and it'll have to wait a
> >> couple more weeks now, so if anyone else beats me to it.....
>
> > I did some code-reading and added some printk's to the Xen kernel
> > and discovered the reason "clocksource=tsc" didn't work is because
> > none of my Xen machines have X86_FEATURE_TSC_RELIABLE, and when I
> > faked it the "warp" detection check invalidated it anyway.
>
> > I think I just discovered the difference between the "good" and
> > "bad" machines. The "good" ones both still had
> > "dom0_vcpus_pin=true".
>
> > I'll reboot the bad one with that added again soon and see how it
> > does.
>
> This is interesting. I imagine that the current timecounter(9) MD code
> doesn't factor in the backing hardware physical CPU being yanked from
> under it, assuming that it then relies on the TSC from it, for
> timekeeping.
>
> Did you say that this is only relevant for "clocksource=tsc" again ?

No, I'm saying "clocksource=tsc" has no effect whatsoever on any of the
machines I have.  The CPUs are Xeon 54xx and 56xx, and Xen finds their
TSC registers can "warp" backwards in time (i.e. they really are
unreliable), so Xen refuses to use them as its platform timer even if I
hack the code to fake a TSC_RELIABLE id bit.

	TSC warp detected, disabling TSC_RELIABLE

So the platform timer stays as HPET.

	Platform timer is 14.318MHz HPET

However that only seems to work reliably when the dom0 CPUs are
"pinned", or maybe if there's only one dom0 CPU.  Here I'm running all
dom0's with multiple CPUs (normally 2, but up to 8 in one case).
Normally I had always used "dom0_vcpus_pin=true", but I had removed it
on the one machine following what turns out to be incomplete advice in
the NetBSD Xen HowTo about this option.

Since I'm typing this mail on a VM of the "bad" (not-pinned) machine
I'll reboot it after it is sent.

I suspect the problem might be in how the dom0 timecounter is sourced
from the Xen kernel, but I don't know the code and I don't know how or
why not being pinned to a pCPU might affect it.

Note, I found some old discussion related to the origin of the
dom0_vcpus_pin option that suggested it was necessary to allow dom0
vCPUs to actually be pinned to pCPUs and in effect to prevent the Xen
kernel from trying to do CPU clock scaling (on those pCPUs), but I don't
think clock scaling is even possible in the first place on any of the
CPUs I have running Xen.

-- 
Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675
RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>
Avoncote Farms <woods%avoncote.ca@localhost>