Port-xen archive


Re: timekeeping regression?



>>>>> Greg A Woods <woods%planix.ca@localhost> writes:

    > At Wed, 06 Mar 2024 04:25:45 +0000, "Mathew, Cherry G.*" <c%bow.st@localhost> wrote:
    > Subject: Re: timekeeping regression?
    >> 
    >> >>>>> Greg A Woods <woods%planix.ca@localhost> writes:
    >> 
    >> > At Sun, 18 Feb 2024 01:40:52 -0800, "Greg A. Woods" <woods%planix.ca@localhost> wrote:
    >> > Subject: Re: timekeeping regression?
    >> >>
    >> >> Still looking for a Round Tuit to do the investigation into why
    >> >> "clocksource=tsc" isn't taking effect, and it'll have to wait a
    >> >> couple more weeks now, so if anyone else beats me to it.....
    >> 
    >> > I did some code-reading and added some printk's to the Xen kernel
    >> > and discovered the reason "clocksource=tsc" didn't work is because
    >> > none of my Xen machines have X86_FEATURE_TSC_RELIABLE, and when I
    >> > faked it the "warp" detection check invalidated it anyway.
    >> 
    >> > I think I just discovered the difference between the "good" and
    >> > "bad" machines.  The "good" ones both still had
    >> > "dom0_vcpus_pin=true".
    >> 
    >> > I'll reboot the bad one with that added again soon and see how it
    >> > does.
    >> 
    >> This is interesting. I imagine that the current timecounter(9) MD
    >> code doesn't account for the backing physical CPU being yanked
    >> out from under it, assuming that it relies on that CPU's TSC for
    >> timekeeping.
    >> 
    >> Remind me: did you say that this is only relevant for
    >> "clocksource=tsc"?

    > No, I'm saying "clocksource=tsc" has no effect whatsoever on any
    > of the machines I have.  The CPUs are Xeon 54xx and 56xx, and Xen
    > finds their TSC registers can "warp" backwards in time (i.e. they
    > really are unreliable), so Xen refuses to use them as its platform
    > timer even if I hack the code to fake a TSC_RELIABLE id bit.

    > 	TSC warp detected, disabling TSC_RELIABLE

    > So the platform timer stays as HPET.

    > 	Platform timer is 14.318MHz HPET

    > However that only seems to work reliably when the dom0 CPUs are
    > "pinned", or maybe if there's only one dom0 CPU.  Here I'm running
    > all dom0's with multiple CPUs (normally 2, but up to 8 in one
    > case).

    > Normally I had always used "dom0_vcpus_pin=true", but I had
    > removed it on the one machine following what turns out to be
    > incomplete advice in the NetBSD Xen HowTo about this option.

    > Since I'm typing this mail on a VM of the "bad" (not-pinned)
    > machine I'll reboot it after it is sent.

    > I suspect the problem might be in how the dom0 timecounter is
    > sourced from the Xen kernel, but I don't know the code and I don't
    > know how or why not being pinned to a pCPU might affect it.

    > Note, I found some old discussion related to the origin of the
    > dom0_vcpus_pin option that suggested it was necessary to allow
    > dom0 vCPUs to actually be pinned to pCPUs and in effect to prevent
    > the Xen kernel from trying to do CPU clock scaling (on those
    > pCPUs), but I don't think clock scaling is even possible in the
    > first place on any of the CPUs I have running Xen.
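
For readers unfamiliar with the "warp" check mentioned above, here is a
minimal illustrative sketch of the idea, not Xen's actual code. It assumes
TSC reads from several CPUs have been collected in one global order (for
example, each taken under a shared lock) and reports the largest backward
step between consecutive reads. The function name and the sample data are
hypothetical.

```python
def max_tsc_warp(readings):
    """Return the largest backward step seen in globally ordered TSC reads.

    `readings` is an iterable of (cpu_id, tsc_value) pairs in the order
    the reads were taken.  A result of 0 means no warp was observed.
    """
    last = None
    worst = 0
    for _cpu, tsc in readings:
        # A later read that is smaller than the previous one means the
        # counter appeared to run backwards between CPUs: a "warp".
        if last is not None and tsc < last:
            worst = max(worst, last - tsc)
        last = tsc
    return worst


if __name__ == "__main__":
    # Hypothetical sample: CPU 1 lags CPU 0 on the fourth read.
    samples = [(0, 100), (1, 105), (0, 110), (1, 108), (0, 120)]
    print("max warp:", max_tsc_warp(samples))
```

Any non-zero result would be grounds for treating the TSC as unreliable,
which matches the "TSC warp detected, disabling TSC_RELIABLE" message
quoted above.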

This is useful, thank you - I'll be looking more closely at
timecounter(9) in the context of investigating tickless in the next few
weeks, and Xen will be a first-class citizen for my testing setup.

I'll get back to you if I have any further questions. Thanks!

-- 
MatC/(~cherry)

