Port-xen archive
Re: timekeeping regression?
>>>>> Greg A Woods <woods%planix.ca@localhost> writes:
> At Wed, 06 Mar 2024 04:25:45 +0000, "Mathew, Cherry G.*" <c%bow.st@localhost> wrote:
> Subject: Re: timekeeping regression?
>>
>> >>>>> Greg A Woods <woods%planix.ca@localhost> writes:
>>
>> > At Sun, 18 Feb 2024 01:40:52 -0800, "Greg A. Woods" <woods%planix.ca@localhost> wrote:
>> > Subject: Re: timekeeping regression?
>> >>
>> >> Still looking for a Round Tuit to do the investigation into why
>> >> "clocksource=tsc" isn't taking effect, and it'll have to wait a
>> >> couple more weeks now, so if anyone else beats me to it.....
>>
>> > I did some code-reading and added some printk's to the Xen kernel
>> > and discovered the reason "clocksource=tsc" didn't work is because
>> > none of my Xen machines have X86_FEATURE_TSC_RELIABLE, and when I
>> > faked it the "warp" detection check invalidated it anyway.
>>
>> > I think I just discovered the difference between the "good" and
>> > "bad" machines. The "good" ones both still had
>> > "dom0_vcpus_pin=true".
>>
>> > I'll reboot the bad one with that added again soon and see how it
>> > does.
>>
>> This is interesting. I imagine that the current timecounter(9) MD
>> code doesn't factor in the backing hardware physical CPU being
>> yanked from under it, assuming that it then relies on the TSC
>> from it, for timekeeping.
>>
>> Did you say that this is only relevant for "clocksource=tsc"
>> again ?
> No, I'm saying "clocksource=tsc" has no effect whatsoever on any
> of the machines I have. The CPUs are Xeon 54xx and 56xx, and Xen
> finds their TSC registers can "warp" backwards in time (i.e. they
> really are unreliable), so Xen refuses to use them as its platform
> timer even if I hack the code to fake a TSC_RELIABLE id bit.
>
>     TSC warp detected, disabling TSC_RELIABLE
>
> So the platform timer stays as HPET.
>
>     Platform timer is 14.318MHz HPET
>
> However that only seems to work reliably when the dom0 CPUs are
> "pinned", or maybe if there's only one dom0 CPU. Here I'm running
> all dom0's with multiple CPUs (normally 2, but up to 8 in one
> case).
>
> Normally I had always used "dom0_vcpus_pin=true", but I had
> removed it on the one machine following what turns out to be
> incomplete advice in the NetBSD Xen HowTo about this option.
>
> Since I'm typing this mail on a VM of the "bad" (not-pinned)
> machine I'll reboot it after it is sent.
>
> I suspect the problem might be in how the dom0 timecounter is
> sourced from the Xen kernel, but I don't know the code and I don't
> know how or why not being pinned to a pCPU might affect it.
>
> Note, I found some old discussion related to the origin of the
> dom0_vcpus_pin option that suggested it was necessary to allow
> dom0 vCPUs to actually be pinned to pCPUs and in effect to prevent
> the Xen kernel from trying to do CPU clock scaling (on those
> pCPUs), but I don't think clock scaling is even possible in the
> first place on any of the CPUs I have running Xen.
This is useful, thank you - I'll be looking more closely at
timecounter(9) in the context of investigating tickless in the next few
weeks, and Xen will be a first-class citizen for my testing setup.
I'll get back if I have any further questions. Thanks!
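
For anyone following along, the TSC "warp" idea discussed above can be modelled in a few lines. This is only an illustrative sketch, not Xen's code: the function name max_tsc_warp and the (cpu, tsc) sample format are made up for the example, and it assumes the samples are listed in the real global order in which the CPUs read their counters.

```python
def max_tsc_warp(samples):
    """Return the largest backwards step seen across consecutive TSC reads.

    `samples` is a hypothetical list of (cpu_id, tsc_value) pairs, given in
    the true global order in which the reads happened. A result of 0 means
    no read ever observed a value smaller than the one read just before it.
    """
    last = None   # value returned by the previous read, on whichever CPU
    worst = 0     # largest amount by which time appeared to go backwards
    for _cpu, tsc in samples:
        if last is not None and tsc < last:
            worst = max(worst, last - tsc)
        last = tsc
    return worst
```

With samples [(0, 100), (1, 105), (0, 110), (1, 108), (0, 120)] the read on CPU 1 returns 108 right after CPU 0 returned 110, so the sketch reports a warp of 2. A nonzero result is the kind of condition under which a hypervisor would decline to trust the TSC as a clock source.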
--
MatC/(~cherry)