tech-kern archive
Re: PHP performance on Xen domU with multiple vcpu
> Date: Thu, 3 Apr 2025 15:41:13 +0000
> From: Emmanuel Dreyfus <manu%netbsd.org@localhost>
>
> Oh, yes, good pick! I use kern.timecounter.hardware=clockinterrupt
> because with xen_system_time the domU's ntpd is unable to keep in sync.

I have been hearing about weird issues with xen_system_time but I
don't have the evidence to diagnose them. Flying blind, can you try
the attached patch with kern.timecounter.hardware=xen_system_time, and
see if it helps ntpd?

And, can you send output of the following before and after you've
observed trouble with ntpd (with or without the patch, or both -- just
tell me which)?

	vmstat -e | grep -E 'cpu.*(time|tsc|clock)'

And, can you try running the following on a system whose clock is
misbehaving, and share a reasonable sampling of output if you see any?

	dtrace -n 'sdt:xen:clock:, sdt:xen:timecounter:, sdt:xen:hardclock:systime-backward, sdt:xen:hardclock:jump, sdt:xen:hardclock:missed { printf("%d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6) }'

# HG changeset patch
# User Taylor R Campbell <riastradh%NetBSD.org@localhost>
# Date 1722556230 0
# Thu Aug 01 23:50:30 2024 +0000
# Branch trunk
# Node ID 4f381da5d8e4b1ec930db273b766af3582d3d3e5
# Parent e882dda136e68b8df443e29104d641a88888e0cb
# EXP-Topic riastradh-2024-xentime
xen_clock: Simplify global timecounter monotonicity.

Instead of trying to estimate skew under the premise that it is
relatively smooth over time, just make sure to return the highest
number observed globally so far each time, like Linux and FreeBSD.

XXX PR port-xen/NNNNN

diff -r e882dda136e6 -r 4f381da5d8e4 sys/arch/xen/xen/xen_clock.c
--- a/sys/arch/xen/xen/xen_clock.c Wed Aug 07 09:48:18 2024 +0000
+++ b/sys/arch/xen/xen/xen_clock.c Thu Aug 01 23:50:30 2024 +0000
@@ -522,39 +522,34 @@ static uint64_t
 xen_global_systime_ns(void)
 {
 	struct cpu_info *ci;
-	uint64_t local, global, skew, result;
+	uint64_t local, global, result;
 
 	/*
 	 * Find the local timecount on this CPU, and make sure it does
 	 * not precede the latest global timecount witnessed so far by
-	 * any CPU. If it does, add to the local CPU's skew from the
-	 * fastest CPU.
+	 * any CPU.
 	 *
 	 * XXX Can we avoid retrying if the CAS fails?
 	 */
 	int s = splsched(); /* make sure we won't be interrupted */
 	ci = curcpu();
+	local = xen_vcputime_systime_ns();
 	do {
-		local = xen_vcputime_systime_ns();
-		skew = ci->ci_xen_systime_ns_skew;
-		global = xen_global_systime_ns_stamp;
-		if (__predict_false(local + skew < global + 1)) {
+		global = atomic_load_relaxed(&xen_global_systime_ns_stamp);
+		if (__predict_false(local < global)) {
 			SDT_PROBE3(sdt, xen, timecounter, backward,
-			    local, skew, global);
+			    local, /*skew*/0, global);
 #if XEN_CLOCK_DEBUG
 			device_printf(ci->ci_dev,
-			    "xen timecounter went backwards:"
-			    " local=%"PRIu64" skew=%"PRIu64" global=%"PRIu64","
-			    " adding %"PRIu64" to skew\n",
-			    local, skew, global, global + 1 - (local + skew));
+			    "xen timecounter fell behind:"
+			    " local=%"PRIu64" global=%"PRIu64"\n",
+			    local, global);
 #endif
 			ci->ci_xen_timecounter_backwards_evcnt.ev_count++;
-			result = global + 1;
-			ci->ci_xen_systime_ns_skew += global + 1 -
-			    (local + skew);
-		} else {
-			result = local + skew;
+			result = global;
+			break;
 		}
+		result = local;
 	} while (atomic_cas_64(&xen_global_systime_ns_stamp, global, result)
 	    != global);
 	KASSERT(ci == curcpu());
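
For anyone who wants to experiment with the idea outside the kernel, the
"return the highest number observed globally so far" approach from the
commit message can be sketched in userland C11. This is only an
illustration, not the code in the patch: the names global_stamp and
monotonic_clamp are hypothetical, and C11 <stdatomic.h> stands in for
the kernel's atomic_load_relaxed/atomic_cas_64.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's xen_global_systime_ns_stamp. */
static _Atomic uint64_t global_stamp;

/*
 * Given this CPU's own clock reading `local`, return a timestamp that
 * never precedes any value previously returned to any caller.
 */
static uint64_t
monotonic_clamp(uint64_t local)
{
	uint64_t global = atomic_load_explicit(&global_stamp,
	    memory_order_relaxed);

	do {
		/* We are not ahead of the maximum; reuse it unchanged. */
		if (local <= global)
			return global;
		/*
		 * Otherwise try to publish our reading as the new
		 * maximum.  On failure, `global` is reloaded with the
		 * current value and we compare again.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&global_stamp,
	    &global, local, memory_order_relaxed, memory_order_relaxed));

	return local;
}
```

As in the patch, the local clock is read once before the loop, and a
caller that has fallen behind returns the global maximum without
writing to the shared stamp.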