Current-Users archive


Re: What to do about "WARNING: negative runtime; monotonic clock has gone backwards"



> Date: Sat, 08 Jul 2023 14:34:56 -0400
> From: Brad Spencer <brad%anduin.eldar.org@localhost>
> 
> Taylor R Campbell <riastradh%NetBSD.org@localhost> writes:
> 
> > Can you either:
> 
> Yes, I can perform as much of this as needed after I get some other
> stuff in life dealt with more towards the end of the month.  I really
> won't have any time before then.

No worries!

> > 1. share the output of `vmstat -e | grep -e tsc -e systime -e
> >    hardclock' after you get the console warning;
> 
> The DOMU currently only has 1 vcpu, but here is the output now:
> 
> vcpu0 raw systime went backwards                          46579    0 intr
> 
> When I have real time later I will force the negative runtime to happen
> and run the above again.

This is evidence that the hypervisor is doing something wrong with the
clock it exposes to the guest.  However, on a single-vCPU system, we
work around this by noting the last Xen systime recorded on the
current vCPU, and pretending the clock just hadn't changed since then.

On a multi-vCPU system, we also try to work around it by recording a
clock skew in xen_global_systime_ns and applying it to ensure the
timestamp is monotonic, but perhaps that's not working right -- or
perhaps it is working for 64-bit timestamps, but the jumps are so
large that they wrap around the 32-bit timecounter arithmetic.
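A rough sketch of both ideas, again in plain C and not the actual kernel
code.  The names global_systime() and tc_delta32() are made up for
illustration, the real code would need atomic operations where this
sketch uses plain variables, and the second function assumes a
timecounter that counts nanoseconds truncated to 32 bits.

```c
#include <stdint.h>

/* Hypothetical global state; a real kernel would update these atomically. */
static uint64_t global_last_ns;	/* latest timestamp handed out anywhere */
static uint64_t global_skew_ns;	/* correction added to per-vCPU readings */

/*
 * Turn a per-vCPU reading into a globally monotonic timestamp.  If the
 * skewed reading is still behind the latest global value, grow the skew
 * by the shortfall so later readings stay ahead of it.
 */
static uint64_t
global_systime(uint64_t local_ns)
{
	uint64_t t = local_ns + global_skew_ns;

	if (t < global_last_ns) {
		global_skew_ns += global_last_ns - t;
		t = global_last_ns;
	}
	global_last_ns = t;
	return t;
}

/*
 * Delta between two readings as seen through a 32-bit counter.  Only
 * the low 32 bits survive, so any jump of 2^32 ns (about 4.29 s) or
 * more is reported modulo 2^32.
 */
static uint32_t
tc_delta32(uint64_t prev_ns, uint64_t now_ns)
{
	return (uint32_t)now_ns - (uint32_t)prev_ns;
}
```

For example, a true jump of 5,000,000,000 ns comes out of tc_delta32()
as 705,032,704 ns, because 5,000,000,000 - 2^32 = 705,032,704.  So the
64-bit timestamps can be perfectly monotonic while the 32-bit view of
them still loses time.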

> > 2. run
> >
> >    dtrace -n 'sdt:xen:clock: { printf("%d %d %d %d %d %d %d %d",
> >    arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'

Note: this should now be

dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'

> >    on the system, and leave it running with output directed to a file,
> >    and share the output when you see the console warning; or
> 
> The DOMU is a 9.3_STABLE from around November 8th and when I attempted
> to run the above dtrace it didn't work.  I got this in the messages:
> 
> [ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_calltrap_addr' not found
> [ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_jump_addr' not found
> [ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_trap_func' not found
> [ 1792486.921759] WARNING: module error: unable to affix module `dtrace', error 8

Looks like nobody has wired up dtrace to Xen!  That's a pretty serious
regression of Xen vs native x86.  Someone needs to hook these up.

In the meantime, I've added a few more diagnostics to HEAD -- if you
can boot a current kernel, that might help, or I could try to make the
corresponding changes on netbsd-9.

https://mail-index.netbsd.org/source-changes/2023/07/13/msg145973.html
https://mail-index.netbsd.org/source-changes/2023/07/13/msg145974.html

