Re: What to do about "WARNING: negative runtime; monotonic clock has gone backwards"

To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Subject: Re: What to do about "WARNING: negative runtime; monotonic clock has gone backwards"
From: Brad Spencer <brad%anduin.eldar.org@localhost>
Date: Sat, 08 Jul 2023 14:34:56 -0400

Taylor R Campbell <riastradh%NetBSD.org@localhost> writes:

>> Date: Wed, 04 Jan 2023 14:43:25 -0500
>> From: Brad Spencer <brad%anduin.eldar.org@localhost>
>> 
>> So...  I have a PV+PVSHIM DOMU running a pretty recent 9.x on a DOM0
>> running a 9.99.xx kernel.  The DOM0 is not large, a 4 processor E-2224
>> with 32GB of memory.  The DOMU has 2 VCPUs and 8GB of memory.  About
>> every day a very particular DOMU tosses the:
>> 
>> WARNING: negative runtime; monotonic clock has gone backwards
>
> Does this still happen?

As far as I know it still happens.  I ended up running the DOMU with
just a single vcpu much of the time; when the system needs more, I
reboot the DOMU with two vcpus.  The negative runtime does not happen
when there is only one vcpu on the DOMU.

When I forget to reboot back to one CPU in the DOMU, the negative
runtime message has always happened within a couple of days.

> Can you either:

Yes, I can do as much of this as needed once I have dealt with some
other things in life, probably towards the end of the month.  I really
won't have any time before then.

> 1. share the output of `vmstat -e | grep -e tsc -e systime -e
>    hardclock' after you get the console warning;

The DOMU currently only has 1 vcpu, but here is the output now:

vcpu0 raw systime went backwards                          46579    0 intr

When I have real time later I will force the negative runtime to happen
and run the above again.

> 2. run
>
>    dtrace -n 'sdt:xen:clock: { printf("%d %d %d %d %d %d %d",
>    arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
>
>    on the system, and leave it running with output directed to a file,
>    and share the output when you see the console warning; or

The DOMU is a 9.3_STABLE from around November 8th, and when I attempted
to run the above dtrace it didn't work.  I got this in the messages:

[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_calltrap_addr' not found
[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_jump_addr' not found
[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_trap_func' not found
[ 1792486.921759] WARNING: module error: unable to affix module `dtrace', error 8

When I have time to get to this I can build a newer 9.x world; I just
need to know whether that is necessary.

> 3. put `#define XEN_CLOCK_DEBUG 1' in sys/arch/xen/xen/xen_clock.c and
>    build a new kernel, and share the dmesg output when you get the
>    console warning?
>
> This should tell us whether it's the Xen host's fault or something
> wrong in NetBSD.

Some more data points from the vmstat command mentioned above (because
it is simple and quick to run):

1) Another system with the same generation 9.x BUT on a different DOM0
produces:

vcpu0 raw systime went backwards                          32909    0 intr

This second system has also been known to have negative runtime and is
also currently running with one vcpu.

2) On the same DOM0 as the one mentioned in #1 there is a 10.x_BETA from
January 22nd.  This guest is a PVH DOMU and the above vmstat produces no
output.  This full PVH DOMU has two vcpus and I have never known it to
produce negative runtime.

3) A third 9.x DOMU that is just a normal PV (no PVSHIM involved)
produces the following with that vmstat command:

vcpu0 raw systime went backwards                         141532    0 intr
vcpu0 missed hardclock                                        7    0 intr

I have never known this system to have negative runtime.

4) Another 10.x_BETA DOMU PV guest with 1 vcpu from April 22nd on the
same DOM0 as #3 does not produce any output from the vmstat command.
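
For comparing these counters across guests, something like the
following might help.  It is only a minimal Python sketch and assumes
the four-column layout seen in the lines quoted above (event name,
total, rate, type); the names EventCounter and parse_event_line are
made up for illustration and are not part of any NetBSD tool.

```python
from typing import NamedTuple, Optional


class EventCounter(NamedTuple):
    name: str
    total: int
    rate: int
    kind: str


def parse_event_line(line: str) -> Optional[EventCounter]:
    """Split one `vmstat -e` row into name, total, rate, and type.

    Assumes the last three whitespace-separated fields are total,
    rate, and type, and everything before them is the event name.
    """
    parts = line.rsplit(None, 3)
    if len(parts) != 4:
        return None
    name, total, rate, kind = parts
    if not (total.isdigit() and rate.isdigit()):
        return None  # header row or an unexpected layout
    return EventCounter(name.strip(), int(total), int(rate), kind)


# Sample rows copied from the plain PV guest described in 3) above.
SAMPLE = """\
vcpu0 raw systime went backwards                         141532    0 intr
vcpu0 missed hardclock                                        7    0 intr
"""

counters = [c for c in map(parse_event_line, SAMPLE.splitlines()) if c]
for c in counters:
    print(f"{c.name}: {c.total}")
```

Feeding it the saved output from each DOMU would make it easier to see
which counters move between one run and the next.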

Thanks for asking about this...  it is more than a little annoying.  I
apologize that I won't be able to do much more with this until later.

-- 
Brad Spencer - brad%anduin.eldar.org@localhost - KC8VKS - http://anduin.eldar.org

