Port-xen archive


Re: annual howto rampage



At Thu, 11 Jul 2024 15:46:01 -0400, Greg Troxel <gdt%lexort.com@localhost> wrote:
Subject: annual howto rampage
>
>   - Are there any problems with 4.18 (other than timekeeping)?

I've seen the odd reboot (two, exactly, one on each machine) from the
Xen kernel watchdog firing for no apparent reason.

Well, I guess the reason is that the NetBSD dom0 kernel locked up hard
enough that it didn't tickle the Xen watchdog any more, but there was no
evidence on the console:

[Fri Jul 12 06:57:25 2024][ 660210.6923526] xen_rtc_set: Setting to 1720792646.009459000 s at systime 660180665194637 ns (nanouptime: 660210692352600 ns, diff(st-nt): -30.27157963 s)
[Fri Jul 12 07:06:16 2024][ 660739.8322481] xen_rtc_set: Setting to 1720793175.149355000 s at systime 660710772767673 ns (nanouptime: 660739832824235 ns, diff(st-nt): -29.60056562 s)
[Fri Jul 12 07:14:41 2024](XEN) [2024-07-12 14:14:39.928] Watchdog timer fired for domain 0
[Fri Jul 12 07:14:41 2024](XEN) [2024-07-12 14:14:39.928] Hardware Dom0 shutdown: watchdog rebooting machine

The first column of timestamps is from conserver.  I think there would
have been another of those xen_rtc_set messages at about 07:15:05
(another ~530 seconds later), so no clue there.  (And /var/log/kern was
entirely lost in the crash, it seems.)
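
As a rough check of the arithmetic behind that estimate, here is a small
sketch.  The timestamps and nanosecond counters are copied from the
console lines quoted above; the helper name fmt_ns is made up for
illustration and is not anything from the kernel.  It also recomputes
the systime/nanouptime difference directly from the two counters.

```python
from datetime import datetime

# conserver timestamps (first column) of the quoted console lines
first_rtc_set = datetime(2024, 7, 12, 6, 57, 25)
second_rtc_set = datetime(2024, 7, 12, 7, 6, 16)
watchdog_fired = datetime(2024, 7, 12, 7, 14, 41)

# Interval between the two xen_rtc_set messages, and the time at which
# a third one would have been expected if the kernel had kept running.
interval = second_rtc_set - first_rtc_set
expected_next = second_rtc_set + interval
gap = expected_next - watchdog_fired


def fmt_ns(diff_ns: int, pad: bool = True) -> str:
    """Format a nanosecond difference as seconds (hypothetical helper).

    With pad=True the fraction is zero-padded to nine digits; with
    pad=False it is printed as a bare integer, which drops leading zeros.
    """
    sign = "-" if diff_ns < 0 else ""
    sec, frac = divmod(abs(diff_ns), 10**9)
    return f"{sign}{sec}.{frac:09d}" if pad else f"{sign}{sec}.{frac}"


# (systime ns, nanouptime ns) pairs from the two xen_rtc_set lines
samples = [
    (660180665194637, 660210692352600),
    (660710772767673, 660739832824235),
]
padded = [fmt_ns(st - nt) for st, nt in samples]
unpadded = [fmt_ns(st - nt, pad=False) for st, nt in samples]

if __name__ == "__main__":
    print("interval between messages:", int(interval.total_seconds()), "s")
    print("next message expected at: ", expected_next.time())
    print("watchdog fired earlier by:", int(gap.total_seconds()), "s")
    print("st-nt, zero-padded:       ", padded)
    print("st-nt, unpadded:          ", unpadded)
```

Run as written, this gives a 531-second interval and a projected next
message at 07:15:07, 26 seconds after the watchdog line.  The zero-padded
differences come out as -30.027157963 and -29.060056562, while the
unpadded ones match the diff(st-nt) values in the log, so those printed
values look like they lost a leading zero in the fraction.  That last
part is only an inference from the subtraction, not something checked
against the kernel source.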

I may have seen this before with 4.11, but I don't seem to have any
further record of such reboots in my console logs for the local
machines, which only go back to 2023 (and the remote machine has no
console log unfortunately).  (The remote machine has a current uptime of
almost 193 days though, with previous uptimes of closer to one year, so
it doesn't reboot often anyway.)

I currently don't run any instances of NetBSD x86_64 (or i386) on bare
metal so I don't know if there could be similar silent lockups without
Xen or not (though I would bet not, thus my mentioning all this here).


>   - Have we crossed over to 4.18 being the standard approach, and 4.15
>     old?  (I think so but don't want to say that without asking.)

I never even tried 4.15 -- I jumped straight from 4.11 to 4.18.  But
that's mostly because I have a suite of local patches to fix minor
annoyances in Xen and its packages that are time-consuming to migrate,
so I just didn't want to bother with any intermediate releases.


>   - Is anyone working on newer xen?

If I remember correctly, I have booted a Xen kernel from the
then-current git HEAD once or twice, and it all seemed to work
flawlessly.


>   - URL for PR about timekeeping?

I should create one, I guess -- there is still no real information to
put in it other than that "it happens" on these physical CPUs....

--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>



