Port-xen archive
Re: [save/restore] Page types errors
On Thu, Jun 12, 2008 at 03:16:38PM +0200, Jean-Yves Migeon wrote:
> Hi list,
>
> While making progress for the suspend/resume for a domU (I am able to
> erratically suspend and resume back to console, when disabling some
> xentools checks), I am experiencing difficulties during save/restore
> operation for page tables. I've been banging my head for two days on it
> now, and not able to identify where these errors may come from.
>
> Just to illustrate my problem better (see my explanations below), here's
> a typical xm dmesg output during a save (there are many lines, just
> copy/pasting the relevant ones for this mail):
>
> ...
> (XEN) mm.c:649:d0 Error getting mfn 1eeb0 (pfn 6f5) from L1 entry
> 1eeb0003 for dom1
> (XEN) mm.c:1833:d0 Bad type (saw 28000001 != exp e0000000) for mfn 1ee5b
> (pfn 74a)
> (XEN) mm.c:649:d0 Error getting mfn 1ee5b (pfn 74a) from L1 entry
> 1ee5b003 for dom1
> (XEN) mm.c:1833:d0 Bad type (saw 58000001 != exp e0000000) for mfn 1ee3a
> (pfn 76b)
> (XEN) mm.c:649:d0 Error getting mfn 1ee3a (pfn 76b) from L1 entry
> 1ee3a003 for dom1
> (XEN) mm.c:1833:d0 Bad type (saw 28000001 != exp e0000000) for mfn 1ed18
> (pfn 88d)
> (XEN) mm.c:649:d0 Error getting mfn 1ed18 (pfn 88d) from L1 entry
> 1ed18003 for dom1
> (XEN) mm.c:1833:d0 Bad type (saw 28000001 != exp e0000000) for mfn 1ed16
> (pfn 88f)
> ...
Looks familiar. I hit similar issues when working on i386 PAE support.
Here it finds a page marked L1 where it expects a writable page.
I don't know why Xen is looking for a writable page here. You should probably
try to track it down inside the Xen kernel to see where this request
comes from.
>
> When restoring, xend stops the operation when it encounters an error
> related to PD/PT:
>
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:370) Reloading memory
> pages: 0%
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:370) Received all pages
> (0 races)
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:3100%
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:370) Memory reloaded
> (3797 pages)
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:370) ERROR Internal
> error: PT base is bad. pfn=1899 nr=3840 type=00000000 20000000
> [2008-06-12 14:50:09 211] INFO (XendCheckpoint:370) Restore exit with rc=1
> [2008-06-12 14:50:09 211] DEBUG (XendDomainInfo:1779)
> XendDomainInfo.destroy: domid=3
> [2008-06-12 14:50:09 211] DEBUG (XendDomainInfo:1798)
> XendDomainInfo.destroyDomain(3)
> [2008-06-12 14:50:09 211] ERROR (XendDomainInfo:1809)
> XendDomainInfo.destroy: xc.domain_destroy failed.
>
> Here, we have an error related to a PT, for pfn 1899 (76b in hex), which
> you can identify in the xm dmesg output (which leads me to think that the
> problem is the same for both errors).
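
The cross-reference made here (xend prints the pfn in decimal, Xen prints it in hex) can be done mechanically. A minimal sketch in Python, assuming the "Bad type" message format shown in the xm dmesg excerpt and that the mail quoting prefix has been stripped first; the function name `bad_type_by_pfn` and the `sample` string are illustrative and not part of xentools:

```python
import re

# Matches the hypervisor message quoted above; \s+ tolerates the line
# wrapping that appears before "(pfn ...)" in the pasted output.
BAD_TYPE = re.compile(
    r"Bad type \(saw ([0-9a-f]+) != exp ([0-9a-f]+)\)\s+for mfn ([0-9a-f]+)"
    r"\s+\(pfn ([0-9a-f]+)\)"
)


def bad_type_by_pfn(dmesg_text):
    """Return {pfn: (saw, expected, mfn)} for every "Bad type" message."""
    result = {}
    for saw, exp, mfn, pfn in BAD_TYPE.findall(dmesg_text):
        result[int(pfn, 16)] = (int(saw, 16), int(exp, 16), int(mfn, 16))
    return result


# Two messages copied from the excerpt above, wrapped as in the mail.
sample = """\
(XEN) mm.c:1833:d0 Bad type (saw 28000001 != exp e0000000) for mfn 1ee5b
(pfn 74a)
(XEN) mm.c:1833:d0 Bad type (saw 58000001 != exp e0000000) for mfn 1ee3a
(pfn 76b)
"""

errors = bad_type_by_pfn(sample)
# xend reports pfn=1899 (decimal); the hypervisor log shows the same pfn as 76b.
saw, exp, mfn = errors[1899]
print(f"pfn 1899 = 0x{1899:x}: saw {saw:08x}, expected {exp:08x}, mfn {mfn:x}")
```

Keying the dictionary by the integer pfn avoids having to remember which tool prints which base.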
>
> Now, what I have found so far: it happens after restoring the domain
> context (same as a dump-core operation), when xentools restores the PD/PT
> (VM) mappings of the domain and makes a couple of checks on them (Xen is
> tracking page types).
>
> According to the Xen kernel sources, the types would translate as follows
> (see include/asm-x86/mm.h):
> 28000001: L2 page, pinned
From my sources (Xen 3.1.4) that would be an L1 page, validated.
> 58000001: GDT page, pinned
For me that would be an L2 page, validated, pinned.
> e0000000: writable page, not pinned
yes.
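
A small Python sketch for decoding these words. The bit layout is an assumption taken from include/asm-x86/mm.h as found in Xen 3.1.x (type in bits 31-29, pinned at bit 28, validated at bit 27); other Xen versions may lay the word out differently, so check it against your own tree. The low-order bits (a use count) are ignored, and the name `decode_type_info` is illustrative only:

```python
# Assumed layout of Xen's page type_info word (Xen 3.1.x, x86):
# bits 31-29 = type, bit 28 = pinned, bit 27 = validated.
PGT_TYPE_SHIFT = 29
PGT_PINNED = 1 << 28
PGT_VALIDATED = 1 << 27

PGT_TYPES = {
    0: "none",
    1: "L1 page table",
    2: "L2 page table",
    3: "L3 page table",
    4: "L4 page table",
    5: "GDT page",
    6: "LDT page",
    7: "writable page",
}


def decode_type_info(value):
    """Describe a type_info word as printed in a "Bad type" message."""
    flags = [PGT_TYPES[(value >> PGT_TYPE_SHIFT) & 0x7]]
    if value & PGT_VALIDATED:
        flags.append("validated")
    if value & PGT_PINNED:
        flags.append("pinned")
    return ", ".join(flags)


# The three values that appear in the xm dmesg output.
for word in (0x28000001, 0x58000001, 0xE0000000):
    print(f"{word:08x}: {decode_type_info(word)}")
```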
>
> The question is: Xen is expecting these pages to be writable ones, and finds
> them as L2 pages or GDT pages, pinned by the guest. Before looking deeper
> into pmap, does anybody know if there is code in port-xen which could result
> in such warnings? Or am I mistaken?
NetBSD uses the hypercall interface for managing the page tables. I suspect
Linux isn't using this anymore, and it wouldn't be the first time
that bugs have slipped into the MMU hypercall interface. I had to fix some
to get i386 PAE working.
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 ans d'experience feront toujours la difference