Port-xen archive
Starting save/restore for port-xen - initial questions
Hi list,
As this is my first mail to this list, and, to some extent, my first
"big" one, let me introduce myself: my name is Jean-Yves Migeon, a
24-year-old French student (currently in Paris), who first encountered
NetBSD during my studies. I started as a self-taught system
administrator for the students' network.
Out of curiosity, I started to read books (and a bit of code) dealing
with the kernel, to gain some understanding of its internals; but
consider me a complete kernel newbie.
The current year of my studies has some time reserved for personal
work, termed a "project". I asked whether I could use this time to start
contributing to NetBSD; this was kindly accepted. I ended up working on
the suspend/resume functionality in port-xen, under the supervision of
Manuel Bouyer (bouyer@) and Stoned Elipot (seb@), both of whom I thank
for accepting this proposal.
Before jumping right into hacking, I have questions regarding port-xen.
Mr Bouyer gave me some pointers to understand the internals involved in
Xen (basically, the way it works, and its API). However, as this is my
first time doing kernel coding, there are many gaps to fill before I can
start producing diffs :)
From what I understand so far, the suspend-save/resume functionality
from Xen could be (loosely?) compared to the suspend/resume
functionality found on laptops (hibernate and the like):
- the domU is informed (through xenbus) that it should start preparing
for suspend,
- it iterates through all its devices to put them in a suspended state
(i.e. putting the frontend drivers in suspend mode, thus flushing the
virtual interrupt handlers),
- it manipulates the event channels accordingly (I guess that putting
the virtual drivers into suspend also affects the backend drivers in
dom0; the console comes to mind),
- it saves some extra structures from the domU, like grant tables, trap
handlers, ..., to restore them properly later, and calls
HYPERVISOR_suspend().
Rolling these steps backwards would describe the restore process, where
the kernel starts again from its last state, while re-establishing
communication with the hypervisor.
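To make the ordering I have in mind concrete, here is a minimal
userland sketch in C. Everything in it is hypothetical: struct
fake_frontend, log_step(), fake_hypervisor_suspend() and
domu_save_restore() are names I made up for illustration, not functions
from port-xen or from the Xen API. It only records the order of the
steps: frontends are quiesced in attach order, the hypervisor is
called, and everything is brought back in reverse order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frontend device; not a real port-xen structure. */
struct fake_frontend {
	const char *name;
	int suspended;
};

#define LOG_MAX 16
static const char *step_log[LOG_MAX];	/* order in which steps ran */
static size_t step_count;

static void
log_step(const char *what)
{
	assert(step_count < LOG_MAX);
	step_log[step_count++] = what;
}

/*
 * Stand-in for the real suspend hypercall. In a real domU, control
 * would come back here only once the domain has been restored.
 */
static void
fake_hypervisor_suspend(void)
{
	log_step("hypervisor suspend");
}

/* Assumed sequence: suspend steps forward, restore steps backwards. */
static void
domu_save_restore(struct fake_frontend *devs, size_t ndevs)
{
	size_t i;

	/* Put every frontend in suspend mode, in attach order. */
	for (i = 0; i < ndevs; i++) {
		devs[i].suspended = 1;
		log_step(devs[i].name);
	}
	log_step("close event channels");
	log_step("save grant tables");

	fake_hypervisor_suspend();

	/* Restore path: the same steps, rolled backwards. */
	log_step("restore grant tables");
	log_step("rebind event channels");
	for (i = ndevs; i-- > 0; ) {
		devs[i].suspended = 0;
		log_step(devs[i].name);
	}
}
```

Calling domu_save_restore() on an array holding, say, "xbd0" and
"xennet0" leaves both devices running again, with the resume order
being the mirror image of the suspend order. Whether the real code
needs this strict symmetry is exactly what I am unsure about.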
Hence, I have some questions. Having extra pointers to areas in
/usr/src/sys/ (I am mostly relying on ctags right now...) would be of
great help. Note that I am not making any distinction between "suspend"
and "save", or between "resume" and "restore". Please correct me if I am
wrong.
- Firstly, what about the structures shared between the domU and the
hypervisor, which are "context" specific? The machine-to-physical
mappings (and their reverse counterpart, physical-to-machine) come to
mind, as there is no guarantee that during a restore, physical addresses
will be exactly the same as before suspend. Which parts of the kernel
should it affect (besides VM management code) for domU, and most
importantly, where: in arch/xen? arch/i386? sys/uvm?
- The same question goes for externally dependent mechanisms, like TCP
connections, which will inevitably time out if we suspend the domain for
a long time, and clock syncing (since domains keep track of time
independently from each other, if I understood the Xen documentation
correctly: the TSC being bound to one VCPU, and thus to one particular
domain).
- Many files in port-xen already contain code dealing with save and
restore operations: xenbus, the block device frontend (xbd), grant
tables (xengnt), ... Can I use them as a reference to understand the key
differences between a full domain start and a restore? *_attach()
usually calls *_resume() once it has finished its operations (see
arch/xen/xen/xbd_xenbus.c:243 for example); I guess that this code was
mainly tested in a "traditional" boot-up phase, and not with a restore
operation. If not, feel free to correct me. If yes, did the code using
*_resume() land somewhere?
- arch/xen/xen/ctrl_if.c seems to contain some code for controller
interface suspend and resume (ctrl_if_suspend() and ctrl_if_resume()).
ctrl_if_suspend() is under "#ifdef notyet"; how should I interpret this
part of the code (see previous question)?
- arch/i386 has some code regarding initial startup for a Xen domain
(arch/i386/i386/machdep.c or vector.S for example). Is there some work
already done regarding suspend (like dumping memory and/or manipulating
the structures shared between hypervisor and domain), or would it have
to start from scratch, besides what we can find in arch/xen?
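To illustrate the *_attach()/*_resume() question above, here is how I
currently picture the split, as a small C sketch. The names
(struct fake_xbd_softc, fake_xbd_attach(), fake_xbd_suspend(),
fake_xbd_resume()) and the fields are hypothetical and are not taken
from xbd_xenbus.c. My assumption is that attach does the one-time
setup, while resume (re)creates everything that depends on the backend
and is lost across a save/restore, so that the same resume path serves
both a normal boot and a restore.

```c
#include <assert.h>

/* Hypothetical driver state; not the real xbd softc. */
struct fake_xbd_softc {
	int attached;		/* one-time setup done (attach only) */
	int ring_generation;	/* bumped each time the ring is rebuilt */
	int connected;		/* link to the backend is up */
};

/*
 * Assumed to hold everything tied to the backend: shared ring, grant
 * reference, event channel. Called at boot and again after a restore.
 */
static int
fake_xbd_resume(struct fake_xbd_softc *sc)
{
	assert(sc->attached);
	sc->ring_generation++;	/* a fresh ring replaces the old one */
	sc->connected = 1;
	return 0;
}

/* Drop the link to the backend before the domain is saved. */
static int
fake_xbd_suspend(struct fake_xbd_softc *sc)
{
	sc->connected = 0;
	return 0;
}

/* One-time initialisation, then the common (re)connect path. */
static int
fake_xbd_attach(struct fake_xbd_softc *sc)
{
	sc->attached = 1;
	sc->ring_generation = 0;
	sc->connected = 0;
	return fake_xbd_resume(sc);
}
```

With this split, a boot is attach (which ends in resume), and a
save/restore cycle is suspend followed by resume alone, without going
through attach again. If the existing code is not meant to work this
way, that is precisely the kind of correction I am looking for.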
Those questions are kind of general, to understand the concept behind
port-xen, and what needs to be done (and to know if I am heading in the
right direction, or not).
More will of course follow as I delve into the code with the help of
your answers. Apologies for such a lengthy mail. The next ones will be
more concise, I promise :)
Thanking you in advance for your time and help,
Kind regards
jy
--
Jean-Yves Migeon
jean-yves.migeon%espci.fr@localhost