Subject: Re: persistent/restorable unix procs?
To: Travis Hassloch <travis@evtech.com>
From: Tom Pavel <PAVEL@SLAC.Stanford.EDU>
List: current-users
Date: 11/29/1995 14:20:19
>>>>> On Tue, 28 Nov 1995, Travis Hassloch <travis@evtech.com> writes:
> Has anyone done any work on (or looking into) how one might dump a process's
> state to disk & restore it, assuming it's a cooperating process?
> E.G. maybe you have a process that wants to save itself on purpose.
> To make it harder, how about an uncooperative process?
Saving a process's state is closely akin to migrating processes between
nodes of a distributed network, a topic on which there is a good deal of
literature. Basically, the core file gives you most of what you need (the
data and stack segments plus the registers), but you also have to
save/transfer the "hidden" state that the kernel keeps on the process's
behalf: file descriptors, sockets, signals, and so forth. This is much
easier in systems built from the ground up with process migration in mind
(Sprite, V, Plan 9), but it can be done in Unix with appropriate mods to the
kernel and/or libc. Things are also much easier if the process is
"cooperative," in that it avoids tricky Unix features that are hard to
checkpoint or duplicate (forking subprocesses, opening devices, IPC, etc.).
How much work you need to do depends on the scope of your problem...
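
To make the "cooperative" case a bit more concrete, here is a minimal sketch
in Python of a process that saves itself on purpose. It is not taken from
Condor or any of the papers below; the names checkpoint, restore, and demo
and the JSON record layout are made up for illustration. The point is only
that the process has to record the "hidden" descriptor state (which file was
open, and at what offset) alongside its own data, since nothing in its
memory image would let a restarted copy recover that.

```python
import json
import os
import tempfile


def checkpoint(state, fd, path, ckpt_file):
    """Save application state plus the descriptor state we can observe.

    Assumption: the process remembers `path` itself; the kernel gives us
    the current offset via lseek but not, portably, the file name.
    """
    record = {
        "state": state,
        "path": path,
        "offset": os.lseek(fd, 0, os.SEEK_CUR),  # query offset without moving it
    }
    tmp = ckpt_file + ".tmp"
    with open(tmp, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, ckpt_file)  # rename so a crash never leaves a half-written checkpoint


def restore(ckpt_file):
    """Reopen the recorded file, seek to the saved offset, return (state, fd)."""
    with open(ckpt_file) as f:
        record = json.load(f)
    fd = os.open(record["path"], os.O_RDONLY)
    os.lseek(fd, record["offset"], os.SEEK_SET)
    return record["state"], fd


def demo():
    """Read part of a file, checkpoint, 'die', then resume where we left off."""
    workdir = tempfile.mkdtemp()
    data_path = os.path.join(workdir, "input.txt")
    ckpt_path = os.path.join(workdir, "proc.ckpt")
    with open(data_path, "w") as f:
        f.write("abcdefghij")

    fd = os.open(data_path, os.O_RDONLY)
    first = os.read(fd, 4)
    checkpoint({"bytes_seen": len(first)}, fd, data_path, ckpt_path)
    os.close(fd)  # stands in for the original process going away

    state, fd2 = restore(ckpt_path)
    rest = os.read(fd2, 100)  # continues from the saved offset
    os.close(fd2)
    return state, first, rest
```

This only covers a single read-only regular file. Sockets, pipes, devices,
pending signals, and child processes have no such simple recipe, which is
why the uncooperative case generally needs help from the kernel or libc.
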
Here is a collection of papers from groups who have done such a thing for
Unix systems. These papers are a bit old, but they might still prove
interesting reading. As for getting your hands on code, I would think
Condor is probably your best bet (ftp.cs.wisc.edu:/condor).
K.I. Mandelberg & V.S. Sunderam, "Process Migration in Unix Networks,"
Winter Usenix 1988, p. 357.
Chad Hunter, "Process Cloning: A System for Duplicating Unix Processes,"
Winter Usenix 1988, p. 373.
David Nichols, "Using Idle Workstations in a Shared Computing Environment,"
Proceedings of the 11th ACM Symposium on Operating Systems Principles
(SOSP), 1987, p. 5.
Rafael Alonso & Kriton Kyrimis, "A Process Migration Implementation for a
Unix System," Winter Usenix 1988, p. 365.
M. J. Litzkow, M. Livny, and M. W. Mutka, "Condor - A Hunter of Idle
Workstations," Proc. 8th Int'l. Conf. on Distr. Computing Sys., June 1988.
Good luck,
Tom Pavel
Stanford Linear Accelerator Center
pavel@slac.stanford.edu http://www.slac.stanford.edu/~pavel/