Subject: Re: Real vfork() (was: third results)
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Stefan Grefen <grefen@hprc.tandem.com>
List: tech-kern
Date: 04/15/1998 13:45:52
In message <199804141815.LAA14816@lestat.nas.nasa.gov> Jason Thorpe wrote:
> On Tue, 14 Apr 1998 19:52:24 +0200
> Stefan Grefen <grefen@hprc.tandem.com> wrote:
>
> > I think all effort should be directed to making fork()'s COW cheaper so that
> > the remaining benefit of vfork is that it blocks the parent until the
>
> A good amount of effort was directed at making COW better in UVM. And
> an address space-sharing vfork() _still_ turned out to be a win. It shaves
> several seconds off a build of libc on my 200MHz PPro.
That's how many percent?
>
> I really don't understand why we're arguing about this. It seems obvious to
> me that, in the cases where it was originally meant to be used, it is a
> performance win, and really nothing else is going to be faster.
I agree, but it is still a kludge, and I think the bad practice it encourages
(not all people abstain from exploring the unwanted side effects) creates
portability hazards. You have to know whether the VM space is shared or not,
unless you restrict yourself to changing local variables only.
My personal opinion is that even a 10% gain doesn't justify this kludge.
But I think we should direct the effort away from creating a spawn()
system call and toward some major changes on the VM side to make vfork() and
COW cheaper. If we were able to manipulate levels higher than a page, we
would greatly reduce the number of entries to change and the number of
page faults.
I know it's a major undertaking, and I don't want us to do the stuff SYSV R4
does (which you can still cheat on at the page level if you want to), because
the overhead for normal operations is significant.
I don't have an answer in my pocket for how to do it either ...
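To make the portability hazard concrete, here is a minimal sketch in C of the one pattern that should behave the same whether or not vfork() shares the address space: the child does nothing except exec or _exit. The helper name run_with_vfork is hypothetical, and the _DEFAULT_SOURCE define is an assumption about glibc-style headers.

```c
#define _DEFAULT_SOURCE   /* assumption: needed on glibc to declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` and return the child's
 * exit status, or -1 if the child could not be created or reaped.
 */
int run_with_vfork(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /*
         * Child: it may be running in the parent's address space.
         * It writes to no variables and does not return from this
         * function; it only execs or exits.
         */
        execv(path, argv);
        _exit(127);   /* exec failed; _exit() skips the parent's stdio/atexit state */
    }

    /* Parent: resumes only after the child has exec'ed or exited. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything beyond this in the child, such as updating a global or calling exit(), has a different effect depending on whether the address space is shared, which is exactly what the caller would have to know.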
>
> Let's look at what happens when you vfork/exec using the 4.4BSD vfork
> and COW:
>
> - Traverse parent's vm_map, marking the writable portions of the
> address space COW. This means invoking the pmap, modifying PTEs,
> and flushing the TLB.
That's because we can't set a whole object/segment to COW. Otherwise you
would traverse only 4 objects + mmap objects.
>
> - Create a vm_map for the child, copy the parent's vm_map entries
> into the child's vm_map. Optionally, invoke the pmap to copy
> PTEs from the parent's page tables into the child's page tables.
Could be a COW clone.
>
> - Block parent.
>
> - Child runs. If PTEs were _not_ copied, take page fault to get
> a physical mapping for the text page at the current program counter.
>
> - Child execs, and unmaps the entire address space that was just
> created, and creates a new one. This implies that the parent's
> vm_map has to be traversed to mark the COW portions not-COW.
Again, only the top-level objects are affected.
>
> - Unblock parent.
>
> - Parent runs, takes page fault when modifying previously R/W
> data that was marked R/O for COW (no data is copied at this
> time
It takes one fault per object.
[... vfork stays the same ]
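The per-object replies above rest on simple arithmetic, which can be sketched as follows. The function name and the sizes in the usage note are illustrative assumptions, and the result is an upper bound that assumes every page in the region currently has a PTE.

```c
#include <stddef.h>

/*
 * Upper bound on the number of page-table entries that must be
 * write-protected to mark a writable region of `bytes` bytes as COW,
 * given pages of `page_size` bytes (rounded up to whole pages).
 */
size_t ptes_to_protect(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}
```

For an assumed 8 MB of writable address space and 4 KB pages, this gives 2048 entries to modify per fork, against the handful of objects (4 plus any mmap objects) that would need touching if COW could be set at the object level.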
>
> So, in the case where you're going to fork and then exec, which is going
> to be faster? Clearly the one that has to do less work. Even if your
> COW algorithms are good, you still have to do a lot more work compared
> to the vmspace-sharing case!
I think this 'a lot' can be reduced to 'some'. Not in 1.3.X, not in 1.4,
but maybe in 2.0?
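How large the difference between 'a lot' and 'some' is on a given system can be measured rather than argued. The following is a rough sketch in C, not a rigorous benchmark: the function name time_spawns, the use of /bin/sh as the exec target, and the _DEFAULT_SOURCE define are all assumptions made for illustration.

```c
#define _DEFAULT_SOURCE   /* assumption: needed on glibc to declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Create `n` children one after another, each of which immediately
 * execs "/bin/sh -c :", using vfork() if `use_vfork` is nonzero and
 * fork() otherwise. Returns elapsed wall-clock seconds, or -1.0 on error.
 */
double time_spawns(int use_vfork, int n)
{
    char *const argv[] = { "sh", "-c", ":", NULL };
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++) {
        pid_t pid = use_vfork ? vfork() : fork();
        if (pid == -1)
            return -1.0;
        if (pid == 0) {
            /* Child: exec or exit only, so the vfork() case stays safe. */
            execv("/bin/sh", argv);
            _exit(127);
        }
        int status;
        if (waitpid(pid, &status, 0) == -1)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing time_spawns(0, n) with time_spawns(1, n) for the same n gives the percentage directly. The gap should be expected to grow with the size of the parent's writable address space, so a tiny test program will understate what a large process such as make or a shell would see.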
Stefan
>
> Jason R. Thorpe thorpej@nas.nasa.gov
> NASA Ames Research Center Home: +1 408 866 1912
> NAS: M/S 258-5 Work: +1 650 604 0935
> Moffett Field, CA 94035 Pager: +1 415 428 6939
--
Stefan Grefen Tandem Computers Europe Inc.
grefen@hprc.tandem.com High Performance Research Center
--- Hacking's just another word for nothing left to kludge. ---