Source-Changes-HG archive
[src/uebayasi-xip]: src/sys/uvm uvmfault_promote: For promotion from a "lower...
details: https://anonhg.NetBSD.org/src/rev/356799bfc118
branches: uebayasi-xip
changeset: 751576:356799bfc118
user: uebayasi <uebayasi%NetBSD.org@localhost>
date: Fri Feb 12 16:06:50 2010 +0000
description:
uvmfault_promote: For promotion from a "lower" page, pass the owning struct
uvm_object * in from callers, because a device page's struct vm_page * has no
back-pointer to its uvm_object.
diffstat:
sys/uvm/uvm_fault.c | 2284 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 2284 insertions(+), 0 deletions(-)
diffs (truncated from 2288 to 300 lines):
diff -r f4e55e886893 -r 356799bfc118 sys/uvm/uvm_fault.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/sys/uvm/uvm_fault.c Fri Feb 12 16:06:50 2010 +0000
@@ -0,0 +1,2284 @@
+/* $NetBSD: uvm_fault.c,v 1.166.2.2 2010/02/12 16:06:50 uebayasi Exp $ */
+
+/*
+ *
+ * Copyright (c) 1997 Charles D. Cranor and Washington University.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. All advertising materials mentioning features or use of this software
+ * must display the following acknowledgement:
+ * This product includes software developed by Charles D. Cranor and
+ * Washington University.
+ * 4. The name of the author may not be used to endorse or promote products
+ * derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * from: Id: uvm_fault.c,v 1.1.2.23 1998/02/06 05:29:05 chs Exp
+ */
+
+/*
+ * uvm_fault.c: fault handler
+ */
+
+#include <sys/cdefs.h>
+__KERNEL_RCSID(0, "$NetBSD: uvm_fault.c,v 1.166.2.2 2010/02/12 16:06:50 uebayasi Exp $");
+
+#include "opt_uvmhist.h"
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/kernel.h>
+#include <sys/proc.h>
+#include <sys/malloc.h>
+#include <sys/mman.h>
+
+#include <uvm/uvm.h>
+
+/*
+ *
+ * a word on page faults:
+ *
+ * types of page faults we handle:
+ *
+ * CASE 1: upper layer faults CASE 2: lower layer faults
+ *
+ * CASE 1A CASE 1B CASE 2A CASE 2B
+ * read/write1 write>1 read/write +-cow_write/zero
+ * | | | |
+ * +--|--+ +--|--+ +-----+ + | + | +-----+
+ * amap | V | | ---------> new | | | | ^ |
+ * +-----+ +-----+ +-----+ + | + | +--|--+
+ * | | |
+ * +-----+ +-----+ +--|--+ | +--|--+
+ * uobj | d/c | | d/c | | V | +----+ |
+ * +-----+ +-----+ +-----+ +-----+
+ *
+ * d/c = don't care
+ *
+ * case [0]: layerless fault
+ * no amap or uobj is present. this is an error.
+ *
+ * case [1]: upper layer fault [anon active]
+ * 1A: [read] or [write with anon->an_ref == 1]
+ * I/O takes place in upper level anon and uobj is not touched.
+ * 1B: [write with anon->an_ref > 1]
+ * new anon is alloc'd and data is copied off ["COW"]
+ *
+ * case [2]: lower layer fault [uobj]
+ * 2A: [read on non-NULL uobj] or [write to non-copy_on_write area]
+ * I/O takes place directly in object.
+ * 2B: [write to copy_on_write] or [read on NULL uobj]
+ * data is "promoted" from uobj to a new anon.
+ * if uobj is null, then we zero fill.
+ *
+ * we follow the standard UVM locking protocol ordering:
+ *
+ * MAPS => AMAP => UOBJ => ANON => PAGE QUEUES (PQ)
+ * we hold a PG_BUSY page if we unlock for I/O
+ *
+ *
+ * the code is structured as follows:
+ *
+ * - init the "IN" params in the ufi structure
+ * ReFault:
+ * - do lookups [locks maps], check protection, handle needs_copy
+ * - check for case 0 fault (error)
+ * - establish "range" of fault
+ * - if we have an amap lock it and extract the anons
+ * - if sequential advice deactivate pages behind us
+ * - at the same time check pmap for unmapped areas and anon for pages
+ * that we could map in (and do map it if found)
+ * - check object for resident pages that we could map in
+ * - if (case 2) goto Case2
+ * - >>> handle case 1
+ * - ensure source anon is resident in RAM
+ * - if case 1B alloc new anon and copy from source
+ * - map the correct page in
+ * Case2:
+ * - >>> handle case 2
+ * - ensure source page is resident (if uobj)
+ * - if case 2B alloc new anon and copy from source (could be zero
+ * fill if uobj == NULL)
+ * - map the correct page in
+ * - done!
+ *
+ * note on paging:
+ * if we have to do I/O we place a PG_BUSY page in the correct object,
+ * unlock everything, and do the I/O. when I/O is done we must reverify
+ * the state of the world before assuming that our data structures are
+ * valid. [because mappings could change while the map is unlocked]
+ *
+ * alternative 1: unbusy the page in question and restart the page fault
+ * from the top (ReFault). this is easy but does not take advantage
+ * of the information that we already have from our previous lookup,
+ * although it is possible that the "hints" in the vm_map will help here.
+ *
+ * alternative 2: the system already keeps track of a "version" number of
+ * a map. [i.e. every time you write-lock a map (e.g. to change a
+ * mapping) you bump the version number up by one...] so, we can save
+ * the version number of the map before we release the lock and start I/O.
+ * then when I/O is done we can relock and check the version numbers
+ * to see if anything changed. this might save us something over
+ * alternative 1 because we don't have to unbusy the page and may need
+ * fewer compares(?).
+ *
+ * alternative 3: put in backpointers or a way to "hold" part of a map
+ * in place while I/O is in progress. this could be complex to
+ * implement (especially with structures like amap that can be referenced
+ * by multiple map entries, and figuring out what should wait could be
+ * complex as well...).
+ *
+ * we use alternative 2. given that we are multi-threaded now we may want
+ * to reconsider the choice.
+ */
+
+/*
+ * local data structures
+ */
+
+struct uvm_advice {
+ int advice;
+ int nback;
+ int nforw;
+};
+
+/*
+ * page range array:
+ * note: index in array must match "advice" value
+ * XXX: borrowed numbers from freebsd. do they work well for us?
+ */
+
+static const struct uvm_advice uvmadvice[] = {
+ { MADV_NORMAL, 3, 4 },
+ { MADV_RANDOM, 0, 0 },
+ { MADV_SEQUENTIAL, 8, 7 },
+};
+
+#define UVM_MAXRANGE 16 /* must be MAX() of nback+nforw+1 */
+
+/*
+ * private prototypes
+ */
+
+/*
+ * inline functions
+ */
+
+/*
+ * uvmfault_anonflush: try and deactivate pages in specified anons
+ *
+ * => does not have to deactivate page if it is busy
+ */
+
+static inline void
+uvmfault_anonflush(struct vm_anon **anons, int n)
+{
+ int lcv;
+ struct vm_page *pg;
+
+ for (lcv = 0; lcv < n; lcv++) {
+ if (anons[lcv] == NULL)
+ continue;
+ mutex_enter(&anons[lcv]->an_lock);
+ pg = anons[lcv]->an_page;
+ if (pg && (pg->flags & PG_BUSY) == 0) {
+ mutex_enter(&uvm_pageqlock);
+ if (pg->wire_count == 0) {
+ uvm_pagedeactivate(pg);
+ }
+ mutex_exit(&uvm_pageqlock);
+ }
+ mutex_exit(&anons[lcv]->an_lock);
+ }
+}
+
+/*
+ * normal functions
+ */
+
+/*
+ * uvmfault_amapcopy: clear "needs_copy" in a map.
+ *
+ * => called with VM data structures unlocked (usually, see below)
+ * => we get a write lock on the maps and clear needs_copy for a VA
+ * => if we are out of RAM we sleep (waiting for more)
+ */
+
+static void
+uvmfault_amapcopy(struct uvm_faultinfo *ufi)
+{
+ for (;;) {
+
+ /*
+ * no mapping? give up.
+ */
+
+ if (uvmfault_lookup(ufi, true) == false)
+ return;
+
+ /*
+ * copy if needed.
+ */
+
+ if (UVM_ET_ISNEEDSCOPY(ufi->entry))
+ amap_copy(ufi->map, ufi->entry, AMAP_COPY_NOWAIT,
+ ufi->orig_rvaddr, ufi->orig_rvaddr + 1);
+
+ /*
+ * didn't work? must be out of RAM. unlock and sleep.
+ */
+
+ if (UVM_ET_ISNEEDSCOPY(ufi->entry)) {
+ uvmfault_unlockmaps(ufi, true);
+ uvm_wait("fltamapcopy");
+ continue;
+ }
+
+ /*
+ * got it! unlock and return.
+ */
+
+ uvmfault_unlockmaps(ufi, true);
+ return;
+ }
+ /*NOTREACHED*/
+}
+
+/*
+ * uvmfault_anonget: get data in an anon into a non-busy, non-released
+ * page in that anon.
+ *
+ * => maps, amap, and anon locked by caller.
+ * => if we fail (result != 0) we unlock everything.
+ * => if we are successful, we return with everything still locked.
+ * => we don't move the page on the queues [gets moved later]
+ * => if we allocate a new page [we_own], it gets put on the queues.
+ * either way, the result is that the page is on the queues at return time
+ * => for pages which are on loan from a uvm_object (and thus are not
+ * owned by the anon): if successful, we return with the owning object
+ * locked. the caller must unlock this object when it unlocks everything
+ * else.
+ */
+
+int
+uvmfault_anonget(struct uvm_faultinfo *ufi, struct vm_amap *amap,
+ struct vm_anon *anon)
+{
+ bool we_own; /* we own anon's page? */
+ bool locked; /* did we relock? */
+ struct vm_page *pg;
+ int error;
+ UVMHIST_FUNC("uvmfault_anonget"); UVMHIST_CALLED(maphist);
+
+ KASSERT(mutex_owned(&anon->an_lock));
+
+ error = 0;
+ uvmexp.fltanget++;
+ /* bump rusage counters */
+ if (anon->an_page)
+ curlwp->l_ru.ru_minflt++;