Source-Changes-HG archive


[src/trunk]: src/sys Add some more meat to madvise(2):



details:   https://anonhg.NetBSD.org/src/rev/d51418735aa7
branches:  trunk
changeset: 474461:d51418735aa7
user:      thorpej <thorpej%NetBSD.org@localhost>
date:      Wed Jul 07 06:02:21 1999 +0000

description:
Add some more meat to madvise(2):
* Implement MADV_DONTNEED: deactivate pages in the specified range,
  semantics similar to Solaris's MADV_DONTNEED.
* Add MADV_FREE: free pages and swap resources associated with the
  specified range, causing the range to be reloaded from backing
  store (vnodes) or zero-fill (anonymous), semantics like FreeBSD's
  MADV_FREE and like Digital UNIX's MADV_DONTNEED (isn't it SO GREAT
  that madvise(2) isn't standardized!?)
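
For illustration only (not part of the commit), a minimal userland sketch of
how the two pieces of advice might be used; the mapping size is arbitrary and
error handling is abbreviated:

#include <sys/types.h>
#include <sys/mman.h>

int
main(void)
{
        size_t len = 16 * 4096;         /* arbitrary scratch size */
        char *buf;

        /* anonymous, zero-filled scratch memory */
        buf = mmap(NULL, len, PROT_READ|PROT_WRITE,
            MAP_ANON|MAP_PRIVATE, -1, 0);
        if (buf == MAP_FAILED)
                return (1);

        /* ... fill and use buf ... */

        /*
         * MADV_DONTNEED: deactivate the pages; their contents are still
         * there if we touch them again (Solaris-like semantics).
         */
        (void)madvise(buf, len, MADV_DONTNEED);

        /*
         * MADV_FREE: the contents are dead; pages and swap may be
         * reclaimed, and the next touch is satisfied by zero-fill
         * (or by re-reading backing store for vnode-backed ranges).
         */
        (void)madvise(buf, len, MADV_FREE);

        return (0);
}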

As part of this, move the non-map-modifying advice handling out of
uvm_map_advise(), and into sys_madvise().
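
The sys_madvise() side lives in sys/uvm/uvm_mmap.c, which falls in the
truncated part of the diff below.  As a rough sketch (assumed from the
description above, not the verbatim change), the non-map-modifying cases
presumably dispatch into uvm_map_clean() with the appropriate PGO_* flags,
along these lines:

        /*
         * Hypothetical sketch of the dispatch in sys_madvise(); "advice",
         * "addr", "size" and "p" stand for the usual syscall arguments
         * and the current process.
         */
        switch (advice) {
        case MADV_NORMAL:
        case MADV_RANDOM:
        case MADV_SEQUENTIAL:
                /* map-modifying advice: still goes through uvm_map_advise() */
                rv = uvm_map_advise(&p->p_vmspace->vm_map, addr, addr + size,
                    advice);
                break;

        case MADV_DONTNEED:
                /* deactivate the pages; their contents are preserved */
                rv = uvm_map_clean(&p->p_vmspace->vm_map, addr, addr + size,
                    PGO_DEACTIVATE);
                break;

        case MADV_FREE:
                /* toss pages and swap; the range reloads from backing
                   store or zero-fill on next use */
                rv = uvm_map_clean(&p->p_vmspace->vm_map, addr, addr + size,
                    PGO_FREE);
                break;

        default:
                return (EINVAL);
        }
        /* rv (a KERN_* code) would then be translated to an errno. */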

As another part, implement general amap cleaning in uvm_map_clean(), and
change uvm_map_clean() to only push dirty pages to disk if PGO_CLEANIT
is set in its flags (and update sys___msync13() accordingly).  XXX Add
a patchable global "amap_clean_works", defaulting to 1, which can disable
the amap cleaning code, just in case problems are unearthed; this gives
a developer/user a quick way to recover and send a bug report (e.g. boot
into DDB and change the value).
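
For example, assuming the standard ddb(4) write command and a 4-byte int,
the switch could be flipped from the debugger with something like:

        db> write/l amap_clean_works 0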

XXX Still need to implement a real uao_flush().

XXX Need to update the manual page.

With these changes, rebuilding libc will automatically cause the new
malloc(3) to use MADV_FREE to actually release pages and swap resources
when it decides that this can be done.
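
As a rough illustration of the idea (not the actual malloc(3) internals),
an allocator that keeps a large, page-aligned chunk mapped but currently
unused could hint the kernel like this; release_free_chunk() is a
hypothetical helper:

#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical allocator helper: keep the address range mapped, but tell
 * the kernel the contents are dead so the pages and any swap backing them
 * can be reclaimed.  The next touch is satisfied by zero-fill.
 */
static void
release_free_chunk(void *chunk, size_t len)
{
        /* chunk and len are assumed to be page-aligned */
        (void)madvise(chunk, len, MADV_FREE);
}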

diffstat:

 sys/sys/mman.h     |    3 +-
 sys/uvm/uvm_map.c  |  250 ++++++++++++++++++++++++++++++++--------------------
 sys/uvm/uvm_mmap.c |   80 +++++++++++++++-
 3 files changed, 228 insertions(+), 105 deletions(-)

diffs (truncated from 471 to 300 lines):

diff -r 98cc1062113b -r d51418735aa7 sys/sys/mman.h
--- a/sys/sys/mman.h    Wed Jul 07 05:33:33 1999 +0000
+++ b/sys/sys/mman.h    Wed Jul 07 06:02:21 1999 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: mman.h,v 1.23 1999/06/17 21:07:55 thorpej Exp $        */
+/*     $NetBSD: mman.h,v 1.24 1999/07/07 06:02:21 thorpej Exp $        */
 
 /*-
  * Copyright (c) 1982, 1986, 1993
@@ -114,6 +114,7 @@
 #define        MADV_WILLNEED   3       /* will need these pages */
 #define        MADV_DONTNEED   4       /* dont need these pages */
 #define        MADV_SPACEAVAIL 5       /* insure that resources are reserved */
+#define        MADV_FREE       6       /* pages are empty, free them */
 #endif
 
 #ifndef _KERNEL
diff -r 98cc1062113b -r d51418735aa7 sys/uvm/uvm_map.c
--- a/sys/uvm/uvm_map.c Wed Jul 07 05:33:33 1999 +0000
+++ b/sys/uvm/uvm_map.c Wed Jul 07 06:02:21 1999 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: uvm_map.c,v 1.60 1999/07/01 20:07:05 thorpej Exp $     */
+/*     $NetBSD: uvm_map.c,v 1.61 1999/07/07 06:02:22 thorpej Exp $     */
 
 /* 
  * Copyright (c) 1997 Charles D. Cranor and Washington University.
@@ -1871,7 +1871,11 @@
        } else {
                entry = temp_entry->next;
        }
-       
+
+       /*
+        * XXXJRT: disallow holes?
+        */
+
        while ((entry != &map->header) && (entry->start < end)) {
                UVM_MAP_CLIP_END(map, entry, end);
 
@@ -1882,65 +1886,6 @@
                        /* nothing special here */
                        break;
 
-#if 0
-               case MADV_WILLNEED:
-                       /* activate all these pages */
-                       /* XXX */
-                       /*
-                        * should invent a "weak" mode for uvm_fault()
-                        * which would only do the PGO_LOCKED pgo_get().
-                        */
-                       break;
-
-               case MADV_DONTNEED:
-                       /* deactivate this page */
-                       /* XXX */
-                       /*
-                        * vm_page_t p;
-                        * uvm_lock_pageq();
-                        * for (p in each page)
-                        *      if (not_wired)
-                        *              uvm_pagedeactivate(p);
-                        * uvm_unlock_pageq();
-                        */
-                       break;
-
-               case MADV_SPACEAVAIL:
-                       /* 
-                        * XXXMRG
-                        * what is this?  i think:  "ensure that we have
-                        * allocated backing-store for these pages".  this
-                        * is going to require changes in the page daemon,
-                        * as it will free swap space allocated to pages in
-                        * core.  there's also what to do for
-                        * device/file/anonymous memory..
-                        */
-                       break;
-
-               case MADV_GARBAGE:
-                       /* pages are `empty' and can be garbage collected */
-                       /* XXX */
-                       /*
-                        * (perhaps MADV_FREE? check freebsd's MADV_FREE).
-                        * 
-                        * need to do this:
-                        *      - clear all the referenced and modified bits on
-                        *        the pages,
-                        *      - delete any backing store,
-                        *      - mark the page as `recycable'.
-                        *
-                        * So, if you start paging, the pages would be thrown out
-                        * and then zero-filled the next time they're used.
-                        * Otherwise you'd just reuse them directly.  Once the
-                        * page has been modified again, it would no longer be
-                        * recyclable.  That way, malloc() can just tell the
-                        * system when pages are `empty'; if memory is needed,
-                        * they'll be tossed; if memory is not needed, there
-                        * will be no additional overhead.
-                        */
-                       break;
-#endif
-
                default:
                        vm_map_unlock(map);
                        UVMHIST_LOG(maphist,"<- done (INVALID ARG)",0,0,0,0);
@@ -2241,7 +2186,7 @@
                        if (VM_MAPENT_ISWIRED(entry))
                                uvm_map_entry_unwire(map, entry);
                }
-               map->flags &= ~VM_MAP_WIREFUTURE;
+               vm_map_modflags(map, 0, VM_MAP_WIREFUTURE);
                vm_map_unlock(map);
                UVMHIST_LOG(maphist,"<- done (OK UNWIRE)",0,0,0,0);
                return (KERN_SUCCESS);
@@ -2255,7 +2200,7 @@
                /*
                 * must wire all future mappings; remember this.
                 */
-               map->flags |= VM_MAP_WIREFUTURE;
+               vm_map_modflags(map, VM_MAP_WIREFUTURE, 0);
        }
 
        if ((flags & MCL_CURRENT) == 0) {
@@ -2413,38 +2358,49 @@
 }
 
 /*
- * uvm_map_clean: push dirty pages off to backing store.
+ * uvm_map_clean: clean out a map range
  *
  * => valid flags:
+ *   if (flags & PGO_CLEANIT): dirty pages are cleaned first
  *   if (flags & PGO_SYNCIO): dirty pages are written synchronously
  *   if (flags & PGO_DEACTIVATE): any cached pages are deactivated after clean
  *   if (flags & PGO_FREE): any cached pages are freed after clean
  * => returns an error if any part of the specified range isn't mapped
  * => never a need to flush amap layer since the anonymous memory has 
- *     no permanent home...
- * => called from sys_msync()
+ *     no permanent home, but may deactivate pages there
+ * => called from sys_msync() and sys_madvise()
  * => caller must not write-lock map (read OK).
  * => we may sleep while cleaning if SYNCIO [with map read-locked]
  */
 
+int    amap_clean_works = 1;   /* XXX for now, just in case... */
+
 int
 uvm_map_clean(map, start, end, flags)
        vm_map_t map;
        vaddr_t start, end;
        int flags;
 {
-       vm_map_entry_t current;
-       vm_map_entry_t entry;
+       vm_map_entry_t current, entry;
+       struct uvm_object *uobj;
+       struct vm_amap *amap;
+       struct vm_anon *anon;
+       struct vm_page *pg;
+       vaddr_t offset;
        vsize_t size;
-       struct uvm_object *object;
-       vaddr_t offset;
+       int rv, error, refs;
        UVMHIST_FUNC("uvm_map_clean"); UVMHIST_CALLED(maphist);
        UVMHIST_LOG(maphist,"(map=0x%x,start=0x%x,end=0x%x,flags=0x%x)",
        map, start, end, flags);
 
+#ifdef DIAGNOSTIC
+       if ((flags & (PGO_FREE|PGO_DEACTIVATE)) == (PGO_FREE|PGO_DEACTIVATE))
+               panic("uvm_map_clean: FREE and DEACTIVATE");
+#endif
+
        vm_map_lock_read(map);
        VM_MAP_RANGE_CHECK(map, start, end);
-       if (!uvm_map_lookup_entry(map, start, &entry)) {
+       if (uvm_map_lookup_entry(map, start, &entry) == FALSE) {
                vm_map_unlock_read(map);
                return(KERN_INVALID_ADDRESS);
        }
@@ -2464,41 +2420,145 @@
                }
        }
 
-       /* 
-        * add "cleanit" flag to flags (for generic flush routine).  
-        * then make a second pass, cleaning/uncaching pages from 
-        * the indicated objects as we go.  
-        */
-       flags = flags | PGO_CLEANIT;
+       error = KERN_SUCCESS;
+
        for (current = entry; current->start < end; current = current->next) {
-               offset = current->offset + (start - current->start);
-               size = (end <= current->end ? end : current->end) - start;
+               amap = current->aref.ar_amap;   /* top layer */
+               uobj = current->object.uvm_obj; /* bottom layer */
+
+#ifdef DIAGNOSTIC
+               if (start < current->start)
+                       panic("uvm_map_clean: hole");
+#endif
 
                /*
-                * get object/offset.  can't be submap (checked above).
+                * No amap cleaning necessary if:
+                *
+                *      (1) There's no amap.
+                *
+                *      (2) We're not deactivating or freeing pages.
                 */
-               object = current->object.uvm_obj;
-               simple_lock(&object->vmobjlock);
-
+               if (amap == NULL ||
+                   (flags & (PGO_DEACTIVATE|PGO_FREE)) == 0)
+                       goto flush_object;
+
+               /* XXX for now, just in case... */
+               if (amap_clean_works == 0)
+                       goto flush_object;
+
+               amap_lock(amap);
+
+               offset = start - current->start;
+               size = (end <= current->end ? end : current->end) -
+                   start;
+
+               for (/* nothing */; size != 0; size -= PAGE_SIZE,
+                    offset += PAGE_SIZE) {
+                       anon = amap_lookup(&current->aref, offset);
+                       if (anon == NULL)
+                               continue;
+
+                       simple_lock(&anon->an_lock);
+
+                       switch (flags & (PGO_CLEANIT|PGO_FREE|PGO_DEACTIVATE)) {
+                       /*
+                        * XXX In these first 3 cases, we always just
+                        * XXX deactivate the page.  We may want to
+                        * XXX handle the different cases more
+                        * XXX specifically, in the future.
+                        */
+                       case PGO_CLEANIT|PGO_FREE:
+                       case PGO_CLEANIT|PGO_DEACTIVATE:
+                       case PGO_DEACTIVATE:
+                               pg = anon->u.an_page;
+                               if (pg == NULL) {
+                                       simple_unlock(&anon->an_lock);
+                                       continue;
+                               }
+
+                               /* skip the page if it's loaned or wired */
+                               if (pg->loan_count != 0 ||
+                                   pg->wire_count != 0) {
+                                       simple_unlock(&anon->an_lock);
+                                       continue;
+                               }
+
+                               uvm_lock_pageq();
+
+                               /*
+                                * skip the page if it's not actually owned
+                                * by the anon (may simply be loaned to the
+                                * anon).
+                                */
+                               if ((pg->pqflags & PQ_ANON) == 0) {
+#ifdef DIAGNOSTIC
+                                       if (pg->uobject != NULL)
+                                               panic("uvm_map_clean: "
+                                                   "page anon vs. object "
+                                                   "inconsistency");
+#endif
+                                       uvm_unlock_pageq();
+                                       simple_unlock(&anon->an_lock);
+                                       continue;
+                               }
+
+#ifdef DIAGNOSTIC
+                               if (pg->uanon != anon)
+                                       panic("uvm_map_clean: anon "
+                                           "inconsistency");
+#endif
+
+                               /* zap all mappings for the page. */
+                               pmap_page_protect(PMAP_PGARG(pg),
+                                   VM_PROT_NONE);
+
+                               /* ...and deactivate the page. */
+                               uvm_pagedeactivate(pg);
+
+                               uvm_unlock_pageq();
+                               simple_unlock(&anon->an_lock);
+                               continue;
+
+                       case PGO_FREE:
+                               amap_unadd(&entry->aref, offset);
+                               refs = --anon->an_ref;
+                               simple_unlock(&anon->an_lock);


