Source-Changes-HG archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

[src/trunk]: src/sys/uvm adjust heuristics for read-ahead to skip the full re...



details:   https://anonhg.NetBSD.org/src/rev/80a79632f1d6
branches:  trunk
changeset: 832600:80a79632f1d6
user:      jdolecek <jdolecek%NetBSD.org@localhost>
date:      Sat May 19 15:18:02 2018 +0000

description:
adjust heuristics for read-ahead to skip the full read-ahead when last page of
the range is already cached; this speeds up I/O from cache, since it avoids
the lookup and allocation overhead

on my system I observed 4.5% - 15% improvement for cached I/O - from 2.2 GB/s to
2.3 GB/s for cached reads using non-direct UBC, and from 5.6 GB/s to 6.5 GB/s
for UBC using direct map

part of PR kern/53124

diffstat:

 sys/uvm/uvm_readahead.c |  23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diffs (51 lines):

diff -r e2b6c532803d -r 80a79632f1d6 sys/uvm/uvm_readahead.c
--- a/sys/uvm/uvm_readahead.c   Sat May 19 15:13:26 2018 +0000
+++ b/sys/uvm/uvm_readahead.c   Sat May 19 15:18:02 2018 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: uvm_readahead.c,v 1.9 2018/03/30 07:22:59 mlelstv Exp $        */
+/*     $NetBSD: uvm_readahead.c,v 1.10 2018/05/19 15:18:02 jdolecek Exp $      */
 
 /*-
  * Copyright (c)2003, 2005, 2009 YAMAMOTO Takashi,
@@ -40,7 +40,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: uvm_readahead.c,v 1.9 2018/03/30 07:22:59 mlelstv Exp $");
+__KERNEL_RCSID(0, "$NetBSD: uvm_readahead.c,v 1.10 2018/05/19 15:18:02 jdolecek Exp $");
 
 #include <sys/param.h>
 #include <sys/pool.h>
@@ -125,6 +125,23 @@
 
        DPRINTF(("%s: uobj=%p, off=%" PRIu64 ", endoff=%" PRIu64 "\n",
            __func__, uobj, off, endoff));
+
+       /*
+        * Don't issue read-ahead if the last page of the range is already cached.
+        * The assumption is that since the access is sequential, the intermediate
+        * pages would have similar LRU stats, and hence likely to be still in cache
+        * too. This speeds up I/O using cache, since it avoids lookups and temporary
+        * allocations done by full pgo_get.
+        */
+       mutex_enter(uobj->vmobjlock);
+       struct vm_page *pg = uvm_pagelookup(uobj, trunc_page(endoff - 1));
+       mutex_exit(uobj->vmobjlock);
+       if (pg != NULL) {
+               DPRINTF(("%s:  off=%" PRIu64 ", sz=%zu already cached\n",
+                   __func__, off, sz));
+               return endoff;
+       }
+
        off = trunc_page(off);
        while (off < endoff) {
                const size_t chunksize = RA_IOCHUNK;
@@ -147,7 +164,7 @@
 
                mutex_enter(uobj->vmobjlock);
                error = (*uobj->pgops->pgo_get)(uobj, off, NULL,
-                   &npages, 0, VM_PROT_READ, UVM_ADV_RANDOM, 0);
+                   &npages, 0, VM_PROT_READ, UVM_ADV_RANDOM, PGO_NOTIMESTAMP);
                DPRINTF(("%s:  off=%" PRIu64 ", bytelen=%zu -> %d\n",
                    __func__, off, bytelen, error));
                if (error != 0 && error != EBUSY) {



Home | Main Index | Thread Index | Old Index