Source-Changes-HG archive

[src/trunk]: src/share/man/man9 Add wapbl(9) man page.



details:   https://anonhg.NetBSD.org/src/rev/cfdd08c2b37f
branches:  trunk
changeset: 336898:cfdd08c2b37f
user:      riastradh <riastradh%NetBSD.org@localhost>
date:      Thu Mar 26 21:38:49 2015 +0000

description:
Add wapbl(9) man page.

diffstat:

 share/man/man9/wapbl.9 |  442 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 442 insertions(+), 0 deletions(-)

diffs (truncated from 446 to 300 lines):

diff -r 18aaa520fb22 -r cfdd08c2b37f share/man/man9/wapbl.9
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/share/man/man9/wapbl.9    Thu Mar 26 21:38:49 2015 +0000
@@ -0,0 +1,442 @@
+.\"    $NetBSD: wapbl.9,v 1.1 2015/03/26 21:38:49 riastradh Exp $
+.\"
+.\" Copyright (c) 2015 The NetBSD Foundation, Inc.
+.\" All rights reserved.
+.\"
+.\" This code is derived from software contributed to The NetBSD Foundation
+.\" by Taylor R. Campbell.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+.\" POSSIBILITY OF SUCH DAMAGE.
+.\"
+.Dd March 26, 2015
+.Dt WAPBL 9
+.Os
+.Sh NAME
+.Nm WAPBL ,
+.Nm wapbl_start ,
+.Nm wapbl_stop ,
+.Nm wapbl_begin ,
+.Nm wapbl_end ,
+.Nm wapbl_flush ,
+.Nm wapbl_discard ,
+.Nm wapbl_add_buf ,
+.Nm wapbl_remove_buf ,
+.Nm wapbl_resize_buf ,
+.Nm wapbl_register_inode ,
+.Nm wapbl_unregister_inode ,
+.Nm wapbl_register_deallocation ,
+.Nm wapbl_jlock_assert ,
+.Nm wapbl_junlock_assert
+.Nd write-ahead physical block logging for file systems
+.Sh SYNOPSIS
+.In sys/wapbl.h
+.Vt typedef void (*wapbl_flush_fn_t)(struct mount *, daddr_t *, int *, int) ;
+.Ft int
+.Fn wapbl_start "struct wapbl **wlp" "struct mount *mp" "struct vnode *devvp" \
+        "daddr_t off" "size_t count" "size_t blksize" \
+        "struct wapbl_replay *wr" \
+        "wapbl_flush_fn_t flushfn" "wapbl_flush_fn_t flushabortfn"
+.Ft int
+.Fn wapbl_stop "struct wapbl *wl" "int force"
+.Ft int
+.Fn wapbl_begin "struct wapbl *wl" "const char *file" "int line"
+.Ft void
+.Fn wapbl_end "struct wapbl *wl"
+.Ft int
+.Fn wapbl_flush "struct wapbl *wl" "int wait"
+.Ft void
+.Fn wapbl_discard "struct wapbl *wl"
+.Ft void
+.Fn wapbl_add_buf "struct wapbl *wl" "struct buf *bp"
+.Ft void
+.Fn wapbl_remove_buf "struct wapbl *wl" "struct buf *bp"
+.Ft void
+.Fn wapbl_resize_buf "struct wapbl *wl" "struct buf *bp" "long oldsz" \
+       "long oldcnt"
+.Ft void
+.Fn wapbl_register_inode "struct wapbl *wl" "ino_t ino" "mode_t mode"
+.Ft void
+.Fn wapbl_unregister_inode "struct wapbl *wl" "ino_t ino" "mode_t mode"
+.Ft void
+.Fn wapbl_register_deallocation "struct wapbl *wl" "daddr_t blk" "int len"
+.Ft void
+.Fn wapbl_jlock_assert "struct wapbl *wl"
+.Ft void
+.Fn wapbl_junlock_assert "struct wapbl *wl"
+.Sh DESCRIPTION
+.Nm ,
+or
+.Em write-ahead physical block logging ,
+is an abstraction for file systems to write physical blocks in the
+.Xr buffercache 9
+to a bounded-size log before writing them to their real destinations on disk.
+The name means:
+.Bl -tag -width "physical block" -offset abcd
+.It logging
+batches of writes are issued atomically via a log
+.It physical block
+only physical blocks, not logical file system operations, are stored in
+the log
+.It write-ahead
+blocks are written to the log before being written to the disk
+.El
+.Pp
+When a file system using
+.Nm
+issues writes (as in
+.Xr bwrite 9
+or
+.Xr bdwrite 9 Ns ),
+they are grouped in batches called
+.Em transactions
+in memory, which are serialized to be consistent with program order
+before
+.Nm
+submits them to disk atomically.
+.Pp
+Thus, within a transaction, after one write, another write need not
+wait for disk I/O, and if the system is interrupted, e.g. by a crash or
+by power failure, either both writes will appear on disk, or neither
+will.
+.Pp
+When a transaction is full, it is written to a circular buffer on
+disk called the
+.Em log .
+When the transaction has been written to disk, every write in the
+transaction is submitted to disk asynchronously.
+Finally, the file system may issue new writes via
+.Nm
+once enough writes submitted to disk have completed.
+.Pp
+After interruption, such as a crash or power failure, some writes
+issued by the file system may not have completed.
+However, the log is written consistently with program order and before
+file system writes are submitted to disk.
+Hence a consistent program-order view of the file system can be
+attained by resubmitting the writes that were successfully stored in
+the log using
+.Xr wapbl_replay 9 .
+This may not be the same state as just before the interruption -- writes in
+transactions that did not reach the disk will be excluded.
+.Pp
+For a file system to use
+.Nm ,
+its
+.Xr VFS_MOUNT 9
+method should first replay any journal on disk using
+.Xr wapbl_replay 9 ,
+and then, if the mount is read/write, initialize
+.Nm
+for the mount by calling
+.Fn wapbl_start .
+The
+.Xr VFS_UNMOUNT 9
+method should call
+.Fn wapbl_stop .
+.Pp
+Before issuing any
+.Xr buffercache 9
+writes, the file system must lock the current
+.Nm
+transaction with
+.Fn wapbl_begin ,
+which may sleep until there is room in the transaction for new writes.
+After issuing the writes, the file system must unlock the transaction
+with
+.Fn wapbl_end .
+Either all writes issued between
+.Fn wapbl_begin
+and
+.Fn wapbl_end
+will complete, or none of them will.
+File systems can assert that the transaction is locked with
+.Fn wapbl_jlock_assert ,
+or that it is unlocked with
+.Fn wapbl_junlock_assert .
+.Pp
+If a file system requires multiple transactions to initialize an
+inode, and needs to destroy partially initialized inodes during replay,
+it can register them by
+.Vt ino_t
+inode number before initialization with
+.Fn wapbl_register_inode
+and unregister them with
+.Fn wapbl_unregister_inode
+once initialization is complete.
+.Nm
+does not care whether the objects identified by
+.Vt ino_t
+values are
+.Sq inodes
+or
+.Sq quaggas
+or anything else -- file systems may use this to list any objects keyed
+by
+.Vt ino_t
+value in the log.
+.Pp
+When a file system frees resources on disk and issues writes to reflect
+the fact, it cannot then reuse the resources until the writes have
+reached the disk.
+However, as far as the
+.Xr buffercache 9
+is concerned, as soon as the file system issues the writes, they will
+appear to have been written.
+So the file system must not attempt to reuse the resource until the
+current
+.Nm
+transaction has been flushed to disk.
+.Pp
+The file system can defer freeing a resource by calling
+.Fn wapbl_register_deallocation
+to record the disk address of the resource and its length in bytes.
+Then, when
+.Nm
+next flushes the transaction to disk, it will pass an array of the disk
+addresses and lengths in bytes to a file-system-supplied callback.
+(Again,
+.Nm
+does not care whether the
+.Sq disk address
+or
+.Sq length in bytes
+is actually that; it will pass along
+.Vt daddr_t
+and
+.Vt int
+values.)
+.Sh FUNCTIONS
+.Bl -tag -width abcd
+.It Fn wapbl_start wlp mp devvp off count blksize wr flushfn flushabortfn
+Start using
+.Nm
+for the file system mounted at
+.Fa mp ,
+storing a log of
+.Fa count
+disk sectors at disk address
+.Fa off
+on the block device
+.Fa devvp ,
+writing blocks in units of
+.Fa blksize
+bytes.
+On success, stores an opaque
+.Vt "struct wapbl *"
+cookie in
+.Li * Ns Fa wlp
+for use with the other
+.Nm
+routines and returns zero.
+On failure, returns an error number.
+.Pp
+If the file system had replayed the log with
+.Xr wapbl_replay 9 ,
+then
+.Fa wr
+must be the
+.Vt "struct wapbl_replay *"
+cookie used to replay it, and
+.Fn wapbl_start
+will register any inodes that were in the log as if with
+.Fn wapbl_register_inode ;
+otherwise
+.Fa wr
+must be
+.Dv NULL .
+.Pp
+.Fa flushfn
+is a callback that
+.Nm
+will invoke as
+.Fa flushfn Ns Li ( Fa mp Ns Li , Fa deallocblks Ns Li , Fa dealloclens Ns Li , Fa dealloccnt Ns Li )
+just before it flushes a transaction to disk, with the transaction
+locked exclusively, where
+.Fa mp
+is the mount point passed to
+.Fn wapbl_start ,
+.Fa deallocblks
+is an array of
+.Fa dealloccnt
+disk addresses, and
+.Fa dealloclens
+is an array of
+.Fa dealloccnt
+lengths, corresponding to the addresses and lengths the file system
+passed to
+.Fn wapbl_register_deallocation .
+If flushing the transaction to disk fails,
+.Nm
+will call
+.Fa flushabortfn
+with the same arguments to undo any effects that
+.Fa flushfn
+had.
+.It Fn wapbl_stop wl force
+Flush the current transaction to disk and stop using
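
The all-or-nothing behaviour described in the DESCRIPTION section of the man page can be modelled in user space. What follows is a minimal sketch, not NetBSD kernel code. Every name in it (toy_wapbl, toy_begin, toy_add, toy_end, toy_commit, toy_replay, toy_crash, toy_demo) is hypothetical, and the model assumes a log that holds a single transaction at a time. It only illustrates why writes queued between a begin/end pair either all reach the disk after a crash or none do.

```c
#include <assert.h>
#include <string.h>

#define TOY_NBLK    8	/* blocks on the toy "disk" (assumed size) */
#define TOY_TXN_MAX 4	/* writes that fit in one transaction (assumed) */

struct toy_write { int blkno; int value; };

struct toy_wapbl {
	int disk[TOY_NBLK];			/* real destinations; survive a crash */
	struct toy_write log[TOY_TXN_MAX];	/* on-disk log; survives a crash */
	int nlog;				/* entries committed to the log */
	struct toy_write txn[TOY_TXN_MAX];	/* in-memory transaction; lost on crash */
	int ntxn;
	int locked;
};

void
toy_init(struct toy_wapbl *w)
{
	memset(w, 0, sizeof(*w));
}

/* Lock the transaction, loosely analogous to wapbl_begin(). */
void
toy_begin(struct toy_wapbl *w)
{
	assert(!w->locked);
	w->locked = 1;
}

/* Queue a block write in memory; returns -1 if the transaction is full. */
int
toy_add(struct toy_wapbl *w, int blkno, int value)
{
	assert(w->locked && blkno >= 0 && blkno < TOY_NBLK);
	if (w->ntxn == TOY_TXN_MAX)
		return -1;
	w->txn[w->ntxn].blkno = blkno;
	w->txn[w->ntxn].value = value;
	w->ntxn++;
	return 0;
}

/* Unlock the transaction, loosely analogous to wapbl_end(). */
void
toy_end(struct toy_wapbl *w)
{
	assert(w->locked);
	w->locked = 0;
}

/* Write the whole transaction to the log in one step (write-ahead). */
void
toy_commit(struct toy_wapbl *w)
{
	assert(!w->locked && w->nlog == 0);
	memcpy(w->log, w->txn, sizeof(w->txn));
	w->nlog = w->ntxn;
	w->ntxn = 0;
}

/* Copy logged writes to their real destinations, then empty the log.
 * The model uses this both for normal write-back and for crash replay. */
void
toy_replay(struct toy_wapbl *w)
{
	for (int i = 0; i < w->nlog; i++)
		w->disk[w->log[i].blkno] = w->log[i].value;
	w->nlog = 0;
}

/* Simulate a crash: in-memory state is lost; disk and log remain. */
void
toy_crash(struct toy_wapbl *w)
{
	w->ntxn = 0;
	w->locked = 0;
}

/* Returns 1 if the model behaves all-or-nothing across a crash. */
int
toy_demo(void)
{
	struct toy_wapbl w;

	toy_init(&w);
	/* Crash before the commit: neither write reaches the disk. */
	toy_begin(&w); toy_add(&w, 1, 10); toy_add(&w, 2, 20); toy_end(&w);
	toy_crash(&w);
	toy_replay(&w);
	if (w.disk[1] != 0 || w.disk[2] != 0)
		return 0;
	/* Crash after the commit: replay restores both writes. */
	toy_begin(&w); toy_add(&w, 1, 10); toy_add(&w, 2, 20); toy_end(&w);
	toy_commit(&w);
	toy_crash(&w);
	toy_replay(&w);
	return w.disk[1] == 10 && w.disk[2] == 20;
}
```

Calling toy_demo() runs both crash scenarios against a fresh model. The real wapbl(9) differs in many ways: its log is circular, transactions are flushed asynchronously, and replay is handled by wapbl_replay(9).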
