Subject: Re: FFS journal
To: Kirill Kuvaldin <kirill.kuvaldin@gmail.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 07/03/2006 21:08:13
On Sun, Jul 02, 2006 at 07:59:50PM +0400, Kirill Kuvaldin wrote:
> [...]
> Optional:
>
> * Support for batching transactions:
> - it may be a significant performance win.
I fear that not batching transactions may have a significant impact on
performance, especially when compared to a softdep FFS.
> * API Documentation:
> - it may be helpful for the developers to understand what the
> journaling code does and how to use it.
I don't think this can be optional.
>
> III. TECHNICAL DETAILS
>
> * Journal internals:
> - The journal area (or log area) used to write journal entries is a
> fixed region allocated at filesystem initialization. The filesystem
> superblock must maintain a reference to the journal area, which
> contains its own superblock where the necessary information is
> stored. Two indices (start_index and end_index) mark the bounds of
> the active area of the journal, i.e. the part that contains active
> transactions; the area is used in circular fashion.
>
> +-------------+------------------------+------------------------+------+
> | Journal | | | |
> | superblock | t r a n s a c t i o n | t r a n s a c t i o n | |
> |+-----------+|+-------+ +-------+ |+-------+ +-------+ | |
> ||start_index||| | | | || | | | | |
> ||end_index ||| | | | ... || | | | ... | .... |
> || ... ||| | | | || | | | | |
> |+-----------+|+-------+ +-------+ |+-------+ +-------+ | |
> +-------------+------------------------+------------------------+------+
> Figure 1: Journal area on-disk representation
Shouldn't this have some constraints with disk sector boundaries?
Note it's a shot in the dark, I just thought about this when seeing
this figure ...
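To make the question concrete, here is a rough sketch (userland C, not
kernel code) of the circular bookkeeping, assuming both indices count
whole sectors so that no transaction starts in the middle of a sector.
Only start_index and end_index come from the proposal; struct jsb,
SECTOR_SIZE, nsectors and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumed sector size */

/* Hypothetical journal superblock; index names follow Figure 1. */
struct jsb {
	uint32_t start_index;	/* first sector of the oldest active transaction */
	uint32_t end_index;	/* first free sector after the newest transaction */
	uint32_t nsectors;	/* size of the log area, in sectors */
};

/* Round a byte count up to whole sectors. */
static uint32_t
bytes_to_sectors(uint32_t nbytes)
{
	return (nbytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Sectors currently holding active transactions. */
static uint32_t
journal_used(const struct jsb *j)
{
	assert(j->nsectors > 1);
	return (j->end_index + j->nsectors - j->start_index) % j->nsectors;
}

/* One sector is always left free so that start == end means "empty". */
static uint32_t
journal_free(const struct jsb *j)
{
	return j->nsectors - 1 - journal_used(j);
}

/*
 * Reserve sector-aligned space for a transaction of nbytes.
 * Returns 0 and stores the first sector in *first, or -1 if the
 * log is too full (the caller would then have to flush old blocks).
 */
static int
journal_reserve(struct jsb *j, uint32_t nbytes, uint32_t *first)
{
	uint32_t n = bytes_to_sectors(nbytes);

	if (n > journal_free(j))
		return -1;
	*first = j->end_index;
	j->end_index = (j->end_index + n) % j->nsectors;
	return 0;
}
```

Rounding every transaction up to a sector wastes some log space, but a
torn write then cannot damage a neighbouring transaction.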
>
> * Journaling API:
> The following ideas are inspired by the BeFS textbook (see [2]). Although
> there are only 3 functions for journal management, they may be enough for
> the rest of the filesystem to interact with the journaling code.
>
> - jffs_start_transaction():
> o acquire the journal semaphore, holding it until the transaction
> completes;
> o ensure that there is enough space available in the journal to hold
> this transaction; if there is, do the necessary preparation and
> allocate the transaction structures; otherwise, force blocks to be
> flushed out of the cache, preferably those that were part of
> previous transactions;
> o set the state of transaction to *running* allowing the filesystem
> code to add new blocks to form the transaction structure.
>
> - jffs_write_blocks():
> o during a transaction any code that modifies a block of the
> filesystem metadata must call this function on the modified data;
> o for the sake of performance it may be possible to modify only the
> in-memory journal structures and later flush them to the log.
>
> - jffs_end_transaction():
> o first, this function puts the transaction into the *locked* state,
> meaning that no more blocks can be added to the transaction;
> o write all in-memory transaction blocks to their appropriate places
> in the journal area. When the last block is written to the
> journal, the transaction is considered to be *finished*;
> o set the callback function that will change the transaction state to
> *completed* as soon as the journal entry has been completely flushed
> to disk;
> o release the journal semaphore.
>
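For what it's worth, the state transitions described above could be
sketched like this. It is only an illustration in userland C: the jtx_*
names, the JTX_IDLE state and the block limit are invented here, and the
semaphore, the log-space check and the actual I/O are left as comments.

```c
#include <assert.h>
#include <stddef.h>

/* States named in the proposal, plus a hypothetical idle state. */
enum jtx_state {
	JTX_IDLE, JTX_RUNNING, JTX_LOCKED, JTX_FINISHED, JTX_COMPLETED
};

#define JTX_MAXBLOCKS 8		/* arbitrary limit for this sketch */

struct jtx {
	enum jtx_state state;
	size_t nblocks;
	long blkno[JTX_MAXBLOCKS];	/* metadata blocks in the transaction */
};

/* Begin a transaction (semaphore and log-space check omitted). */
static int
jtx_start(struct jtx *t)
{
	if (t->state != JTX_IDLE && t->state != JTX_COMPLETED)
		return -1;
	t->nblocks = 0;
	t->state = JTX_RUNNING;
	return 0;
}

/* Record a modified metadata block; only legal while *running*. */
static int
jtx_add_block(struct jtx *t, long blkno)
{
	if (t->state != JTX_RUNNING || t->nblocks == JTX_MAXBLOCKS)
		return -1;
	t->blkno[t->nblocks++] = blkno;
	return 0;
}

/* Lock the transaction, then mark it finished once "written". */
static int
jtx_end(struct jtx *t)
{
	if (t->state != JTX_RUNNING)
		return -1;
	t->state = JTX_LOCKED;		/* no more blocks may be added */
	/* ... the blocks would be written to the journal area here ... */
	t->state = JTX_FINISHED;
	return 0;
}

/* Callback for when the journal entry has reached the disk. */
static void
jtx_log_flushed(struct jtx *t)
{
	assert(t != NULL);
	if (t->state == JTX_FINISHED)
		t->state = JTX_COMPLETED;
}
```

Rejecting jtx_add_block() outside the *running* state is what gives the
*locked* state its meaning: a caller that is late gets an error instead
of silently changing a transaction that is already being written.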
> * Journaling constraints on the cache subsystem:
> - journaling code must be able to lock disk blocks in the cache to
> prevent them from being flushed.
> - journaling code must know when a disk block is flushed to disk. It
> may be achieved with callback functions if the cache subsystem
> supports them. When the last block forming the transaction is flushed
> to disk, the transaction is considered to be completed.
To me it looks like this needs to hook into the softdep code. Softdep
has more or less the same constraints.
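As a rough illustration of these two constraints (pin a block, get
notified when it is flushed), here is a toy model in userland C. It is
not the softdep or buffer cache interface; struct cbuf, struct txn and
all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct txn {
	int pending;	/* blocks of this transaction not yet flushed */
	int completed;	/* set when the last block has been flushed */
};

/* Toy cache buffer with a pin flag and a flush callback. */
struct cbuf {
	int pinned;			/* must not be flushed while set */
	void (*on_flush)(struct cbuf *);	/* called after a flush */
	struct txn *txn;
};

/* Flush callback: the transaction completes with its last block. */
static void
txn_block_flushed(struct cbuf *b)
{
	struct txn *t = b->txn;

	assert(t != NULL && t->pending > 0);
	if (--t->pending == 0)
		t->completed = 1;
}

/* Add a block to a transaction and pin it in the cache. */
static void
txn_attach(struct txn *t, struct cbuf *b)
{
	b->pinned = 1;
	b->on_flush = txn_block_flushed;
	b->txn = t;
	t->pending++;
}

/* Once the journal entry is on disk the block may be written back. */
static void
txn_unpin(struct cbuf *b)
{
	b->pinned = 0;
}

/* Stand-in for the cache flushing a block; refuses pinned blocks. */
static int
cache_flush(struct cbuf *b)
{
	if (b->pinned)
		return -1;
	if (b->on_flush != NULL) {
		void (*cb)(struct cbuf *) = b->on_flush;

		b->on_flush = NULL;	/* fire the callback only once */
		cb(b);
	}
	return 0;
}
```

A transaction with two attached blocks stays incomplete until both have
been unpinned and flushed; flushing a block that is still pinned fails.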
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference