Subject: Re: Supporting sector size != DEV_BSIZE
To: Darrin B. Jewell <jewell@mit.edu>
From: Trevin Beattie <trevin@xmission.com>
List: tech-kern
Date: 06/24/2002 23:13:57
At 11:43 PM 6/24/2002 -0400, Darrin B. Jewell wrote:
>
>Bill Studenmund <wrstuden@netbsd.org> writes:
>| > . units based on the ffs superblock (see FFS_DEV_BSIZE below)
>|
>| Note: those are file system blocks aka frags.
>
>I would like to carefully assert that my definition of FFS_DEV_BSIZE
>is explicitly not the file system fragment size. Under my definition,
I suspect Bill simply didn't look closely enough at your macro definitions.
The unit I'm sure you're talking about is 2^(fs_fshift - fs_fsbtodb) bytes,
which I'm beginning to see originally meant "disk blocks" a.k.a. sectors,
but somewhere along the line got mixed up with DEV_BSIZEs.
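
To make the arithmetic concrete, here is a minimal sketch of that computation. Only the two field names (fs_fshift, fs_fsbtodb) come from the superblock; the struct and function names are my own stand-ins for illustration, not anything from a kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the two superblock fields used here. */
struct fs_shifts {
    int32_t fs_fshift;   /* log2 of the fragment size in bytes */
    int32_t fs_fsbtodb;  /* shift from fragments to "disk blocks" */
};

/* log2 of the size of the unit that the fragment-to-disk-block shift yields. */
static int ffs_dev_bshift(const struct fs_shifts *fs)
{
    int shift = fs->fs_fshift - fs->fs_fsbtodb;
    assert(shift >= 0 && shift < 31);  /* keep the shift below well defined */
    return shift;
}

/* Size in bytes of that unit: 2^(fs_fshift - fs_fsbtodb). */
static uint32_t ffs_dev_bsize(const struct fs_shifts *fs)
{
    return (uint32_t)1 << ffs_dev_bshift(fs);
}
```

With the values in the dumpfs output quoted further down (fsize shift 11, fsbtodb 0) this yields 2048 bytes; a file system with 1K fragments and fsbtodb 1 would yield 512.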
>
>I also _always_ define the kernel constant DEV_BSIZE to be 512 and
>_never_ use a different value for it. By treating it as a fundamental
>constant that never changes and is never retrieved from persistent media,
>it becomes an independent unit.
Too bad not all implementations have treated it as such. :-/
>| > . units based on the disklabel d_secsize
>| > ( this should always match the hardware device)
>|
>| Note: the latter isn't necessarily true. If you take a disk image & move
>| it to another system, it may change. Folks wish to continue using the
>| disklabel number.
>
>This is why I mentioned it. I am not as adamant about this,
>but I was thinking that the in core value for this field
>should always match the hardware sector size.
I tend to agree with you on this point. For one thing, if the disk label's
sector size were smaller than the media's sector size, then I/O on the
device could fail when the kernel tries to read partial sectors. On the
other hand, d_secsize should be used in preference to the real sector size
(if different) when interpreting the units of other members of the disk
label, such as p_offset and p_size.
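
As a rough sketch of what I mean by interpreting the label's members in label units: the helpers below scale p_offset by d_secsize first, and only then convert to whatever the hardware uses. The struct is a cut-down stand-in and the function names are hypothetical; this is not the real struct disklabel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the partition fields named above. */
struct label_part {
    uint32_t p_offset;  /* partition start, in label sectors */
    uint32_t p_size;    /* partition length, in label sectors */
};

/* Byte offset of the partition start, using the label's own sector size. */
static uint64_t part_start_bytes(uint32_t d_secsize, const struct label_part *p)
{
    assert(d_secsize != 0);
    return (uint64_t)p->p_offset * d_secsize;
}

/*
 * The same start expressed in hardware sectors. The assert documents the
 * assumption that the partition begins on a hardware sector boundary.
 */
static uint64_t part_start_hwsect(uint32_t d_secsize, uint32_t hw_secsize,
                                  const struct label_part *p)
{
    uint64_t bytes = part_start_bytes(d_secsize, p);
    assert(hw_secsize != 0 && bytes % hw_secsize == 0);
    return bytes / hw_secsize;
}
```

For example, a label written with 512-byte sectors and p_offset 64 would start at byte 32768, which is hardware sector 16 on 2048-byte media.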
> Currently,
>the device strategy routines use d_secsize to interpret
>bp->b_blkno. If d_secsize does not match the hardware sector
>size, then the device strategy routines will need to be
>modified to do the appropriate conversion.
AFAIK, NetBSD's device strategy routines all use DEV_BSIZE to interpret
b_blkno. I think it was done to simplify working with the new buffer cache
and layered file systems. (Of course, that was all implemented while I
wasn't looking. :) It sounds like you're looking at things from a
different flavor of BSD. Which one?
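
For what it's worth, the conversion a driver would need if b_blkno is counted in DEV_BSIZE units looks something like the sketch below. The function name and the error convention are mine, and I am assuming the hardware sector size is a multiple of 512; it is an illustration, not code from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSHIFT 9
#define DEV_BSIZE  (1 << DEV_BSHIFT)   /* 512, treated as a fixed constant */

/*
 * Convert a block number counted in DEV_BSIZE units into a hardware
 * sector number. Returns 0 on success, or -1 if the block does not
 * start on a hardware sector boundary (a partial-sector request).
 */
static int blkno_to_hwsector(int64_t b_blkno, uint32_t hw_secsize,
                             int64_t *sector)
{
    /* Assumption: the sector size is a nonzero multiple of DEV_BSIZE. */
    assert(hw_secsize >= DEV_BSIZE && hw_secsize % DEV_BSIZE == 0);

    int64_t per_sector = hw_secsize / DEV_BSIZE;
    if (b_blkno < 0 || b_blkno % per_sector != 0)
        return -1;
    *sector = b_blkno / per_sector;
    return 0;
}
```

On 2048-byte media, block 8 maps to sector 2, while block 3 is rejected because it falls in the middle of a sector.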
>
>| > At the time, I found the following definitions useful:
>| >
>| > #define FFS_DEV_BSHIFT(fs) ((fs)->fs_fshift-(fs)->fs_fsbtodb)
>|
>| That should be a constant in the ufs mount structure (the in-kernel
>| thing). We don't need to subtract those constants every time; they aren't
>| going to change.
Well, in working on the LFS code, I figured that with five (!) different
block sizes it would take 20 macros to cover all the conversions, and 10
shift constants. I did add a few more shift constants to the in-kernel
structure, and I tried simplifying a few existing macros, but then I
discovered that some non-kernel code was using them, and those programs broke.
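
Those counts are just combinatorics: n units give n*(n-1) directed conversions and n*(n-1)/2 distinct pairs. One hypothetical way to avoid the macro explosion is a single helper that takes the log2 size of each unit, sketched below. None of these names exist in the LFS or FFS headers, and the helper does not guard against overflow on very large counts.

```c
#include <assert.h>
#include <stdint.h>

/* Number of one-way conversions between n distinct units. */
static int conversions(int n)
{
    return n * (n - 1);
}

/* Number of distinct unit pairs, i.e. shift constants needed. */
static int shift_constants(int n)
{
    return n * (n - 1) / 2;
}

/*
 * Convert a count of units of size 2^from_shift bytes into units of
 * size 2^to_shift bytes. Converting to a larger unit rounds down.
 */
static int64_t convert_units(int64_t count, int from_shift, int to_shift)
{
    assert(count >= 0);
    assert(from_shift >= 0 && from_shift < 63);
    assert(to_shift >= 0 && to_shift < 63);

    if (from_shift >= to_shift)
        return count << (from_shift - to_shift);
    return count >> (to_shift - from_shift);
}
```

With five units this gives the 20 conversions and 10 shift constants mentioned above; converting 4 fragments of 2K into 512-byte blocks gives 16.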
>As I mentioned, here is the partial
>dumpfs output from a nextstep 3.3 operating system distribution CD:
>
># dumpfs ns33cd.ufs | head -22
>file system: ns33cd.ufs
>endian big-endian
>magic 11954 time Sat Nov 12 00:44:21 1994
>id [ 0 0 ]
>cylgrp static inodes 4.2/4.3BSD fslevel 0 softdep disabled
>nbfree 1406 ndir 3168 nifree 71290 nffree 51
>ncg 45 ncyl 89 size 182272 blocks 176323
>bsize 8192 shift 13 mask 0xffffe000
>fsize 2048 shift 11 mask 0xfffff800
>frag 4 shift 2 fsbtodb 0
This is cool! It's much the same sort of layout I got from my 640MB
optical disk formatted by NeXTSTEP. And if it's truly a full ffs partition
on CD, then it proves a theory I had that one could create an ffs file
system with 2K sectors, burn it on a CD, and use it just like a regular disk.
Tell me, did you read this super block from sector 3? That's where it was
written on my optical disk. Oh, wait -- I can just read it off my own copy
of the NeXTSTEP 3.3 CD!
Hmmm... very interesting. There are actually _multiple_ disk labels here,
and except for the first one (on sector 0), they are not aligned on a
sector boundary. The cd_label_blkno field, which changes for each copy of
the disk label (it's the block # of the label), is given in terms of
512-byte blocks, NOT cd_secsize (# of bytes per sector).
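
A small sketch of how a label's position falls out if its block number is taken in 512-byte units while the media uses a larger sector: the struct and function names are made up for illustration, and the block numbers in the usage note are sample inputs, not values read from the disc.

```c
#include <assert.h>
#include <stdint.h>

/* Where a label lands on the media: sector index plus byte offset in it. */
struct sector_pos {
    uint64_t sector;
    uint32_t offset;
};

/*
 * Locate a label whose block number is counted in 512-byte blocks on
 * media whose sectors are secsize bytes long.
 */
static struct sector_pos label_position(uint32_t label_blkno, uint32_t secsize)
{
    assert(secsize != 0);

    uint64_t bytes = (uint64_t)label_blkno * 512;
    struct sector_pos pos;
    pos.sector = bytes / secsize;
    pos.offset = (uint32_t)(bytes % secsize);
    return pos;
}
```

With 2048-byte sectors, block 0 sits at the start of sector 0, block 3 lands 1536 bytes into sector 0, and only multiples of 4 are sector aligned.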
Moving on to the root partition 'a'... well, that's supposed to start on
sector 0, but I don't see anything that resembles a super block... extra
disk labels on sectors 3, 7, and 11... this looks like an aout header on
sector 16... strings for a boot loader on sector 32... another boot loader
on sector 48... Ah, here it is, on sector 84. That's odd; I wonder where
they came up with that number?
-----------------------
Trevin Beattie "Do not meddle in the affairs of wizards,
trevin@xmission.com for you are crunchy and good with ketchup."
{:-> --unknown