Subject: Re: VOP_BMAP question
To: None <tech-kern@netbsd.org>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 12/24/2003 01:24:19
On Tue, Dec 23, 2003 at 04:12:38PM -0800, Bill Studenmund wrote:
> On Tue, Dec 23, 2003 at 07:09:53PM +0100, Juergen Hannken-Illjes wrote:
> > On Tue, Dec 23, 2003 at 09:55:37AM -0800, Bill Studenmund wrote:
> > > On Fri, Dec 19, 2003 at 10:13:22PM +0100, Juergen Hannken-Illjes wrote:
> > > > How does VOP_BMAP() handle fragments?
> > > >
> > > > Given a file with holes obtained from ftruncate(), what does VOP_BMAP()
> > > > return in its argument "bnp" if it finds a fragment?
>
> Holes are usually a problem. VOP_BMAP() isn't good for triggering
> allocation, which is what you need to fill a hole.
>
> > > > Is it the block number of the fragment or will it return (daddr_t)-1?
>
> It'll be the fragment's address.
>
> > > > Is it always ok to write a full block to "bnp"?
>
> If you mean ffs block, I don't think so. I think you have to know that you
> won't be writing past the end of the file.
This is the restriction. The last block returned by VOP_BMAP() may be a
fragment if a full block at that position would extend past the end of the
file. If it does not extend past the end of the file, it is always a full block.
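
To make the rule concrete, here is a toy model of it in user-space C. It is
only a sketch of the statement above, not ffs code: the function name
backed_bytes() and its parameters are made up for illustration, and it ignores
holes and every other ffs detail. It returns how many bytes at logical block
"lbn" are backed by storage, assuming the tail of the file is rounded up to
whole fragments.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): number of bytes at logical
 * block lbn that are backed by storage, for a file of file_size bytes on a
 * file system with block size bsize and fragment size fsize.
 */
uint64_t
backed_bytes(uint64_t file_size, uint64_t lbn, uint64_t bsize, uint64_t fsize)
{
	uint64_t start = lbn * bsize;	/* byte offset where the block begins */
	uint64_t tail;

	/* A block must be a whole number of fragments. */
	assert(fsize > 0 && bsize % fsize == 0);

	if (start >= file_size)
		return 0;		/* block lies entirely past end of file */
	if (file_size - start >= bsize)
		return bsize;		/* block ends within the file: full block */

	/* Last block: only the fragments needed to cover the tail. */
	tail = file_size - start;
	return (tail + fsize - 1) / fsize * fsize;
}
```

With an 8k/1k file system and a 20000 byte file, blocks 0 and 1 are full
blocks, while block 2 covers only the 3616 trailing bytes, rounded up to four
fragments. Writing a full 8k there would run past the allocated fragments.
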
> > > I think you've been bitten by an ffs ambiguity (since only ffs has
> > > "fragments").
> > >
> > > What ffs calls a fragment in its documentation (the 1k in an 8k/1k file
> > > system) is what the kernel internally calls a block. Since VOP_BMAP()
> > > deals with kernel things, a "fragment" is a block, so there is no problem.
> >
> > So ufs_bmaparray() first sets "maxrun = MAXPHYS / mp->mnt_stat.f_iosize - 1"
> > which is "64k / 8k - 1 == 7" in the example above. Then it computes "*runp" as
> > the number of 1k blocks (fragments) that are contiguous.
Here I was wrong: "*runp" is the number of (8k) blocks. From ufs_issequential():

	return (daddr0 + ump->um_seqinc == daddr1);

ump->um_seqinc is the number of fragments in a block. I am now sure the snippet
from sys/dev/vnd.c is correct.
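
As a rough illustration of why the run is counted in blocks, the following
user-space sketch mimics the test quoted above. It is not the kernel code:
is_sequential(), count_run() and transfer_size() are hypothetical names, the
disk addresses are assumed to be in fragment units, and "seqinc" stands in for
ump->um_seqinc (8 on an 8k/1k file system).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same test as the quoted line: the next block starts seqinc units later. */
int
is_sequential(int64_t seqinc, int64_t daddr0, int64_t daddr1)
{
	return (daddr0 + seqinc == daddr1);
}

/*
 * Count how many blocks after daddr[0] are contiguous on disk, capped at
 * maxrun. Each step of the loop advances by one full block, so the result
 * is a number of blocks, not fragments.
 */
int
count_run(const int64_t *daddr, size_t n, int64_t seqinc, int maxrun)
{
	int run = 0;
	size_t i;

	assert(seqinc > 0);
	for (i = 1; i < n && run < maxrun; i++) {
		if (!is_sequential(seqinc, daddr[i - 1], daddr[i]))
			break;
		run++;
	}
	return run;
}

/* Transfer size as computed in the vnd.c snippet: (1 + nra) * bsize. */
uint64_t
transfer_size(int run, uint64_t bsize)
{
	return (uint64_t)(1 + run) * bsize;
}
```

For block addresses 100, 108, 116, 200 with seqinc 8 and maxrun 7, the run is
2 blocks, and the matching transfer size is (1 + 2) * 8k = 24k.
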
> > From sys/dev/vnd.c:
> >
> > bsize = vnd->sc_vp->v_mount->mnt_stat.f_iosize;
> > ...
> > error = VOP_BMAP(vnd->sc_vp, bn / bsize, &vp, &nbn, &nra);
> > ...
> > sz = (1 + nra) * bsize;
> >
> > This looks like it would run on "blocks" instead of "fragments".
>
> Having a run size of ffs blocks does not mean that the block number
> returned is also in units of ffs blocks.
> ffs will read f_iosize blobs up until the end of the file, so it's
> appropriate for sz to be in f_iosize blobs.
>
> I've been quite confused by the code, so I'm not really sure if the bn /
> bsize is right; it might really need to be bn / f_bsize (which is
> "fragment" size).
>
> Take care,
>
> Bill
--
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)