Subject: Re: Corrupt data when reading filesystems under Linux guest
To: Jed Davis <jdev@panix.com>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: port-xen
Date: 06/14/2005 15:32:42
On Mon, Jun 13, 2005 at 03:58:07AM +0000, Jed Davis wrote:
> In article <d8aovu$a59$1@sea.gmane.org>, Jed Davis <jdev@panix.com> wrote:
> > In article <d88qle$s6r$1@sea.gmane.org>, Jed Davis <jdev@panix.com> wrote:
> > >
> > > So I might even be able to fix this myself, if no-one more knowledgeable
> > > is working on it.
> >
> > And I think I have
>
> No, not really. That is, the patch I sent has, I'm pretty sure, a
> serious bug: if a pool allocation or xen_shm fails, xbdback_io will bail
> out after having potentially already enqueued several IOs. I think it
> won't mismanage memory, but it will still defer an xbd request while
> also performing part of it and then sending the guest a completion
> message. This is wrong in a number of ways.
I think we can deal with this by sending a partial xfer to the drive; it
shouldn't break anything. But obviously the completion message has to be
sent only once the whole request has been handled.
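
Something like the per-request accounting below would do, I think. This is
only a sketch; the member and function names are made up for illustration,
not the actual ones from xbdback.c, and it assumes the completion path runs
at splbio():

	struct xbdback_request {
		/* ... existing members ... */
		int	rq_iocount;	/* I/Os actually queued for this request */
		int	rq_error;	/* first error seen, if any */
	};

	/* called at biodone() time for each I/O making up the request */
	static void
	xbdback_iodone(struct buf *bp)
	{
		struct xbdback_request *req = bp->b_private;
		int s;

		s = splbio();
		if (bp->b_flags & B_ERROR)
			req->rq_error = bp->b_error ? bp->b_error : EIO;
		if (--req->rq_iocount == 0) {
			/*
			 * Last I/O for this request: only now is it safe
			 * to send the completion message to the guest.
			 * (hypothetical helper, stands for whatever puts
			 * the response on the ring)
			 */
			xbdback_send_reply(req, req->rq_error);
		}
		splx(s);
		/* unmap the shared pages and free bp here */
	}

So xbdback_io can queue as many I/Os as it managed to set up, defer the rest,
and the guest still sees exactly one completion for the whole request.
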
>
> And the other changes I'm making don't, so far as I know, sidestep this
> issue. I think I'll have to chain the actual IOs together, toss them
> if a pool_get fails, run them all at the end of the segment loop, and
> adjust the xenshm callback to match. Except that the callback can fail
> to install for want of memory, it looks like. That's... annoying.
If it's really an issue, we can preallocate the xen_shm_callback_entry in
the xbdback_request and adjust the xen_shm interface for this. This would,
at least, fix this issue.
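
For the xen_shm side, something along these lines is what I mean. Again just
a sketch: the real xen_shm_callback() in xen_shm_machdep.c allocates the
entry from a pool itself, and the field names here are from memory, so
double-check against the actual source:

	#include <sys/queue.h>

	/*
	 * The caller provides the callback entry (e.g. embedded in its
	 * struct xbdback_request), so registering the callback can no
	 * longer fail for want of memory.
	 */
	struct xen_shm_callback_entry {
		SIMPLEQ_ENTRY(xen_shm_callback_entry) xshmc_entries;
		int (*xshmc_callback)(void *);	/* retry function */
		void *xshmc_arg;		/* its argument */
	};

	static SIMPLEQ_HEAD(, xen_shm_callback_entry) xen_shm_callbacks =
	    SIMPLEQ_HEAD_INITIALIZER(xen_shm_callbacks);

	void
	xen_shm_callback_register(struct xen_shm_callback_entry *xshmc,
	    int (*callback)(void *), void *arg)
	{
		int s;

		xshmc->xshmc_callback = callback;
		xshmc->xshmc_arg = arg;
		s = splvm();
		SIMPLEQ_INSERT_TAIL(&xen_shm_callbacks, xshmc, xshmc_entries);
		splx(s);
	}

xbdback_request would then simply carry its own struct xen_shm_callback_entry,
so xbdback_io has nothing left to allocate at the point where it has to defer.
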
--
Manuel Bouyer, LIP6, Universite Paris VI. Manuel.Bouyer@lip6.fr
NetBSD: 26 years of experience will always make the difference