tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: ataraid(4) missing disk handling



On Tue, Mar 11, 2025 at 2:17 AM Christoph Badura <bad%bsd.de@localhost> wrote:
>
> On Mon, Mar 10, 2025 at 04:43:30PM +0200, Andrius V wrote:
> > Taylor has a good analysis and explanation of the issue
> > https://mail-index.netbsd.org/netbsd-bugs/2024/03/26/msg082202.html,
> > which I also noticed by testing ATA RAID setup on VIA controllers.
>
> > [...]
>
> Good analysis.
>
> > Given that three statuses are defined for adi_status
> > (https://nxr.netbsd.org/xref/src/sys/dev/ata/ata_raidvar.h#75), I
> > probably need to check if any of the flags are defined ((adi_status &
> > (ADI_S_ONLINE | ADI_S_ASSIGNED | ADI_S_SPARE)) instead
> > (https://netbsd.org/~andvar/ata_raid_fix2.diff).
>
> The first part seems fine.  Although I didn't look at what the difference
> between vn_bdev_open() vs. the deleted code is.
>

The major difference is that vn_bdev_open() increments  (*vpp)->v_writecount++;
(https://nxr.netbsd.org/xref/src/sys/kern/vfs_vnops.c#1535).

That prevents from the v_writecount >  0 assert failure on vn_close()
(https://nxr.netbsd.org/xref/src/sys/dev/ata/ld_ataraid.c#249)

My only concern was that on successful path incremented value will
stay equal 1 and if it may cause unexpected later on.
According to Taylor ld(4) doesn't have detach, so it supposed to be
fine. I also tested the code in various scenarios
and I didn't observe any issues However, if this concern would stay,
alternative is to replace vn_close() with VOP_CLOSE().
Downside that VOP_CLOSE() requires locks to be set before using it.

> In the second part this bit is weird for me to read:
>
> +               else
> +                       vp = NULL;
>                 if (vp == NULL) {
>
> I'd write it as:
>
> -               vp = ata_raid_disk_vnode_find(adi);
> +               vp = NULL;
> +               if (adi->adi_status &
> +                   (ADI_S_ONLINE | ADI_S_ASSIGNED | ADI_S_SPARE))
>                 if (vp == NULL) {
>                         /*
>                          * XXX This is bogus.  We should just mark the
>
> But Taylor seemed to be fine with it if I remember the discussion on ICB
> correctly.  A matter of style I guess.
>
I can apply this suggestion, looks better to me too.

> > Another alternative is to check that adi->adi_dev IS NULL as Taylor
> > proposed in his analysis thread.
>
> But that proposal never explained how an adv with adi->adi_dev==NULL ended
> up on the ataraid_disk_vnode_list which isn't supposed to happen!
>
> --chris


Home | Main Index | Thread Index | Old Index