NetBSD-Bugs archive
Re: kern/51241: USB Drive detect failure on Tegra K1
The following reply was made to PR kern/51241; it has been noted by GNATS.
From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: kern/51241: USB Drive detect failure on Tegra K1
Date: Sun, 19 Jun 2016 08:16:43 +0000 (UTC)
cyber%netbsd.org@localhost (Erik Berls) writes:
>armv7# umass0 at uhub1 port 1 configuration 1 interface 0
>umass0: Western Digital My Book 1230, rev 2.10/10.65, addr 2
>scsibus0 at umass0: 2 targets, 2 luns per target
>sd0 at scsibus0 target 0 lun 0: <WD, My Book 1230, 1065> disk fixed
>sd0(umass0:0:0:0): not ready, data = 00 00 00 00 04 01 00 00 00 00
>sd0: drive offline
>sd0: fabricating a geometry
>armv7# fdisk sd0
>^C^C^C^C
This turned out not to be a USB problem but a deadlock
between scsipi and wedge discovery.
scsipi (which is used by umass) creates a thread (e.g. scsibus0) that
is used to complete scsi requests that return an error. However, before
it starts servicing such requests, the thread probes the scsibus
synchronously for targets by calling scsibus_config().
The target discovery attaches scsi target drivers like sd(4),
and the sdattach routine then scans for wedges by calling
dkwedge_discover().
This is where things get stuck. dkwedge_discover() accesses
the target (to read things like a GPT) which issues scsi commands.
This all works fine if there are no errors, but in this
case the external USB drive answers an early access attempt
with an error 4/1 "Logical Unit Is in Process Of Becoming Ready".
Error handling is queued for the completion thread, but the
thread is still doing the bus probing, so the error is never
processed and the probe never returns.
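
The sequence above can be modelled in user space. This is only a
sketch, not the kernel code: run_model(), issue_command() and the
queue are hypothetical names, and the timeout exists solely so that
the demonstration terminates, whereas in the situation described the
wait never ends.

```python
import queue
import threading

def run_model(wait_timeout=0.2):
    """Model one completion thread; return True if the probe got stuck."""
    error_queue = queue.Queue()   # error completions waiting for the thread
    result = {}

    def issue_command():
        # The drive answers "not ready". Error handling is deferred to the
        # completion thread and the caller sleeps until it has run.
        handled = threading.Event()
        error_queue.put(handled)
        # Timeout is an assumption of this model so the demo can finish.
        return handled.wait(timeout=wait_timeout)

    def completion_thread():
        # Step 1: synchronous bus probe -> sd attach -> wedge discovery -> I/O.
        result["probe_completed"] = issue_command()
        # Step 2: only now does the thread start servicing queued errors.
        while not error_queue.empty():
            error_queue.get().set()

    t = threading.Thread(target=completion_thread)
    t.start()
    t.join()
    return not result["probe_completed"]

print("probe stuck:", run_model())
```

The wait inside issue_command() can only be ended by the thread that
is itself waiting, which is the shape of the deadlock.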
To solve this we need either to

- decouple scsibus probing from the error handling thread (it was
  put there to avoid deadlocks during autoconf, but I'm not sure if
  that is still relevant, see scsipi_base.c 1.79)

or

- decouple wedge discovery from device attachment (which works fine
  for all other devices and may create race conditions between
  device attachment and access to wedges).
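
As a rough illustration of the first option, the same kind of
user-space model can run the probe in a thread of its own, leaving the
completion thread free to handle the error. Again a sketch under
assumptions: run_decoupled_model() and its helpers are hypothetical
names and say nothing about how the kernel change would actually look.

```python
import queue
import threading

def run_decoupled_model(wait_timeout=1.0):
    """Model probing in its own thread; return True if the probe completed."""
    error_queue = queue.Queue()
    result = {}

    def issue_command():
        # Same as before: the error is queued and the caller waits for it.
        handled = threading.Event()
        error_queue.put(handled)
        return handled.wait(timeout=wait_timeout)

    def completion_thread():
        # Does nothing but service queued errors; None means "shut down".
        while True:
            item = error_queue.get()
            if item is None:
                return
            item.set()

    def probe_thread():
        # Bus probe -> attach -> wedge discovery, now outside the completer.
        result["probe_completed"] = issue_command()

    completer = threading.Thread(target=completion_thread)
    prober = threading.Thread(target=probe_thread)
    completer.start()
    prober.start()
    prober.join()
    error_queue.put(None)
    completer.join()
    return result["probe_completed"]

print("probe completed:", run_decoupled_model())
```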
Additionally we need to support devices that need time to become
ready, either by waiting in autoconf() or by polling asynchronously.
The latter would probably make the numbering of wedge devices (dkX)
unpredictable, but for wedges that wouldn't be a new issue.
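
Polling for readiness could look roughly like the following sketch.
FakeDrive and wait_until_ready() are hypothetical stand-ins, not
scsipi interfaces; the only facts taken from SCSI are the NOT READY
sense key (02h) and the 04h/01h code pair reported above.

```python
import time

NOT_READY = 0x02                # SCSI sense key NOT READY
BECOMING_READY = (0x04, 0x01)   # ASC/ASCQ: in process of becoming ready

class FakeDrive:
    """Hypothetical stand-in for a disk that needs a few polls to spin up."""
    def __init__(self, polls_until_ready):
        self.remaining = polls_until_ready

    def test_unit_ready(self):
        # Returns None on success, else (sense_key, asc, ascq).
        if self.remaining > 0:
            self.remaining -= 1
            return (NOT_READY, *BECOMING_READY)
        return None

def wait_until_ready(drive, max_polls=10, delay=0.01):
    """Poll until the drive is ready; return the polls used, or None."""
    for attempt in range(1, max_polls + 1):
        sense = drive.test_unit_ready()
        if sense is None:
            return attempt
        if sense[0] != NOT_READY or sense[1:] != BECOMING_READY:
            raise IOError("unexpected sense data: %r" % (sense,))
        time.sleep(delay)
    return None   # gave up: the drive never became ready

print(wait_until_ready(FakeDrive(polls_until_ready=3)))
```

Bounding the number of polls matters in either variant, since a drive
that never becomes ready should not hold up attachment forever.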
--
--
Michael van Elst
Internet: mlelstv%serpens.de@localhost
"A potential Snark may lurk in every tree."