Re: kern/51241: USB Drive detect failure on Tegra K1

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,cyber%netbsd.org@localhost
Subject: Re: kern/51241: USB Drive detect failure on Tegra K1
From: mlelstv%serpens.de@localhost (Michael van Elst)
Date: Sun, 19 Jun 2016 08:20:01 +0000 (UTC)

The following reply was made to PR kern/51241; it has been noted by GNATS.

From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: kern/51241: USB Drive detect failure on Tegra K1
Date: Sun, 19 Jun 2016 08:16:43 +0000 (UTC)

 cyber%netbsd.org@localhost (Erik Berls) writes:

 >armv7# umass0 at uhub1 port 1 configuration 1 interface 0
 >umass0: Western Digital My Book 1230, rev 2.10/10.65, addr 2
 >scsibus0 at umass0: 2 targets, 2 luns per target
 >sd0 at scsibus0 target 0 lun 0: <WD, My Book 1230, 1065> disk fixed
 >sd0(umass0:0:0:0): not ready, data = 00 00 00 00 04 01 00 00 00 00
 >sd0: drive offline
 >sd0: fabricating a geometry

 >armv7# fdisk sd0
 >^C^C^C^C

 This turned out not to be a USB problem but a deadlock between
 scsipi and wedge discovery.

 scsipi (which is used by umass) creates a thread (e.g. scsibus0) that
 is used to complete scsi requests that return an error. However, before
 it starts servicing such requests, the thread itself probes the scsibus
 synchronously for targets by calling scsibus_config().

 The target discovery attaches scsi target drivers like sd(4),
 and the sdattach routine then scans for wedges by calling
 dkwedge_discover().

 This is where things get stuck. dkwedge_discover() accesses
 the target (to read things like a GPT), which issues scsi commands.
 This all works fine if there are no errors, but in this
 case the external USB drive answers an early access attempt
 with sense code 04/01, "Logical Unit Is in Process Of Becoming Ready"
 (the "04 01" in the sense data above).
 Error handling is queued for the completion thread, but the
 thread is still doing the bus probing, so the error is never
 handled and the access never completes.
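
 The hang can be illustrated with a small single-threaded toy model.
 This is only a sketch, not the kernel code: CompletionThread,
 attach_disk and max_waits are made-up names, and the bounded wait
 exists only so the model terminates where the real wait does not.

```python
from collections import deque


class CompletionThread:
    """Toy stand-in for the per-bus scsipi thread (e.g. scsibus0)."""

    def __init__(self):
        self.queue = deque()   # error recovery waiting for the thread
        self.probing = False   # True while the thread runs the bus probe

    def run_queued(self):
        """Run queued error recovery; impossible while probing."""
        if self.probing:
            return 0
        done = 0
        while self.queue:
            self.queue.popleft()()
            done += 1
        return done


def attach_disk(thread, drive_ready, max_waits=3):
    """Model a wedge scan issued from inside the bus probe.

    Returns True if the scan can never complete (the deadlock).
    """
    thread.probing = True              # the probe runs on the completion thread
    state = {"recovered": drive_ready}
    if not drive_ready:
        # Drive reports "becoming ready"; recovery is queued for the thread.
        thread.queue.append(lambda: state.update(recovered=True))
    # The wedge scan waits for its read. The model gives up after
    # max_waits attempts (hypothetical bound) instead of waiting forever.
    for _ in range(max_waits):
        if state["recovered"]:
            break
        thread.run_queued()            # no-op: the thread is still probing
    deadlocked = not state["recovered"]
    if not deadlocked:
        thread.probing = False         # probe finishes normally
    return deadlocked
```

 With a drive that is ready at once, attach_disk() returns False; with
 one that first reports "becoming ready", the queued recovery never
 runs and it returns True.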

 To solve this we need to either

   decouple scsibus probing from the error handling thread (it was
   put there to avoid deadlocks during autoconf, but I'm not sure if
   that is still relevant, see scsipi_base.c 1.79).

 or

   decouple wedge discovery from device attachment (the current
   coupling works fine for all other devices, and decoupling may
   create race conditions between device attachment and access to
   wedges).
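
 As a rough illustration of the second option, the toy model below
 postpones the wedge scans until the probe is done. It is a sketch
 under my own assumptions, not a patch: probe_bus and defer_discovery
 are hypothetical names, and the race conditions mentioned above are
 not modelled.

```python
from collections import deque


def probe_bus(targets, defer_discovery):
    """targets: list of (disk_name, ready_on_first_access).

    Returns the disks whose wedge scan completed.
    """
    recovery = deque()   # error handling queued for the completion thread
    deferred = []        # wedge scans postponed until after the probe
    scanned = []
    probing = True       # the completion thread is busy with the probe

    def scan(name, ready):
        if not ready:
            # A not-ready error needs the completion thread.
            if probing:
                return False       # thread busy: the scan would hang here
            recovery.append(name)
            recovery.popleft()     # thread idle: recovery runs right away
        scanned.append(name)
        return True

    for name, ready in targets:
        if defer_discovery:
            deferred.append((name, ready))   # attach only, scan later
        elif not scan(name, ready):
            return scanned                   # stuck: the probe never finishes

    probing = False                          # probe done, thread is free again
    for name, ready in deferred:
        scan(name, ready)
    return scanned
```

 With defer_discovery=False a single not-ready disk stops the whole
 probe; with defer_discovery=True every disk is scanned once the
 thread is free to handle errors.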

 Additionally we need to support devices that need time to become
 ready, either by waiting during autoconfiguration or by polling
 asynchronously. The latter would probably make the unit numbers of
 wedge devices (dkX) unpredictable, but for wedges that wouldn't be
 a new issue.
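
 To show why asynchronous polling affects the numbering, here is a
 small sketch. It assumes one wedge per disk and that dk units are
 handed out in the order disks become ready; assign_wedges and the
 poll counts are made up for illustration.

```python
def assign_wedges(disks):
    """disks: list of (disk_name, polls_until_ready), in attach order.

    Returns {disk_name: wedge_unit}; units go to whichever disk
    becomes ready first.
    """
    pending = list(disks)
    units = {}
    tick = 0
    while pending:
        still_waiting = []
        for name, ready_at in pending:
            if tick >= ready_at:
                units[name] = f"dk{len(units)}"   # next free unit number
            else:
                still_waiting.append((name, ready_at))
        pending = still_waiting
        tick += 1                                 # one polling interval passes
    return units
```

 A slow sd0 attached before a fast sd1 ends up with the higher unit
 number, so the dkX assignment depends on device timing rather than
 on attach order.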

 -- 
                                 Michael van Elst
 Internet: mlelstv%serpens.de@localhost
                                 "A potential Snark may lurk in every tree."
