Subject: Re: fxp0: Device timeouts on AlphaServer 400
To: Frederik Meerwaldt <frederik@freddym.org>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: port-alpha
Date: 02/16/2003 09:14:32

On Sun, Feb 16, 2003 at 05:39:02PM +0100, Frederik Meerwaldt wrote:
> I just increased nmbclusters to 1024, hoping that this would have been
> the solution for the "fxp0: Device timeout" problem but unfortunately
> it wasn't.
>
> As soon as I am causing more or less heavy NFS traffic I keep on
> getting this message.
>
> Any further ideas/suggestions?

The Intel i82557 family of Ethernet chips depends on being able to
atomically update 16-bit words in memory, which the Alpha cannot do.
Specifically, in order for the chip to see a new transmit descriptor,
the driver must clear a bit in the *previous* command descriptor (which
has already been given to the chip). The bit is in a 16-bit word whose
adjacent 16-bit word is the descriptor status word. On platforms like the
Alpha, updating the control word rewrites the enclosing 32-bit quantity,
which clobbers the status word and creates a race condition whereby the
driver could fail to notice the "completed" bit in the status word.
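
To make the race concrete, here is a minimal simulation in C. It is a
sketch, not code from the fxp driver. The names (sim_desc,
clear_suspend_rmw32), the bit values, and the placement of the status word
in the low half of the 32-bit quantity are assumptions chosen for
illustration. The chip's write is modelled as happening between the CPU's
32-bit load and its 32-bit store.

```c
#include <stdint.h>

/* Illustrative bit values; hypothetical names, not the driver's. */
#define STATUS_COMPLETE 0x8000u /* chip sets this when the command is done */
#define COMMAND_SUSPEND 0x4000u /* driver clears this to let the chip go on */

/* Status word in the low 16 bits, command word in the high 16 bits. */
struct sim_desc {
	uint32_t word;
};

static uint16_t
desc_status(const struct sim_desc *d)
{
	return (uint16_t)(d->word & 0xffffu);
}

static uint16_t
desc_command(const struct sim_desc *d)
{
	return (uint16_t)(d->word >> 16);
}

/*
 * Clear the suspend bit the way a CPU without 16-bit stores has to:
 * load 32 bits, modify, store 32 bits.  If chip_completes_mid_update
 * is nonzero, the chip's status write lands between the load and the
 * store, and the store then writes the stale status back over it.
 */
static void
clear_suspend_rmw32(struct sim_desc *d, int chip_completes_mid_update)
{
	uint32_t tmp = d->word;			/* 32-bit load */

	if (chip_completes_mid_update)
		d->word |= STATUS_COMPLETE;	/* chip writes status */

	tmp &= ~((uint32_t)COMMAND_SUSPEND << 16);
	d->word = tmp;				/* 32-bit store clobbers it */
}
```

Starting from a descriptor with only the suspend bit set and calling
clear_suspend_rmw32(&d, 1) leaves desc_command(&d) at 0, as intended, but
desc_status(&d) no longer has STATUS_COMPLETE set: the completion is lost,
and a driver waiting for it would eventually report a timeout.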
--
-- Jason R. Thorpe <thorpej@wasabisystems.com>