Port-vax archive


Re: I/O bus reset to fix CMD MSCP controllers (and probably others)



On Sat, Mar 29, 2025 at 09:11:25AM +0100, Anders Magnusson wrote:
> The MP_STEP1 parts you quote below is just that.  It resets the uda and sees
> if it gets any answer.
> If it succeeds then STEP1MASK etc is checked later on to walk through the
> initialization process.
> 
> I do not know where Robert's CMD controller fails,

After writing IP, repeated reads of SA return 0. I've put in a loop
after the IP write which reads SA once a second for up to 60s, and
sometimes there was no reaction at all until the retry.
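
A minimal sketch of that kind of polling loop, written as plain C so it
can be run outside the kernel. Everything here is hypothetical: poll_sa(),
the sa_read_fn/delay_fn callbacks and the fake_ctrl simulation are
illustration only and are not the code that was added to udamatch().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks standing in for the real SA read and delay. */
typedef uint16_t (*sa_read_fn)(void *ctx);
typedef void (*delay_fn)(void *ctx, unsigned seconds);

/*
 * Read SA once per second until it is non-zero or max_secs reads have
 * been made.  Returns the number of seconds waited, or -1 on timeout.
 * *sa_out receives the last value read.
 */
int
poll_sa(sa_read_fn read_sa, delay_fn delay, void *ctx,
    unsigned max_secs, uint16_t *sa_out)
{
	assert(read_sa != NULL && delay != NULL && sa_out != NULL);

	*sa_out = 0;
	for (unsigned secs = 0; secs < max_secs; secs++) {
		*sa_out = read_sa(ctx);
		if (*sa_out != 0)
			return (int)secs;
		delay(ctx, 1);
	}
	return -1;
}

/* Simulated controller: SA reads 0 until 'wake_at' seconds have passed. */
struct fake_ctrl {
	unsigned now;		/* simulated elapsed seconds */
	unsigned wake_at;	/* second at which SA becomes non-zero */
	uint16_t value;		/* SA value reported after waking up */
};

uint16_t
fake_read_sa(void *ctx)
{
	struct fake_ctrl *c = ctx;

	return c->now >= c->wake_at ? c->value : 0;
}

void
fake_delay(void *ctx, unsigned seconds)
{
	struct fake_ctrl *c = ctx;

	c->now += seconds;
}
```

With a simulated controller that stays silent for 9 seconds and then
reports 0x8006, poll_sa() with a 60s limit returns 9; with a controller
that never answers it returns -1.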

> but as some of you have
> written there is a logic error in udamatch() where it do not retry if
> nothing is found the first time.  Hans, have you tried to change the
> udamatch() routines to try init multiple times?

Yes, of course. That was the first thing I changed: retry after we
missed Step1. The controller would then sometimes wake up in time on the
2nd try. But by then we have already waited 10s in mscp_waitstep() for
the first try, and the 2nd try takes a few more seconds on top of that.
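
To make the cost of that retry concrete, here is a small sketch that
counts the total time spent across attempts. init_with_retry() and the
fake_attempt() stand-in are hypothetical names for illustration; the 10s
and 2s figures are only example values in the spirit of the numbers
above, not measurements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical init attempt: returns true if the controller reached
 * Step1, and stores how long the attempt waited in *waited_secs.
 */
typedef bool (*init_attempt_fn)(void *ctx, unsigned *waited_secs);

/*
 * Try to initialize up to max_tries times.  Returns the number of the
 * attempt that succeeded (1-based), or 0 if all attempts failed.
 * *total_secs receives the time spent over all attempts.
 */
int
init_with_retry(init_attempt_fn attempt, void *ctx, unsigned max_tries,
    unsigned *total_secs)
{
	assert(attempt != NULL && total_secs != NULL);

	*total_secs = 0;
	for (unsigned n = 1; n <= max_tries; n++) {
		unsigned waited = 0;
		bool ok = attempt(ctx, &waited);

		*total_secs += waited;
		if (ok)
			return (int)n;
	}
	return 0;
}

/* Simulation: first attempt times out after 10s, second succeeds in 2s. */
struct fake_attempts {
	unsigned calls;
};

bool
fake_attempt(void *ctx, unsigned *waited_secs)
{
	struct fake_attempts *f = ctx;

	f->calls++;
	if (f->calls == 1) {
		*waited_secs = 10;	/* assumed timeout of the first try */
		return false;
	}
	*waited_secs = 2;		/* assumed wake-up time on the retry */
	return true;
}
```

In this simulation the second attempt succeeds, but the total wait is the
full timeout of the first attempt plus the second attempt's wait.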

> Should also note that when this code was written around 30 years ago I did
> not have access to all the documentation that is available today.
> So there is most likely errors in how it is implemented :-)

Another thing I did early on was to change udamatch() to call
mscp_waitstep() with ALLSTEPS as mask, and to add error checking to
mscp_waitstep(). That, plus a bunch of printfs to tell me what's going
on and how long we've waited. None of that really made much of a
difference, but it helped somewhat to understand when the controller
state actually changed:

[   1.0000000] uba0 at mainbus0: Q22
[   1.0000000] udamatch: Init SA = 0 (0 secs)
[   1.0000000] udamatch: Init SA = 0 (1 secs)
[   1.0000000] udamatch: Init SA = 0 (2 secs)
[   1.0000000] udamatch: Init SA = 0 (3 secs)
[   1.0000000] udamatch: Init SA = 0 (4 secs)
[   1.0000000] udamatch: Init SA = 0 (5 secs)
[   1.0000000] udamatch: Init SA = 0 (6 secs)
[   1.0000000] udamatch: Init SA = 0 (7 secs)
[   1.0000000] udamatch: Init SA = 0 (8 secs)
[   1.0000000] udamatch: Init SA = 8006 (9 secs)
[   1.0000000] mscp_waitstep: SA = 8006 (count 0)
[   1.0000000] udamatch: nothing here
[   1.0000000] udamatch: Init SA = 0 (0 secs)
[   1.0000000] udamatch: Init SA = 0 (1 secs)
[   1.0000000] udamatch: Init SA = b00 (2 secs)
[   1.0000000] mscp_waitstep: SA = b00 (count 0)
[   1.0000000] mscp_waitstep: SA = 10ad (count 1)
[   1.0000000] uda0 at uba0 csr 172150 vec 774 ipl 17
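
For reading the SA values in the log, a small decoder may help. This is
a sketch that assumes the classic UDA50 SA layout (bit 15 = error, bits
14..11 = step 4..1); the SA_* names, sa_classify() and sa_error_code()
are made up here, and the 0x07ff error-code mask in particular is an
assumption that should be checked against mscpreg.h and the controller
documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SA bit layout (classic UDA50 style); names are hypothetical. */
#define SA_ERR		0x8000	/* controller reports an error */
#define SA_STEP4	0x4000	/* init step 4 */
#define SA_STEP3	0x2000	/* init step 3 */
#define SA_STEP2	0x1000	/* init step 2 */
#define SA_STEP1	0x0800	/* init step 1 */
#define SA_ERRCODE	0x07ff	/* assumed: low bits carry the error code */

enum sa_state {
	SA_IDLE,	/* no step or error bit set, e.g. SA = 0 */
	SA_IN_STEP1,
	SA_IN_STEP2,
	SA_IN_STEP3,
	SA_IN_STEP4,
	SA_ERROR
};

/* Classify an SA value; the error bit takes precedence over step bits. */
enum sa_state
sa_classify(uint16_t sa)
{
	if (sa & SA_ERR)
		return SA_ERROR;
	if (sa & SA_STEP4)
		return SA_IN_STEP4;
	if (sa & SA_STEP3)
		return SA_IN_STEP3;
	if (sa & SA_STEP2)
		return SA_IN_STEP2;
	if (sa & SA_STEP1)
		return SA_IN_STEP1;
	return SA_IDLE;
}

/* Extract the error code; only meaningful when the error bit is set. */
unsigned
sa_error_code(uint16_t sa)
{
	assert(sa & SA_ERR);
	return sa & SA_ERRCODE;
}
```

Under these assumptions, 8006 is the error bit with code 6, b00 has the
Step1 bit set, and 10ad has the Step2 bit set, which matches the order
in which the values show up in the log.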

I experimented a bit with how long udamatch() would repeatedly read SA.
Sometimes SA would indicate an error (8006) after 9s, sometimes it
wouldn't do anything for 60s. On the 2nd try it would usually be
detected as shown above, but not always. I figured that maybe it just
needs to be poked twice to reinitialize, but just writing IP twice with
a short delay in between also wasn't enough to get it going reliably.


Hans


-- 
%SYSTEM-F-ANARCHISM, The operating system has been overthrown

