On Fri, Mar 28, 2025 at 05:27:48PM +0100, Johnny Billquist wrote:
Here is the actual patch:
*** usr/src/sys/conf/boot/raboot.s.old Mon Aug 17 21:41:34 2009
--- usr/src/sys/conf/boot/raboot.s Mon Aug 17 22:44:12 2009
***************
*** 1,5 ****
--- 1,9 ----
/*
* SCCS id @(#)raboot.s 2.0 (2.11BSD) 4/13/91
+ *
+ * Code corrected as per the other primitive mscp drivers
+ * to handles other mscp controllers than DECs.
+ * /bqt - 20090817
*/
#include "localopts.h"
***************
*** 59,65 ****
MSCPSIZE = 64. / One MSCP command packet is 64bytes long (need 2)
! RASEMAP = 140000 / RA controller owner semaphore
RAERR = 100000 / error bit
RASTEP1 = 04000 / step1 has started
--- 63,69 ----
MSCPSIZE = 64. / One MSCP command packet is 64bytes long (need 2)
! RASEMAP = 100000 / RA controller owner semaphore
RAERR = 100000 / error bit
RASTEP1 = 04000 / step1 has started
***************
*** 153,170 ****
mov $RASEMAP,*$ra+RARSPH / set mscp semaphores
mov $RASEMAP,*$ra+RACMDH
mov *_bootcsr,r0 / tap controllers shoulder
! mov $ra+RACMDI,r0
1:
tst (r0)
! beq 1b / Wait till command read
! clr (r0)+ / Tell controller we saw it, ok.
2:
tst (r0)
! beq 2b / Wait till response written
clr (r0) / Tell controller we got it
rts pc
! icons: RAERR
ra+RARING
0
RAGO
--- 157,176 ----
mov $RASEMAP,*$ra+RARSPH / set mscp semaphores
mov $RASEMAP,*$ra+RACMDH
mov *_bootcsr,r0 / tap controllers shoulder
! mov $ra+RACMDH,r0
1:
tst (r0)
! bmi 1b / Wait till command read
! mov $ra+RARSPH,r0
2:
tst (r0)
! bmi 2b / Wait till response written
! mov $ra+RACMDI,r0
! clr (r0)+ / Tell controller we saw it, ok.
clr (r0) / Tell controller we got it
rts pc
! icons: RAERR + 033
ra+RARING
0
RAGO
So just out of curiosity, I took a look at the whole 2.11BSD rauboot.s
as I wanted to know what it is doing and what wisdom may be gleaned from
this patch. Not much, it seems, as it apparently fixes a different
problem.
But the initialization bits look similar:
RAERR = 100000 / error bit
RASTEP1 = 04000 / step1 has started
RAGO = 01 / start operation, after init
...
RARING = 8. / Ring base
...
/
/ RA initialize controller
/
mov $RASTEP1,r0
mov raip,r1
clr (r1)+ / go through controller init seq.
mov $icons,r2
1:
bit r0,(r1)
beq 1b
mov (r2)+,(r1)
asl r0
bpl 1b
...
icons: RAERR + 033
ra+RARING
0
RAGO
So it writes 0 into IP just once, and loops until the step 1 bit is set
in SA. Once there, it writes the values beginning at icons, each
corresponding to an initialization value for SA for each step, and waits
for each step bit by shifting RASTEP1.
Step 1: RAERR + 033 (100033)
Bit 15 needs to be 1, and RAERR does that, but it has nothing to
do with an error here. 033 corresponds to interrupt vector 154,
which is the default vector for the first MSCP controller. But
IE is 0, so it shouldn't matter. Ring length is 0 for both
commands and responses, corresponding to 2**0 == 1 entry each.
Step 2: ra + RARING
ra is the base of the communications area, but the controller
actually expects to be given the base of the response and
command descriptor rings, which are at +8 in the comm area.
That's the low 16 bits of the full Unibus or Qbus address.
Step 3: 0
That's the high bits of the full Unibus or Qbus address of the
comm area.
Step 4: RAGO
Set DMA burst = 0 (1 longword), request no "last fail" message,
and kick the controller into action.
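To double-check the bit arithmetic above, here is a small C sketch that
picks the Step 1 and Step 4 words apart. The field positions are my reading
of the port description (Step 1: bit 15 set, command ring length in bits
13:11, response ring length in bits 10:8, IE in bit 7, vector/4 in bits
6:0; Step 4: burst in bits 7:2, LF in bit 1, GO in bit 0). The struct and
function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of the word the host writes to SA in Step 1 (assumed layout). */
struct step1_fields {
    unsigned must_be_one;   /* bit 15, always written as 1 */
    unsigned cmd_ring_log2; /* bits 13:11, log2 of command ring length */
    unsigned rsp_ring_log2; /* bits 10:8, log2 of response ring length */
    unsigned int_enable;    /* bit 7, IE */
    unsigned vector;        /* bits 6:0 hold vector / 4 */
};

/* Fields of the word the host writes to SA in Step 4 (assumed layout). */
struct step4_fields {
    unsigned burst;         /* bits 7:2 */
    unsigned last_fail;     /* bit 1, request "last fail" message */
    unsigned go;            /* bit 0 */
};

static struct step1_fields decode_step1(uint16_t w)
{
    struct step1_fields f;

    f.must_be_one   = (w >> 15) & 1;
    f.cmd_ring_log2 = (w >> 11) & 7;
    f.rsp_ring_log2 = (w >> 8) & 7;
    f.int_enable    = (w >> 7) & 1;
    f.vector        = (unsigned)(w & 0177) << 2;  /* field is vector / 4 */
    return f;
}

static struct step4_fields decode_step4(uint16_t w)
{
    struct step4_fields f;

    f.burst     = (w >> 2) & 077;
    f.last_fail = (w >> 1) & 1;
    f.go        = w & 1;
    return f;
}
```

Feeding it 0100033 gives vector 0154, IE 0 and both ring length exponents
0; feeding it RAGO (01) gives GO 1 with LF and burst both 0.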
So, 9 instructions of code plus 4 words of data to get the thing going.
Nice.
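For those who don't read PDP-11 assembler every day, the same loop can be
sketched in C roughly like this. The fake_port struct and its accessors are
hypothetical stand-ins so the sketch runs without hardware; they model only
the step bits and are not a faithful controller model. Like the boot code,
it has no timeout and never looks at the error bit.

```c
#include <stdint.h>

#define RAERR   0100000   /* bit 15 */
#define RASTEP1 0004000   /* step 1 has started */
#define RASTEP4 0040000   /* step 4 has started */
#define RAGO    0000001   /* start operation, after init */

/* Hypothetical stand-in for the controller's IP/SA register pair. */
struct fake_port {
    uint16_t sa;        /* what the host currently reads from SA */
    uint16_t seen[4];   /* words the host wrote to SA, in order */
    int nseen;
};

/* Any write to IP restarts initialization; step 1 shows up in SA. */
static void write_ip(struct fake_port *p)
{
    p->sa = RASTEP1;
    p->nseen = 0;
}

static uint16_t read_sa(struct fake_port *p)
{
    return p->sa;
}

/* Each write to SA moves the fake controller on to the next step. */
static void write_sa(struct fake_port *p, uint16_t w)
{
    if (p->nseen < 4)
        p->seen[p->nseen++] = w;
    p->sa = (p->sa == RASTEP4) ? 0 : (uint16_t)(p->sa << 1);
}

/* Same structure as the boot code: one IP write, then four steps. */
static int ra_init(struct fake_port *p, const uint16_t icons[4])
{
    uint16_t step = RASTEP1;        /* mov $RASTEP1,r0 */
    int i = 0;

    write_ip(p);                    /* clr (r1)+ */
    do {
        while ((read_sa(p) & step) == 0)
            ;                       /* bit r0,(r1); beq 1b */
        write_sa(p, icons[i++]);    /* mov (r2)+,(r1) */
        step = (uint16_t)(step << 1);   /* asl r0 */
    } while ((step & 0100000) == 0);    /* bpl 1b */
    return i;                       /* number of words written */
}
```

With icons set to { RAERR + 033, ring base, 0, RAGO } the loop runs four
times, once per step, and stops when the shifted step mask reaches bit 15.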
Anyway, I've re-read most of the UDA50 programming manual this
morning and I'd like to share a few things from Section 9.2:
(https://bitsavers.org/pdf/dec/disc/uda50/AA-L621A-TK_UnibusPortDescription_1982.pdf)
In the event of an initialization error, the port driver must retry
the sequence at least once. It is suggested, however, that a second
failure be considered as meaning that the port/controller is "down".
That's where the requirement for (at least) one retry comes from. We do
that only in udamatch(), assuming it won't ever be needed in udaattach().
I don't think that's necessarily a bad assumption, given that udamatch()
must have succeeded talking to the controller for us to ever reach
udaattach().
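The retry rule from the manual boils down to something like the following
sketch. This is not the uda(4) code; init_with_retry, init_fn and the flaky
helper are names invented here, with init_fn standing for one complete
pass through Steps 1-4.

```c
#include <stddef.h>

/* One complete initialization attempt; returns 0 on success. */
typedef int (*init_fn)(void *ctx);

/*
 * Try the sequence up to max_attempts times. Returns the attempt number
 * that succeeded, or -1 if all failed (controller considered "down").
 */
static int init_with_retry(init_fn try_init, void *ctx, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_init(ctx) == 0)
            return attempt;
    }
    return -1;
}

/* Hypothetical test double: fails a fixed number of times, then works. */
struct flaky {
    int failures_left;
    int calls;
};

static int flaky_init(void *ctx)
{
    struct flaky *f = ctx;

    f->calls++;
    if (f->failures_left > 0) {
        f->failures_left--;
        return -1;
    }
    return 0;
}
```

With max_attempts set to 2 this matches the manual's suggestion: one retry,
and a second failure means the port/controller is treated as down.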
The host begins the initialization sequence either by issuing a bus
INIT or by writing any value to the IP register. The port must
guarantee that the host will read zeroes in SA on the next bus cycle.
Initialization then sequences through Steps 1-4 as described on the
following pages.
So we're kinda expected to read SA=0 once before we get to Step 1.
From the host's viewpoint, Step n is deemed to have begun when reading
SA shows the transition Sn 0-->1. Of course, Step n ends when Step
n+1 begins as just defined. This transition from Step n to Step n+1
may be accompanied by an interrupt, depending on whether interrupts
are enabled.
Obviously the transition to Step 1 cannot cause an interrupt, but then
we're not using interrupts anyway despite enabling them.
Steps 1-3 each are required to complete within 10 seconds. If any of
these steps fails to complete within that period, this is to be
treated as a host-detected fatal error.
This is where the 10s timeout in mscp_waitstep() comes from.
During initialization, the host must wait 100 microseconds after any
interrupt before reading the SA register to see if there was an error.
This is because the port may use the SA register to deliver the vector
address to the processor interrupt sequence. If it does, then time
will be required by the port to set SA to the value to be read by the
host initialization code.
We're probably good on that, as mscp_waitstep() waits 10ms, except for
the first read of SA, which is done with no delay. That's probably worth
fixing, just in case.
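A possible shape for such a fix, as a sketch only: delay 100 microseconds
before the very first read of SA, then poll every 10ms up to a caller
supplied timeout. This is not the existing mscp_waitstep(); wait_step,
sa_ops and the fake clock are hypothetical names so that the sketch runs
without hardware.

```c
#include <stdint.h>

#define SA_ERR 0100000          /* bit 15 of SA: error */

/* Hypothetical hooks for register access and delays. */
struct sa_ops {
    uint16_t (*read_sa)(void *ctx);
    void (*delay_us)(void *ctx, unsigned us);
    void *ctx;
};

/*
 * Wait until (SA & mask) == result. Returns 1 on success, 0 if the
 * error bit shows up or timeout_us microseconds have been spent polling.
 */
static int wait_step(const struct sa_ops *ops, uint16_t mask,
                     uint16_t result, unsigned timeout_us)
{
    const unsigned poll_us = 10000;     /* 10ms between reads */
    unsigned waited = 0;

    ops->delay_us(ops->ctx, 100);       /* settle time before first read */
    for (;;) {
        uint16_t sa = ops->read_sa(ops->ctx);

        if (sa & SA_ERR)
            return 0;
        if ((sa & mask) == result)
            return 1;
        if (waited >= timeout_us)
            return 0;
        ops->delay_us(ops->ctx, poll_us);
        waited += poll_us;
    }
}

/* Fake clock and controller: the step bit appears at ready_at_us. */
struct fake_hw {
    unsigned now_us;
    unsigned ready_at_us;
    unsigned first_read_us;
    int reads;
};

static uint16_t fake_read_sa(void *ctx)
{
    struct fake_hw *hw = ctx;

    if (hw->reads++ == 0)
        hw->first_read_us = hw->now_us;     /* remember first read time */
    return hw->now_us >= hw->ready_at_us ? 0004000 : 0;
}

static void fake_delay_us(void *ctx, unsigned us)
{
    struct fake_hw *hw = ctx;

    hw->now_us += us;
}
```

Passing a much shorter timeout_us for the wait on the Step 1 bit after the
IP write would also avoid sitting out the full 10s when a controller never
responds.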
This pattern should appear within 100 microseconds after the
hard-initialize.
This is about the Step 1 bit in SA appearing following a write to IP.
We're currently waiting the whole 10s if it doesn't appear, which
shouldn't do any harm but seems unnecessary. Also, this is where the CMD
controller is failing to react.
Upon receipt of the above data the port/controller begins running its
integrity check diagnostics. When finished, the port conditionally
interrupts the host as described above. If enabled, the interrupt
will take place whether the diagnostics succeeded or failed.
Step 1 must complete within 10 seconds after the host writes to the SA
register. The completion will result in an interrupt if IE was set to
one in Step 1.
This is what we expect to have happened towards the end of udamatch()
before we return 1, or as the comment says: "should have interrupted by
now". Since we waited for SA to indicate transition to Step 2, we can be
sure that the interrupt has happened by now.
So, I'm not sure this helps much with our problems with uda(4) on CMD
controllers, but I found it interesting nonetheless. The system with the
CMD controller will be offline and unreachable until Friday, so I won't
be able to conduct any more experiments until then.
Hans