Re: I/O bus reset to fix CMD MSCP controllers (and probably others)

To: Johnny Billquist <bqt%softjar.se@localhost>, port-vax%netbsd.org@localhost
Subject: Re: I/O bus reset to fix CMD MSCP controllers (and probably others)
From: Anders Magnusson <ragge%tethuvudet.se@localhost>
Date: Sat, 29 Mar 2025 09:11:25 +0100

I took a quick look at it.

udamatch() only does the initial steps to see if there is a uda at all, then leaves the rest to the mscp bus routines.

The MP_STEP1 part you quote below is just that. It resets the uda and sees if it gets any answer. If it succeeds, then STEP1MASK etc. are checked later on to walk through the initialization process.

I do not know where Robert's CMD controller fails, but as some of you have written, there is a logic error in udamatch() where it does not retry if nothing is found the first time. Hans, have you tried changing the udamatch() routine to try init multiple times?
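
The retry idea can be sketched in isolation. This is only a minimal illustration, not the code in uda.c: the controller is a hypothetical simulated one (struct fake_ctlr, write_ip(), read_sa()), and wait_step() and probe_with_retry() are made-up names. The two bit values are taken from the RAERR and RASTEP1 definitions in the raboot.s patch quoted further down.

```c
#include <assert.h>
#include <stdint.h>

#define SA_ERR   0x8000u /* error bit; RAERR (0100000) in the raboot.s patch */
#define SA_STEP1 0x0800u /* step 1 reached; RASTEP1 (04000) in the patch */

/* Hypothetical simulated controller, used only to exercise the retry logic. */
struct fake_ctlr {
	int ignored_inits;  /* IP writes the controller silently ignores */
	int polls_to_step1; /* SA reads needed after an accepted IP write */
	int countdown;      /* remaining reads before Step1; -1 = idle */
	int ip_writes;      /* how many times IP was written */
};

/* Writing anything to IP requests an initialization. */
static void
write_ip(struct fake_ctlr *c)
{
	c->ip_writes++;
	if (c->ignored_inits > 0) {
		c->ignored_inits--;
		return;
	}
	c->countdown = c->polls_to_step1;
}

/* Reading SA returns 0 until the simulated init has reached Step1. */
static uint16_t
read_sa(struct fake_ctlr *c)
{
	if (c->countdown < 0)
		return 0;
	if (c->countdown > 0) {
		c->countdown--;
		return 0;
	}
	return SA_STEP1;
}

/* Poll SA until (sa & mask) == result, or give up after max_polls reads. */
static int
wait_step(struct fake_ctlr *c, uint16_t mask, uint16_t result, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++)
		if ((read_sa(c) & mask) == result)
			return 1;
	return 0;
}

/* Start init up to `tries` times; return 1 once Step1 is seen without error. */
static int
probe_with_retry(struct fake_ctlr *c, int tries, int max_polls)
{
	for (int i = 0; i < tries; i++) {
		write_ip(c);
		if (wait_step(c, SA_ERR | SA_STEP1, SA_STEP1, max_polls))
			return 1;
	}
	return 0;
}
```

With a simulated controller that ignores the first IP write, probe_with_retry(&c, 1, 10) gives up, while probe_with_retry(&c, 2, 10) finds it on the second attempt.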

I should also note that when this code was written around 30 years ago, I did not have access to all the documentation that is available today.

So there are most likely errors in how it is implemented :-)

-- R


On 2025-03-29 at 02:23, Johnny Billquist wrote:

Actually, udamatch() confuses me. I don't understand how it is expected to deal with steps 3 and 4. And we have a proper initialization in mscp/mscp_subr.c in mscp_init(), which also walks through all the initialization steps.


I honestly don't understand the thinking behind that code...

  Johnny

On 2025-03-29 02:18, Johnny Billquist wrote:

Hmm. I haven't read through all the code, but I at least see some problems.


In the initialization, the code looks like this:

         bus_space_write_2(mi.mi_iot, mi.mi_iph, 0, 0); /* Start init */
         if (mscp_waitstep(&mi, MP_STEP1, MP_STEP1) == 0)
                 return 0; /* Nothing here... */

and so on for the next step. The problem is that mscp_waitstep then only checks that the controller moves to the next step, but cannot detect if the controller indicates any error. The first MP_STEP1 really should be ALLSTEPS, and there should be some code to do a reset for a second try in case you see an error condition.

But actually, even more proper would be to use STEP1MASK and compare against STEP1GOOD, and so on... There are all these nice values defined in mscp/mscpreg.h, but then they are not used, and we have this half-broken code instead. I wonder how that happened...?
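
To illustrate the difference between the narrow and the wide mask, here is a minimal sketch. It does not use the real definitions from mscp/mscpreg.h: the SA_* names, narrow_match() and classify_step() are hypothetical, the error and step 1 bits follow RAERR and RASTEP1 from the raboot.s patch, and the positions of the step 2 to 4 bits are an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SA_ERR   0x8000u /* error bit; RAERR (0100000) in the raboot.s patch */
#define SA_STEP4 0x4000u /* assumed bit position */
#define SA_STEP3 0x2000u /* assumed bit position */
#define SA_STEP2 0x1000u /* assumed bit position */
#define SA_STEP1 0x0800u /* RASTEP1 (04000) in the raboot.s patch */
#define SA_ALLSTEPS (SA_ERR | SA_STEP4 | SA_STEP3 | SA_STEP2 | SA_STEP1)

enum step_state { STEP_PENDING, STEP_OK, STEP_ERROR };

/* The narrow check: looks at the Step1 bit only. */
static int
narrow_match(uint16_t sa)
{
	return (sa & SA_STEP1) == SA_STEP1;
}

/*
 * The wide check: report an error as soon as the error bit is set,
 * success only when the masked value is exactly the expected one,
 * and "keep waiting" otherwise.
 */
static enum step_state
classify_step(uint16_t sa, uint16_t mask, uint16_t good)
{
	assert((good & SA_ERR) == 0);
	if (sa & SA_ERR)
		return STEP_ERROR;
	if ((sa & mask) == good)
		return STEP_OK;
	return STEP_PENDING;
}
```

With the narrow check, an SA value that has the error bit set looks the same as a controller that has not answered yet, so the caller can only time out. The wide check returns STEP_ERROR right away, which is the point where a reset and a second try could be started.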


   Johnny

On 2025-03-28 18:59, Hans Rosenfeld wrote:

On Fri, Mar 28, 2025 at 05:27:48PM +0100, Johnny Billquist wrote:


Here is the actual patch:

*** usr/src/sys/conf/boot/raboot.s.old  Mon Aug 17 21:41:34 2009
--- usr/src/sys/conf/boot/raboot.s      Mon Aug 17 22:44:12 2009
***************
*** 1,5 ****
--- 1,9 ----
   /*
    *    SCCS id @(#)raboot.s    2.0 (2.11BSD)   4/13/91
+  *
+  * Code corrected as per the other primitive mscp drivers
+  * to handles other mscp controllers than DECs.
+  * /bqt - 20090817
    */
   #include "localopts.h"

***************
*** 59,65 ****

   MSCPSIZE =    64.     / One MSCP command packet is 64bytes long (need 2)


! RASEMAP       =       140000  / RA controller owner semaphore

   RAERR =               100000  / error bit
   RASTEP1 =     04000   / step1 has started
--- 63,69 ----

   MSCPSIZE =    64.     / One MSCP command packet is 64bytes long (need 2)


! RASEMAP       =       100000  / RA controller owner semaphore

   RAERR =               100000  / error bit
   RASTEP1 =     04000   / step1 has started
***************
*** 153,170 ****
         mov     $RASEMAP,*$ra+RARSPH    / set mscp semaphores
         mov     $RASEMAP,*$ra+RACMDH
         mov     *_bootcsr,r0            / tap controllers shoulder
!       mov     $ra+RACMDI,r0
   1:
         tst     (r0)
!       beq     1b                      / Wait till command read
!       clr     (r0)+                   / Tell controller we saw it, ok.
   2:
         tst     (r0)
!       beq     2b                      / Wait till response written
         clr     (r0)                    / Tell controller we got it
         rts     pc

! icons:        RAERR
         ra+RARING
         0
         RAGO
--- 157,176 ----
         mov     $RASEMAP,*$ra+RARSPH    / set mscp semaphores
         mov     $RASEMAP,*$ra+RACMDH
         mov     *_bootcsr,r0            / tap controllers shoulder
!       mov     $ra+RACMDH,r0
   1:
         tst     (r0)
!       bmi     1b                      / Wait till command read
!       mov     $ra+RARSPH,r0
   2:
         tst     (r0)
!       bmi     2b                      / Wait till response written
!       mov     $ra+RACMDI,r0
!       clr     (r0)+                   / Tell controller we saw it, ok.
         clr     (r0)                    / Tell controller we got it
         rts     pc

! icons:        RAERR + 033
         ra+RARING
         0
         RAGO

Anyway, not sure if this helps, since now we're in PDP-11 assembler. But
maybe it gives a bit of an idea what the problem was.


I've actually looked at that and tried to understand it when I was
looking into this issue. The PDP-11 assembly doesn't scare me, I've
written my fair share of it and I'm still comfortable reading it. Too
bad the patch doesn't show the definition of RACMDI and RARSPH, and I'm
too lazy to google that. Maybe I'll boot the 11/73 later this weekend
and look at the full code.

What I did read was the MSCP programming document for the UDA50 that's
on Bitsavers.

But if someone points me at the specific code in NetBSD, I can try to see if
it's a similar kind of issue.


The problem is in sys/dev/qbus/uda.c, in particular in udamatch(). All
that udamatch() wants to do is to go through the first initialization
steps to cause an interrupt.

The state of the controller when udamatch() is running is that it has
been used already by VMB and boot to get the kernel loaded.

The UDA50 register interface really consists only of two registers, IP
and SA. Writing anything into IP should cause an initialization sequence
to be started, with SA indicating Step1 shortly after. If it doesn't,
udamatch() should try one more time, but currently doesn't. It only
retries the initialization if Step1 was reached and we then fail to
reach Step2.

The first thing I did was check that the CSR was mapped correctly and
that the physical addresses were what was expected. I also read and
wrote the registers directly at the VMB console. It would have been
surprising if anything was wrong there, as the same code works just fine
when the controller hasn't been touched since the last I/O bus reset
because we've booted off the network.

One of the things I did as an experiment was to have udamatch() write 0
into SA, and then read it once per second until something happened. Most
of the time, SA would have the error bit set after a few seconds. From
there, writing 0 into IP would kick off a controller initialization and
get SA to indicate Step1 2s later. But as I said, this incurs a boot
delay of around 12s.
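
The sequence of that experiment can be written down as a small simulation. This is only a sketch under stated assumptions: struct stale_ctlr and all function names are hypothetical, the delays are arbitrary parameters rather than measured values, and the model simply assumes that an IP write is honored only once SA shows the error bit.

```c
#include <assert.h>
#include <stdint.h>

#define SA_ERR   0x8000u /* error bit; RAERR (0100000) in the raboot.s patch */
#define SA_STEP1 0x0800u /* step 1 reached; RASTEP1 (04000) in the patch */

/* Hypothetical model of a controller left in a stale state by the boot code. */
struct stale_ctlr {
	int now;        /* simulated time in seconds */
	int err_at;     /* time at which SA starts showing the error bit; -1 = never */
	int step1_at;   /* time at which SA starts showing Step1; -1 = never */
	int err_delay;  /* seconds from the SA write to the error bit (assumed) */
	int init_delay; /* seconds from the IP write to Step1 (assumed) */
};

static uint16_t
read_sa(const struct stale_ctlr *c)
{
	if (c->step1_at >= 0 && c->now >= c->step1_at)
		return SA_STEP1;
	if (c->err_at >= 0 && c->now >= c->err_at)
		return SA_ERR;
	return 0;
}

/* Writing 0 into SA eventually drives the stale controller into an error. */
static void
write_sa(struct stale_ctlr *c)
{
	c->err_at = c->now + c->err_delay;
}

/* In this model an IP write only takes effect once the error bit is up. */
static void
write_ip(struct stale_ctlr *c)
{
	if (read_sa(c) & SA_ERR)
		c->step1_at = c->now + c->init_delay;
}

/* Poll SA once per simulated second until `bit` shows up or time runs out. */
static int
poll_for(struct stale_ctlr *c, uint16_t bit, int timeout)
{
	for (; c->now < timeout; c->now++)
		if (read_sa(c) & bit)
			return 1;
	return 0;
}

/* Returns the simulated seconds spent until Step1, or -1 on timeout. */
static int
recover(struct stale_ctlr *c, int timeout)
{
	assert(timeout > 0);
	write_sa(c);
	if (!poll_for(c, SA_ERR, timeout))
		return -1;
	write_ip(c);
	if (!poll_for(c, SA_STEP1, timeout))
		return -1;
	return c->now;
}
```

With err_delay = 3 and init_delay = 2, recover() reports 5 simulated seconds. The total is the sum of both waits, which is why this approach costs boot time on every probe of an already used controller.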


Hans
