Subject: Re: SCSI
To: None <thorpej@nas.nasa.gov>
From: Elmar Kolkman <kolkmae@la1.apd.dec.com>
List: port-hp300
Date: 01/28/1997 07:59:38
This was mailed to me by Jason Thorpe, but I'll reply to other messages I got
too.
> On Mon, 27 Jan 1997 08:45:43 +0100 (CET)
> "Elmar Kolkman" <kolkmae@la1.apd.dec.com> wrote:
>
> > I've tried a bit more, and I'm sure it ISN'T the SCSI code. I will attach
> > the full boot log from SCSI at the end of this file, but I've also tried by
> > removing ALL SCSI hardware, including the controller, from my system. It
> > still hangs with the 'old' 1.2 kernel (I didn't have a 1.2b prerelease
> > kernel) when netbooting from my linux-machine.
>
> Thanks, your stack trace is _very_ helpful... From looking at the
> info you've provided, I see where the problem is, and I'm fairly sure
> I know what's causing it... More below..
>
> > > Well, I know the DCM driver works, since I'm using it to ppp to my ISP,
> > > so I can type this mail :-)
> >
> > But then again, your machine at least starts, which I can't say about mine.
> > ;-)
>
> Yes, but it's worth noting, I'm not using the DCM as the console (I'm
> using a Catseye framebuffer).
I thought so. I would love to try that too. But I've only a HIL connector, no
connector for a screen.
>
> > OK, but to make the debugging a bit easier, I will copy the whole booting
> > process, so you (all) see the rest of the process too. Maybe it is some
> > setting, because I don't have any documentation on this machine...
> > (I removed some '^H ^H stuff...)
>
> Cool... (BTW, that self-test is _really_ cool looking, with all the
> serial MUX entries :-)
Hm. And that's just a part of it. In real life I have four more of them
connected... ;-)
>
> > dca0 at scode 9 ipl 5 flags 0x1: no fifo
> > dcm0 at scode 10 ipl 3 flags 0
> > dcm1 at scode 11 ipl 3 flags 0xetrap: bad kernel read access at 0x6e
>
> ...ok, I'm assuming that the console is on dcm1? Can you tell
> me _exactly_ which board the console is on? (I'm assuming it's on
> the port marked "console", since that's the only one the remote bit
> affects :-)
Nope. It's dcm0 I'm using as the console port.
But, after I've made a cable to connect my HP2624a terminal to the HP9000 on
the small RS-232 connector, I'll try it without any dcm's. Or is it possible
to connect it to a PC's RS-232 connector with a straight 9-pin RS-232 cable ?
(I must have been tired yesterday to not think of that possibility). What I
mean: does this RS-232 connector have a HOST or a SLAVE pinout?
(This information isn't yet on the FAQ-page. What I discovered about my
hardware so far, I'll mail to its maintainer).
>
> Ok, here's a quick tutorial on using this kind of information...
>
> Note the address in the "trap" message:
>
> trap: bad kernel read access at 0x6e
>
> 0x6e is in the first "page" (i.e. it's less than 0x1000). This page
> is not mapped ... i.e. the pte for this page doesn't have the PG_V bit
> set. This causes dereferences of NULL pointers to cause the trap
> you're seeing (i.e. it's designed to catch bugs :-).
Sounds obvious.
> So, what that has told you is that you attempted to deref NULL. This
*I* didn't do anything. It goes wrong, even if I am at least 5 meters away
from the console... ;-<
> is the kernel equivalent of getting a SIGSEGV (and, like catching SIGSEGV
> in a user program, it's fatal).
>
> > trap type 8, code = 0x402074d, v = 0x6e
> > kernel program counter = 0xa6c0a
> > kernel: MMU fault trap
> > panic: MMU fault
> > Stopped at _Debugger+0x6: unlk a6
> > db> trace
> > _Debugger(200ac,a0ccf,144cdc,2304,144d0c) + 6
> > _panic(a0ccf,1,1,eb4c4,3) + 34
> > _trap(8,402074d,6e) + 21a
> > _addrerr(?)
> > _dcmxint(eb4c4,1,12,0,0) + 10c
>
> Ok...this is the part of the stack trace that tells you where the
> problem occurred. Basically, you were in the function dcmxint()
> when an address error occurred; the CPU jumps to that function
> when an invalid address is used.
>
> Ok, so, if you look at the dcmxint() function (sys/arch/hp300/dev/dcm.c,
> line 898), it's pretty clear what's happening...
I would love to, but I would need the sources. I could, of course, set up a
cross compile environment on my Linux-box (I saw some mailing on that too,
this week).
>
> You're getting an interrupt, and you're dereferencing "tp", which
> is NULL... it's NULL because the port hasn't yet been opened, which
> means the tty structure hasn't been allocated yet.
>
> "Oops!" :-)
Well, it's kind of usual for me to find this kind of bug. Don't know if
it's me or my hardware... Happened with my PC too.
>
> So, I have a question for you... "Do you have XON/XOFF flow control
> enabled on the terminal you're using?"
Yes. Both on the HP2624a and in minicom I used XON/XOFF.
>
> If you do, please try disabling it, and tell me if that helps. In the
> mean time, I'll look for the nicest way to fix that bug...
I'll try this as soon as I'm home.
> I may need to send you a kernel or two to netboot, for testing, as well.
>
> > _dcmpint(eb4c4,1,1) + 2c
> > _dcmintr(eb4c4,219df,2004,a20000,144e84) + de
> > _isrdispatch(6c) + 7a
> > _intrhand(?)
> > _dcmselftest(eb52e,c8ca8,eb52e,28) + a
>
> ...hmm, and given that this is in the trace... I decided to look and see
> what it does, and I've found a couple of slight bugs in it... *sigh*
This shouldn't be happening if I don't have any dcm's ?
> > _dcmattach(c8c7c,e8d88,9263e,de45c,de46c) + 82
> > _find_device(de45c) + 15e
> > _configure(c,ff801000,fffffffc,13a000,ffeffffc) + 92
> > _cpu_startup(c992c,c,ff801000,fffffffc,13a000) + 2f2
> > _main() + 4a
> > _main() + 4a
> > db>
> > ----- End of minicom.cap ----
Some more information: It also happens when I have only one dcm, but then it
occurs in the SCSI drivers. That was the original reason to post it on
this thread.
And: I have only 1 terminal connected: the console. The rest of the ports are
currently empty.
Elmar
--
Alp = 1) One of a number of ski mountains in Europe
2) A shouted request for assistance made by a European skier in
America. An appropriate reply is "What's Zermatter ?".
Henry Beard & Roy McKie
This mail was brought to you by:
Elmar Kolkman.
He can be reached as 'kolkmae@apd.dec.com' or 'elmar@usn.nl'