Subject: Re: AS1200 problems
To: David Hopper <dhop@nwlink.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: port-alpha
Date: 01/31/2002 22:54:30
On Thu, Jan 31, 2002 at 03:42:14PM -0800, David Hopper wrote:
> I have been having frequent memory management faults on my AlphaServer 1200
> 5/533. I've been tracking NetBSD-current on this particular tincup since
> December '99. I _think_ I can trace the faults back to around the time of
> the multiprocessor changes.
>
> The memory management faults occur with seemingly random processes, but
> most often when I increase the load on the machine: it's happened most
> frequently with imapd (2001a from pkgsrc), moving a large directory from a
> RAID device to another SCSI HD, and more recently, and reproducibly, on a
> build.sh on various processes soon after the libs have been installed into
> destdir. The machine can go for a month if I leave it to its
> imap/webserving duties, or it faults daily if I am trying to sup & rebuild.
>
> I thought it was the RAM, so I tried a build using brand-new Crucial memory
> today. Same mm fault:
>
> CPU0 trap entry = 0x2 (memory management fault)
> CPU0 a0 = 0x0
> CPU0 a1 = 0x1
> CPU0 a2 = 0xffffffffffffffff
> CPU0 pc = 0x0
> CPU0 ra = 0x0
> CPU0 pv = 0xfffffc000037d260
> CPU0 curproc = 0xfffffc00rf98da00
> CPU0 pid = 8452, comm = cpp0
>
> Then a hang on the sync to disk.
>
> I'm at wit's end, but I absolutely will not install any other BSD. I know
> Jason uses an AS1200 for the multiproc stuff, so I can't imagine that he's
> got the same troubles...

Yah, strange ... I put my AS1200 through a lot of work. It's a paragon
of stability, in fact (I do a lot of gcc-current and binutils-current
testing on it).

It's really weird that both $pc and $ra are 0. It looks like you have
a trashed stack: a bogus value got restored into $ra, and then a ret
was executed, putting that bogus value into $pc.
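
To make that failure mode concrete, here is a toy model of it. This is
only a sketch: it is not Alpha code and not anything from the NetBSD
tree. The function name, the frame layout, and the addresses 0x1000 and
0x2000 are all made up for illustration; only the register names $pc and
$ra follow the trap dump.

```python
# Toy model of a function return on a register machine. NOT Alpha code;
# the register names mirror the trap dump purely for illustration.

def call_and_return(saved_ra, clobber_frame=False):
    """Simulate a callee epilogue: reload $ra from the frame, then 'ret'."""
    regs = {"pc": 0x1000, "ra": saved_ra}
    frame = {"saved_ra": saved_ra}   # prologue spills $ra to the stack frame

    if clobber_frame:
        frame["saved_ra"] = 0x0      # hypothetical stray write zeroes the slot

    regs["ra"] = frame["saved_ra"]   # epilogue restores $ra from the frame
    regs["pc"] = regs["ra"]          # 'ret' jumps to whatever is in $ra
    return regs

good = call_and_return(0x2000)
bad = call_and_return(0x2000, clobber_frame=True)
print(f"intact frame:  pc={good['pc']:#x} ra={good['ra']:#x}")
print(f"trashed frame: pc={bad['pc']:#x} ra={bad['ra']:#x}")
```

With the frame clobbered, both pc and ra end up 0, which is the pattern
in the dump. If I'm reading the trap arguments right, a2 = -1 marks an
instruction fetch fault and a0 = 0 is the faulting address, which would
fit a jump to address 0.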

Can you use gdb on your kernel to tell me which symbol $pv
(0xfffffc000037d260) points to?

--
-- Jason R. Thorpe <thorpej@wasabisystems.com>