Subject: port-sparc/35363: 3.1_STABLE problems on dual nocache 50 MHz SuperSPARC (390Z50)
To: port-sparc-maintainer@netbsd.org, gnats-admin@netbsd.org
From: johan@giantfoo.org
List: netbsd-bugs
Date: 01/05/2007 21:55:00
>Number: 35363
>Category: port-sparc
>Synopsis: MP broken for some 50 MHz SuperSPARC (390Z50) processors on 3.1
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-sparc-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jan 05 21:55:00 +0000 2007
>Originator: Johan A. van Zanten
>Release: NetBSD 3.1_STABLE
>Organization:
Hail Eris!
>Environment:
System: NetBSD vishnu 3.1_STABLE NetBSD 3.1_STABLE (MANGOLASSI.MP) #0: Sun Nov 5 17:37:17 CST 2006 johan@pangu:/tew/003/src/NetBSD/NetBSD-3/src/sys/arch/sparc/compile/MANGOLASSI.MP sparc
Architecture: sparc
Machine: sparc
>Description:
When the second processor is activated on multi-processor 3.1_STABLE sun4m
systems, NetBSD behaves erratically, with some programs seg faulting. The
system is extremely unstable and not usable, but it does not panic.
Please see my email to port-sparc:
http://mail-index.netbsd.org/port-sparc/2007/01/02/0000.html
Confirmation of the problem by Michael-John Turner:
http://mail-index.netbsd.org/port-sparc/2007/01/02/0002.html
Original report:
http://mail-index.netbsd.org/port-sparc/2006/12/29/0004.html
Please note that not all 50 MHz SuperSPARC processors trigger the problem.
I have the same OS build running without problems on a dual "390Z55"
system. My understanding is that a significant difference between CPUs
identified as "390Z50" and "390Z55" is that the '55 has 1 MB of ecache per
CPU, and the '50 has none.
See: http://mbus.sunhelp.org/modules/index.htm#super
Also, please note that I had a dual 390Z50 system running NetBSD
2.0.2_STABLE without problems, under significant load (Internet-connected
DNS server and MX, as well as KDC), before I upgraded to 3.1_STABLE.
Michael-John Turner's message above also suggests that the problem may
have been introduced between 2.x and 3.x.
>How-To-Repeat:
Install a NetBSD 3.1 MP kernel (GENERIC.MP produces the problem) on a
sparc with two 390Z50 50 MHz processors.
Boot the system multi-user.
>Fix:
Removing the second CPU eliminates the problem, as does switching to
multiple CPUs of a different type, such as "390Z55" or other, faster SPARC
CPUs, all of which appear to have cache (and a cache controller).
>Unformatted: