port-macppc: Re: 1 Ghz CPU in AGP G4 causes NetBSD 1.6.2 hangs

Subject: Re: 1 Ghz CPU in AGP G4 causes NetBSD 1.6.2 hangs
To: None <port-macppc@NetBSD.org>
From: Tim Kelly <hockey@dialectronics.com>
List: port-macppc
Date: 09/28/2004 15:58:32

At 5:20 PM +0000 9/28/04, John Klos wrote:

>It looks like this one:
>
>http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/powerpc/mpc6xx/?hideattic=0&only_with_tag=netbsd-1-6#dirlist
>http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/powerpc/mpc6xx/Attic/cpu_subr.c?hideattic=0&only_with_tag=netbsd-1-6
>
>1.6.2 was before the move to oea from mpc6xx.

Thanks. This has the code that I suspected was present. Don did some
testing with the bootloader to which I added L2 cache configuration
support. I am back to suspecting that the L2 cache on his system is _not_
being properly configured.

According to his dmesg,
Sep 26 14:53:31 temp /netbsd: cpu0: 256KB L2 cache, 2MB L3 backside cache

However, if the above code was used, that 256k value is not read from
L2CR but is simply hardcoded:

if (l2cr & L2CR_L2E) {
    if (vers == MPC7450 || vers == MPC7455) {
        u_int l3cr;

        printf(": 256KB L2 cache");
        __asm __volatile("mfspr %0,%1" :
                         "=r"(l3cr) : "n"(SPR_L3CR) );
        if (l3cr & L3CR_L3E)
            printf(", %cMB L3 backside cache",
                   l3cr & L3CR_L3SIZ ? '2' : '1');
        printf("\n");
        return;
    }
    /* (excerpt ends here; the outer block is not closed in this snippet) */

Don's tests with my modified/experimental bootloader additions showed that
the L2CR value after Open Firmware, but before the kernel, is 0x80000000:

OpenBSD/Macppc boot
Configuring L1/L2/L3 caches
l2cr val before is 80000000
l2 enabled, test cache size
autosizing
memory claimed (0x200000 at 0x100000)
mem test set 80440000
changeL2Setting returned 80000000

(it hangs at this point due to an undiagnosed bug with late model G4s)

Additionally, in his OF tree:

0 > dev /cpus/PowerPC,G4@0 ls .properties
ff83dcc8: /l2-cache
ff83df00:   /l2-cache

name                    PowerPC,G4

device_type             cpu
reg                     00000000
cpu-version             80010201
state                   running
clock-frequency         3b9aca00
bus-frequency           05f03e4d
timebase-frequency      017c0f93
reservation-granule-size00000020
tlb-sets                00000040
tlb-size                00000080
d-cache-size            00008000
i-cache-size            00008000
d-cache-sets            00000080
i-cache-sets            00000080
i-cache-block-size      00000020
d-cache-block-size      00000020
graphics
performance-monitor
altivec
data-streams
l2-cache                ff83dcc8
l2cr                    80000000
existing                00000000 80000000 80000000 80000000
available               00003000 7fffd000 d0000000 20000000
translations            00000000 00003000 00000000 00000010 80000000 00080000 80000000 00000028
                        80080000 00001000 80080000 00000028 80081000 00001000 80081000 00000028
                        80082000 00001000 80082000 00000028 f0000000 00010000 f0000000 00000028
                        f0800000 00001000 f0800000 00000028 f0c00000 00001000 f0c00000 00000028
                        f2000000 00010000 f2000000 00000028 f2800000 00001000 f2800000 00000028
                        f2c00000 00001000 f2c00000 00000028 f4000000 00010000 f4000000 00000028
                        f4800000 00001000 f4800000 00000028 f4c00000 00001000 f4c00000 00000028
                        f5200000 00200000 f5200000 00000028 f5200000 00200000 f5200000 00000028
                        ... 00000150 bytes total

0 > dev /cpus/PowerPC,G4@0/l2-cache ls .properties
ff83df00: /l2-cache

name                    l2-cache
device_type             cache
i-cache-size            00040000
d-cache-size            00040000
i-cache-sets            00000200
d-cache-sets            00000200
i-cache-line-size       00000040
d-cache-line-size       00000040
cache-unified
clock-frequency         1dcd6500
l2-cache                ff83df00

 ok
0 > dev /cpus/PowerPC,G4@0/l2-cache/l2-cache ls .properties

name                    l2-cache
device_type             cache
i-cache-size            00200000
d-cache-size            00200000
i-cache-sets            00000800
d-cache-sets            00000800
i-cache-line-size       00000080
d-cache-line-size       00000080
cache-unified
clock-frequency         0ee6b280

The default L2CR value appears to be just enough to enable the L2 cache,
but carries no information about cache size, parity, SRAM type, clock or
write-through. The code snippet above would see the L2 cache as enabled,
but doesn't examine the remaining fields to ensure they are valid.

I am familiar with this because I have an Old World Mac with a ZIF carrier
card fitted with a 300MHz G3 w/ 512k L2 cache, and OF does not know how to
enable the cache. As I was doing a lot of work getting OpenBSD to work on
this Mac, I got tired of how _slow_ the G3 was without the cache on - even
the decompressing of the RAMDISK kernel was terribly slow. It took me about
two weeks, but I figured out how to test the cache size and then make
relatively conservative guesses about the rest. The background is at
http://www.dialectronics.com/bootloader. It works really well on G3s and
correctly calculates sizes for 7400 G4s, but doesn't work so well on later
G4s or ones with special stuff for SRAM. I haven't had enough access to G4
CPUs to see the full range of behavior. NetBSD fairly flies with it enabled
(but that machine is currently being used for another purpose).

My guess is that the AGP G4 Don has does not have enough OF code to walk
the node and extract the information properly, but more recent versions of
MacOS have patches for it and can do this. This allows the upgraded CPU to
run properly in MacOS, but leads to problems in NetBSD because that code
isn't present. That's pure speculation, but if the above code is what is in
1.6.2, my thoughts would be that there's not enough to properly configure
the L2 cache. A size field of 0 means 0M L2 on certain G3s (740s) and 2M L2
on G4. Only 256k in the D+I caches on his upgrade card are valid and the
rest overflows LIFO. Networking is doing more data caching, I would guess,
and maybe this is why the problem occurs more often here?

Disclaimer: I could be completely wrong. It's been known to happen. Frequently.

tim