Subject: RE: problems regarding libc
To: Pai-Hsiang Hsiao <shawn@eecs.harvard.edu>
From: TAKEMURA, Shin <takemura@netbsd.org>
List: port-hpcmips
Date: 12/24/1999 12:10:19
-----Original Message-----
From: Pai-Hsiang Hsiao <shawn@eecs.harvard.edu>
To: TAKEMURA, Shin <takemura@netbsd.org>
Cc: port-hpcmips@netbsd.org <port-hpcmips@netbsd.org>
Date: Friday, December 24, 1999 8:23 AM
Subject: RE: problems regarding libc
>> I wrote a simple program that calls bzero(1MB) 100 times and
>> ran it under the time command on my MC-R500. The time command
>> says it takes 6.5 sec, so I get roughly 16MB/s. Is that enough?
>>
>> (The MC-R500 has a VR4111 clocked at 78MHz or 100MHz.)
>
>I don't know the theoretical bandwidth of either machine. Could
>you please try the C version of bcopy included in libc (libsa/string/)
>and see what the difference is?
Sorry, I couldn't find any differences.
>It might be better for you to test different block sizes,
>to avoid a biased cache effect.
I tried and got the following results:
bzero speed on MC-R700:
1KB: 69.3MB/s
4KB: 87.9MB/s
8KB: 88.6MB/s
16KB: 16.2MB/s
32KB: 16.3MB/s
64KB: 16.3MB/s
128KB: 16.3MB/s
256KB: 16.3MB/s
512KB: 16.2MB/s
1024KB: 16.2MB/s
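The MC-R700 figures above fall off sharply at one block size, which is the kind of cache effect the test was meant to expose. A small sketch that locates the steepest drop in the list (the function name largest_drop is only illustrative, and the numbers are copied from the figures above):

```python
# bzero throughput on the MC-R700, copied from the figures above (KB -> MB/s).
mc_r700 = {1: 69.3, 4: 87.9, 8: 88.6, 16: 16.2, 32: 16.3,
           64: 16.3, 128: 16.3, 256: 16.3, 512: 16.2, 1024: 16.2}

def largest_drop(results):
    """Return (size_before, size_after, ratio) for the steepest fall
    in throughput between two consecutive block sizes."""
    sizes = sorted(results)
    best = None
    for small, large in zip(sizes, sizes[1:]):
        ratio = results[small] / results[large]
        if best is None or ratio > best[2]:
            best = (small, large, ratio)
    return best

before, after, ratio = largest_drop(mc_r700)
print(f"steepest drop: {before}KB -> {after}KB, about {ratio:.1f}x slower")
# prints: steepest drop: 8KB -> 16KB, about 5.5x slower
```

Blocks up to 8KB seem to be served from the cache, and anything larger runs at main memory speed.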
bzero speed on my Celeron 300A PC:
1KB: 1075.6MB/s
4KB: 1910.4MB/s
8KB: 2169.5MB/s
16KB: 2285.7MB/s
32KB: 853.3MB/s
64KB: 656.4MB/s
128KB: 301.9MB/s
256KB: 255.5MB/s
512KB: 215.5MB/s
1024KB: 210.5MB/s
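A measurement of this kind can be sketched roughly as follows. This is not the program I used. It is a minimal Python sketch that uses ctypes.memset as a stand-in for bzero, and the function name fill_rate_mb_per_s and the 64MB total are assumptions made for the example. The per-call overhead of ctypes will lower the figures for small blocks, so only the overall shape of the curve is meaningful.

```python
import ctypes
import time

def fill_rate_mb_per_s(block_bytes, total_bytes=64 * 1024 * 1024):
    """Zero a buffer of block_bytes repeatedly until about total_bytes
    have been written, and return the rate in MB/s (1MB = 2**20 bytes)."""
    buf = ctypes.create_string_buffer(block_bytes)
    repeats = max(1, total_bytes // block_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        ctypes.memset(buf, 0, block_bytes)  # stand-in for bzero(buf, n)
    elapsed = time.perf_counter() - start
    return (repeats * block_bytes) / (1024 * 1024) / elapsed

if __name__ == "__main__":
    # Same block sizes as in the lists above.
    for kb in (1, 4, 8, 16, 32, 64, 128, 256, 512, 1024):
        print(f"{kb}KB: {fill_rate_mb_per_s(kb * 1024):.1f}MB/s")
```

Keeping the total amount written fixed while the block size varies makes every row of the output take a comparable time to measure.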
>The difference here is huge (4MB/s vs. 160MB/s), though the benchmark
>(hbench-OS, a patched version of lmbench) program itself might have
>problems.
>
>My conjecture is that 16MB/s is still far from enough. If it's a 78MHz machine
>with a 32-bit memory bus, the memory (write/copy) bandwidth should not be
>as low as 16MB/s. That's why I believe there are some problems in the library's
>bzero/bcopy. FYI, another Pentium III 350MHz under test has more than
>1000MB/s memory bandwidth.
Are you talking about L1 cache access speed? A Pentium III 350MHz system
has memory clocked at 100MHz, so the maximum memory
bandwidth is 32bit x 100MHz = 400MB/s. And my Celeron 300A PC's
effective read/write speed is about 200MB/s.
My R500's CPU clock is 78MHz and I think that its bus width is 16bit.
Generally speaking, a MIPS machine's memory bus clock is half of the
CPU clock or less. Thus the maximum speed is 78MHz/2 x 16bit = 78MB/s
and the effective speed may be 20-40MB/s. I think a speed of 16MB/s is so-so.
(My argument may be somewhat wrong because I'm not a hardware
engineer and I don't know much about memory systems.)
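The estimates above are just bus width times bus clock. A small sketch of the same arithmetic, where the bus widths and clocks are my guesses from the text and not datasheet values, and peak_mb_per_s is only an illustrative name:

```python
def peak_mb_per_s(bus_bits, bus_mhz):
    """Upper bound on bandwidth: one bus-width transfer per bus clock.
    Result is in MB/s with 1MB = 10**6 bytes."""
    return bus_bits / 8 * bus_mhz

pc_peak = peak_mb_per_s(32, 100)       # assumed 32bit bus at 100MHz
r500_peak = peak_mb_per_s(16, 78 / 2)  # assumed 16bit bus at half of 78MHz
measured = 100 * 1 / 6.5               # 100 x bzero(1MB) in 6.5 sec

print(pc_peak, r500_peak, round(measured, 1))
# prints: 400.0 78.0 15.4
print(f"R500 measured / peak: {measured / r500_peak:.0%}")
# prints: R500 measured / peak: 20%
```

The measured figure ignores the small difference between 2**20 and 10**6 bytes per MB, which does not change the conclusion.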
But 4MB/s is too slow.
The kernel sometimes makes a page uncacheable to avoid the virtual alias
problem. If that happens, memory access speed gets dramatically worse.
I wonder whether that is the cause of your problem.
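For reference, the usual condition for a virtual alias can be sketched as below. This is not the kernel's code. The 4KB page size, the 8KB virtually indexed direct-mapped cache, and the name may_alias are all assumptions made for illustration.

```python
PAGE_SIZE = 4 * 1024    # assumed page size
CACHE_SIZE = 8 * 1024   # assumed size of one way of a virtually indexed cache

def may_alias(va1, va2, cache_size=CACHE_SIZE, page_size=PAGE_SIZE):
    """True if two virtual mappings of the same physical page would use
    different cache lines, i.e. the addresses differ in the cache index
    bits that lie above the page offset."""
    color_mask = (cache_size - 1) & ~(page_size - 1)
    return ((va1 ^ va2) & color_mask) != 0

print(may_alias(0x1000, 0x2000))  # True: the addresses differ in bit 12
print(may_alias(0x1000, 0x3000))  # False: bit 12 is the same in both
```

When the check is true, the same physical data could sit in two cache lines at once, and making the page uncacheable is one simple way to keep the copies from going out of sync.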
Takemura