Port-sparc64 archive
Re: Sizeof int_fastN_t vs uint_fastN_t data types
On Mon, 18 Nov 2024 10:54:45 +0100
Martin Husemann <martin%duskware.de@localhost> wrote:
> On Mon, Nov 18, 2024 at 09:00:59AM +0000, Sad Clouds wrote:
> > I would expect discrepancy between OSes, e.g. Linux aarch64 defines
> > fast versions for 16 and 32 bit integers at 8 bytes, NetBSD aarch64
> > defines fast versions for 16 and 32 bit integers at 4 bytes. However,
> > the same OS defining different sizes for signed and unsigned fast
> > integers seems a bit strange. Is there really that much overhead for
> > signed integer arithmetic on sparc64?
>
> Registers are 8 byte and sign extension is (relatively) expensive.
> There are only native arithmetic instructions for 4 byte and 8 byte
> values (and the 4 byte versions may still need sign extension at the
> end).
>
> You can try and look at the differences in the assembler code generated
> for _fast vs. non-fast variants of the same code. Compilers have gotten
> incredibly smart and I would expect the optimized code to be not much
> different in most cases, but it would be interesting to compare that for some
> real-world cases where the _fast variants are used.
>
> Martin
Hi, thanks for the info. I think you're right about there being one set
of 8-byte registers. However, there appear to be differences between
Solaris and NetBSD on sparc64, which is what got me wondering...
Solaris:
--- Sizes Of Integral Types In Bytes ---
Data type             :  Exact  Least  Fast
--------------------- : ---------------------------
int8_t/uint8_t        :  1/1    1/1    1/1
int16_t/uint16_t      :  2/2    2/2    4/4
int32_t/uint32_t      :  4/4    4/4    4/4
int64_t/uint64_t      :  8/8    8/8    8/8
NetBSD:
--- Sizes Of Integral Types In Bytes ---
Data type             :  Exact  Least  Fast
--------------------- : ---------------------------
int8_t/uint8_t        :  1/1    1/1    8/1
int16_t/uint16_t      :  2/2    2/2    8/2
int32_t/uint32_t      :  4/4    4/4    8/4
int64_t/uint64_t      :  8/8    8/8    8/8
Looking at the assembly instructions below, the 32-bit function has an
extra SRA instruction, which sign-extends the 32-bit result to fill the
64-bit register. This may make it slightly slower than the 64-bit
version. So it looks like the NetBSD int_fastN_t definitions may be
correct.
$ cat test_int32.c
#include <stdio.h>
#include <stdint.h>

/* volatile forces both arguments to be stored to and reloaded from the stack */
int32_t test_int32(volatile int32_t n1, volatile int32_t n2)
{
    return n1 + n2;
}

int main(void)
{
    printf("%d\n", test_int32(1, 1));
    return 0;
}
$ cat test_int64.c
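#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* volatile forces both arguments to be stored to and reloaded from the stack */
int64_t test_int64(volatile int64_t n1, volatile int64_t n2)
{
    return n1 + n2;
}

int main(void)
{
    /* PRId64 rather than %d: int64_t is wider than int */
    printf("%" PRId64 "\n", test_int64(1, 1));
    return 0;
}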
$ gcc -m64 -O2 -o test_int32 test_int32.c
$ gcc -m64 -O2 -o test_int64 test_int64.c
$ dis -F test_int32 test_int32
test_int32()
    test_int32:       d0 23 a8 7f  st    %o0, [%sp + 0x87f]
    test_int32+0x4:   d2 23 a8 87  st    %o1, [%sp + 0x887]
    test_int32+0x8:   d0 03 a8 7f  ld    [%sp + 0x87f], %o0
    test_int32+0xc:   c2 03 a8 87  ld    [%sp + 0x887], %g1
    test_int32+0x10:  90 02 00 01  add   %o0, %g1, %o0
    test_int32+0x14:  81 c3 e0 08  retl
    test_int32+0x18:  91 3a 20 00  sra   %o0, 0x0, %o0
$ dis -F test_int64 test_int64
test_int64()
    test_int64:       d0 73 a8 7f  stx   %o0, [%sp + 0x87f]
    test_int64+0x4:   d2 73 a8 87  stx   %o1, [%sp + 0x887]
    test_int64+0x8:   d0 5b a8 7f  ldx   [%sp + 0x87f], %o0
    test_int64+0xc:   c2 5b a8 87  ldx   [%sp + 0x887], %g1
    test_int64+0x10:  81 c3 e0 08  retl
    test_int64+0x14:  90 02 00 01  add   %o0, %g1, %o0