Port-sparc64 archive


Re: Sizeof int_fastN_t vs uint_fastN_t data types



On Mon, 18 Nov 2024 10:54:45 +0100
Martin Husemann <martin%duskware.de@localhost> wrote:

> On Mon, Nov 18, 2024 at 09:00:59AM +0000, Sad Clouds wrote:
> > I would expect discrepancy between OSes, e.g. Linux aarch64 defines
> > fast versions for 16 and 32 bit integers at 8 bytes, NetBSD aarch64
> > defines fast versions for 16 and 32 bit integers at 4 bytes. However,
> > the same OS defining different sizes for signed and unsigned fast
> > integers seems a bit strange. Is there really that much overhead for
> > signed integer arithmetic on sparc64?
> 
> Registers are 8 byte and sign extension is (relatively) expensive.
> There are only native arithmetic instructions for 4 byte and 8 byte
> values (and the 4 byte versions may still need sign extension at the
> end).
> 
> You can try and look at the differences in the assembler code generated
> for _fast vs. non-fast variants of the same code. Compilers have gotten
> incredibly smart and I would expect the optimized code to be not much
> different in most cases, but it would be interesting to compare that for some
> real-world cases where the _fast variants are used.
> 
> Martin

Hi, thanks for the info. I think you're right about there being one set
of 8-byte registers. However, Solaris and NetBSD appear to define the
fast types differently on sparc64, which is what got me wondering...

Solaris:
--- Sizes Of Integral Types In Bytes ---
Data type               Exact    Least    Fast    
--------------------- : ---------------------------
int8_t/uint8_t        : 1/1      1/1      1/1     
int16_t/uint16_t      : 2/2      2/2      4/4     
int32_t/uint32_t      : 4/4      4/4      4/4     
int64_t/uint64_t      : 8/8      8/8      8/8     

NetBSD:
--- Sizes Of Integral Types In Bytes ---
Data type               Exact    Least    Fast    
--------------------- : ---------------------------
int8_t/uint8_t        : 1/1      1/1      8/1     
int16_t/uint16_t      : 2/2      2/2      8/2     
int32_t/uint32_t      : 4/4      4/4      8/4     
int64_t/uint64_t      : 8/8      8/8      8/8     
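A table like the two above can be produced with a few sizeof expressions.
The following is only a minimal sketch, not the program that generated
the output shown here: the names PRINT_ROW, print_size_table and
fast_size_mismatches are made up for illustration, and it assumes the
optional exact-width types exist on the platform.

```c
#include <stdint.h>
#include <stdio.h>

/* Print one row: signed/unsigned sizes of the exact, least and fast types.
 * PRINT_ROW is a hypothetical helper, not taken from the original program. */
#define PRINT_ROW(out, N)                                              \
    fprintf((out), "%-21s : %zu/%zu      %zu/%zu      %zu/%zu\n",      \
            "int" #N "_t/uint" #N "_t",                                \
            sizeof(int##N##_t), sizeof(uint##N##_t),                   \
            sizeof(int_least##N##_t), sizeof(uint_least##N##_t),       \
            sizeof(int_fast##N##_t), sizeof(uint_fast##N##_t))

/* Print the whole table for the 8, 16, 32 and 64 bit types. */
void print_size_table(FILE *out)
{
    fprintf(out, "--- Sizes Of Integral Types In Bytes ---\n");
    fprintf(out, "Data type               Exact    Least    Fast\n");
    PRINT_ROW(out, 8);
    PRINT_ROW(out, 16);
    PRINT_ROW(out, 32);
    PRINT_ROW(out, 64);
}

/* Count the widths where the signed and unsigned fast types differ in size. */
int fast_size_mismatches(void)
{
    int n = 0;
    n += (sizeof(int_fast8_t)  != sizeof(uint_fast8_t));
    n += (sizeof(int_fast16_t) != sizeof(uint_fast16_t));
    n += (sizeof(int_fast32_t) != sizeof(uint_fast32_t));
    n += (sizeof(int_fast64_t) != sizeof(uint_fast64_t));
    return n;
}
```

Calling print_size_table(stdout) from main prints the table. Going by the
numbers above, fast_size_mismatches() would be expected to return 3 on
NetBSD/sparc64 and 0 on Solaris.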

Looking at the disassembly below, the 32-bit function has one extra
SRA instruction, which sign-extends the result. This may make it slightly
slower than the 64-bit version, so it looks like the NetBSD int_fastN_t
definitions may be correct.
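To make the role of that SRA concrete: as far as I understand it, an SRA
with a shift count of 0 leaves the low 32 bits alone and copies bit 31
into bits 32..63, so a 32-bit result held in a 64-bit register is
sign-extended again after the full-width add. Below is a rough C model of
that step. The function names are invented for illustration, and this is
a sketch of the idea rather than a description of what the compiler does
internally.

```c
#include <stdint.h>

/* Model of "sra reg, 0, reg": keep the low 32 bits and copy bit 31
 * into bits 32..63 (hypothetical helper, for illustration only). */
int64_t sign_extend_32(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;

    if (low & UINT32_C(0x80000000))
        return (int64_t)low - INT64_C(0x100000000);
    return (int64_t)low;
}

/* Model of a signed 32-bit add carried out in a 64-bit register:
 * a full-width add followed by the sign extension step. */
int64_t add32_in_64bit_reg(int32_t a, int32_t b)
{
    /* Unsigned arithmetic is used so that wraparound is well defined. */
    uint64_t raw = (uint64_t)(int64_t)a + (uint64_t)(int64_t)b;

    return sign_extend_32(raw);
}
```

With 64-bit operands the add already produces the final register value,
which is why no such fix-up step is needed in that case.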

$ cat test_int32.c
#include <stdio.h>
#include <stdint.h>

int32_t test_int32(volatile int32_t n1, volatile int32_t n2)
{
        return n1 + n2;
}

int main(void)
{
        printf("%d\n", test_int32(1, 1));
}

$ cat test_int64.c
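#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>

int64_t test_int64(volatile int64_t n1, volatile int64_t n2)
{
        return n1 + n2;
}

int main(void)
{
        /* PRId64 is the matching format for int64_t; plain %d expects int */
        printf("%" PRId64 "\n", test_int64(1, 1));
}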


$ gcc -m64 -O2 -o test_int32 test_int32.c
$ gcc -m64 -O2 -o test_int64 test_int64.c

$ dis -F test_int32 test_int32
test_int32()
    test_int32:             d0 23 a8 7f  st        %o0, [%sp + 0x87f]
    test_int32+0x4:         d2 23 a8 87  st        %o1, [%sp + 0x887]
    test_int32+0x8:         d0 03 a8 7f  ld        [%sp + 0x87f], %o0
    test_int32+0xc:         c2 03 a8 87  ld        [%sp + 0x887], %g1
    test_int32+0x10:        90 02 00 01  add       %o0, %g1, %o0
    test_int32+0x14:        81 c3 e0 08  retl
    test_int32+0x18:        91 3a 20 00  sra       %o0, 0x0, %o0

$ dis -F test_int64 test_int64
test_int64()
    test_int64:             d0 73 a8 7f  stx       %o0, [%sp + 0x87f]
    test_int64+0x4:         d2 73 a8 87  stx       %o1, [%sp + 0x887]
    test_int64+0x8:         d0 5b a8 7f  ldx       [%sp + 0x87f], %o0
    test_int64+0xc:         c2 5b a8 87  ldx       [%sp + 0x887], %g1
    test_int64+0x10:        81 c3 e0 08  retl
    test_int64+0x14:        90 02 00 01  add       %o0, %g1, %o0
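To follow up on Martin's suggestion of comparing _fast and non-fast
variants of the same code, a pair of functions like the one below could be
compiled with "gcc -m64 -O2 -S" and the two assembly listings compared.
This is only a sketch of such an experiment. The function names are made
up, and no results are claimed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Accumulate in an exact-width 32-bit integer. */
int32_t sum_exact(const int32_t *v, size_t n)
{
    int32_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Same loop, but accumulate in int_fast32_t, which is 8 bytes on
 * NetBSD/sparc64 according to the table above. */
int_fast32_t sum_fast(const int32_t *v, size_t n)
{
    int_fast32_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Both functions return the same value as long as the sum fits in 32 bits,
so any difference in the generated code comes from the choice of
accumulator type.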



