tech-userlevel archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: long double losing mantissa bits
> Date: Tue, 19 Jan 2021 22:10:56 +0100
> From: Rhialto <rhialto%falu.nl@localhost>
>
> The loss of precision probably occurs at the line
>
> // The following big literal is 2 to the 56th power:
> ufrac = (uint64_t) (frac * 72057594037927936.0);
>
> where frac is in the range [ 0.5 , 1.0 >, so I would expect that
> multiplying with 2**56 is perfectly feasable (just changes the
> exponent). Debugging output shows that differences in the lsbits that
> were detectable before (when printed with %La), were no longer after.
In the following test case:
input: 72057594037927937 -> scanf: 72057594037927937.000000
frac: 0.500000 0x8.00000000000008p-4 sexp: 57
uexp: b9
ufrac: 0080000000000000
56 : 00ffffffffffffff
Unexpected result: 5c80 0000 0000 0000
expected : 5c80 0000 0000 0001
056200: sign: 0 uexp: b9 ufrac: 00 0000 0000 0000
Here frac = 0x8.00000000000008p-4 = (1 + 2^56)/2^57, so
frac*72057594037927936 = frac * 2^56 = 0x8.00000000000008p52 =
0x80000000000000.8p52.
This is not an integer, so conversion to uint64_t rounds it to
nearest, with ties to even, so you get ufrac = 0x80000000000000 =
36028797018963968 as shown in the result.
It looks like you have an off-by-one error in your exponent handling.
If you want to scale the fractional part into an integer, you need to
multiply by 2^57, not by 2^56.
Home |
Main Index |
Thread Index |
Old Index