NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
toolchain/50940: cc -ffast-math fails to disable denormals on amd64
>Number: 50940
>Category: toolchain
>Synopsis: cc -ffast-math fails to disable denormals on amd64
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: toolchain-manager
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Mar 11 14:45:00 +0000 2016
>Originator: Andreas Gustafsson
>Release: NetBSD 7.0
>Organization:
>Environment:
System: NetBSD 7.0
Architecture: x86_64
Machine: amd64
>Description:
When the -ffast-math option is enabled, gcc generates x86_64 floating
point code that assumes that the FTZ (Flush To Zero) and DAZ
(Denormals Are Zero) bits in the MXCSR register are set.
On at least some non-NetBSD platforms, -ffast-math also causes the
program to link in a runtime support module "crtfastmath.o" that sets
FTZ and DAZ, but this does not happen on NetBSD. As a result, the
generated code that is relying on FTZ and DAZ being set can yield
incorrect results for denormal input.
The specific issue I ran into was calculating an array of square roots
of single-precision floats. For this, gcc -O3 -ffast-math managed to
generate vectorized SSE code using the rsqrtps instruction followed by
a Newton-Raphson step to calculate four square roots in one go. This
code returns incorrect results (-infinity rather than zero) for denormal
inputs when DAZ is not set.
I have not checked if NetBSD ports other than amd64 are also affected.
A work-around is to execute the following early in the program, before
doing floating-point math:
fenv_t fe;
fegetenv(&fe);
fe.mxcsr |= 0x8040; // Set DAZ and FTZ
fesetenv(&fe);
>How-To-Repeat:
Compile the following test program on a NetBSD/amd64 7.0 system with
cc -O3 -ffast-math test_denormal.c -lm -o test_denormal
and run it:
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#define N 4 // Vector size
void test() {
int i;
float a[N], b[N];
for (i = 0; i < N; i++) {
a[i] = pow(10, -36 - i);
}
for (i = 0; i < N; i++) {
b[i] = sqrtf(a[i]);
}
for (i = 0; i < N; i++) {
printf("sqrt(%g) = %g\n", a[i], b[i]);
}
if (isinf(b[3])) {
printf("FAIL: square root of a small number returned infinity\n");
exit(1);
}
}
int main(int argc, char **argv) {
test();
printf("PASS\n");
return 0;
}
Compiled without -ffast-math, the output is correct:
sqrt(1e-36) = 1e-18
sqrt(1e-37) = 3.16228e-19
sqrt(1e-38) = 1e-19
sqrt(1e-39) = 3.16228e-20
PASS
Compiled with -ffast-math on a non-NetBSD system, the denormals are
flushed to zero as expected, and the results are now slightly less
accurate, but still within 1e-19 of the exact result:
sqrt(1e-36) = 1e-18
sqrt(1e-37) = 3.16228e-19
sqrt(0) = 0
sqrt(0) = 0
PASS
Compiled on NetBSD/amd64 with "-O3 -ffast-math", you get the following
output, where the last two square roots are not even close:
sqrt(1e-36) = 1e-18
sqrt(1e-37) = 3.16228e-19
sqrt(1e-38) = -inf
sqrt(1e-39) = -inf
FAIL: square root of a small number returned infinity
>Fix:
Home |
Main Index |
Thread Index |
Old Index