toolchain/50940: cc -ffast-math fails to disable denormals on amd64

To: toolchain-manager%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost
Subject: toolchain/50940: cc -ffast-math fails to disable denormals on amd64
From: gson%gson.org@localhost (Andreas Gustafsson)
Date: Fri, 11 Mar 2016 14:45:00 +0000 (UTC)

>Number:         50940
>Category:       toolchain
>Synopsis:       cc -ffast-math fails to disable denormals on amd64
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    toolchain-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 11 14:45:00 +0000 2016
>Originator:     Andreas Gustafsson
>Release:        NetBSD 7.0
>Organization:

>Environment:
System: NetBSD 7.0
Architecture: x86_64
Machine: amd64
>Description:

When the -ffast-math option is enabled, gcc generates x86_64 floating
point code that assumes that the FTZ (Flush To Zero) and DAZ
(Denormals Are Zero) bits in the MXCSR register are set.

On at least some non-NetBSD platforms, -ffast-math also causes the
program to link in a runtime support module "crtfastmath.o" that sets
FTZ and DAZ, but this does not happen on NetBSD.  As a result, the
generated code that is relying on FTZ and DAZ being set can yield
incorrect results for denormal input.

The specific issue I ran into was calculating an array of square roots
of single-precision floats.  For this, gcc -O3 -ffast-math managed to
generate vectorized SSE code using the rsqrtps instruction followed by
a Newton-Raphson step to calculate four square roots in one go.  This
code returns incorrect results (-infinity rather than zero) for denormal
inputs when DAZ is not set.

I have not checked if NetBSD ports other than amd64 are also affected.

A work-around is to execute the following early in the program, before
doing floating-point math:

    fenv_t fe;
    fegetenv(&fe);
    fe.mxcsr |= 0x8040; // Set DAZ and FTZ
    fesetenv(&fe);

>How-To-Repeat:

Compile the following test program on a NetBSD/amd64 7.0 system with

   cc -O3 -ffast-math test_denormal.c -lm -o test_denormal

and run it:

#include <math.h>
#include <stdlib.h>
#include <stdio.h>

#define N 4 // Vector size

void test() {
    int i;
    float a[N], b[N];
    for (i = 0; i < N; i++) {
	a[i] = pow(10, -36 - i);
    }
    for (i = 0; i < N; i++) {
	b[i] = sqrtf(a[i]);
    }
    for (i = 0; i < N; i++) {
	printf("sqrt(%g) = %g\n", a[i], b[i]);
    }
    if (isinf(b[3])) {
	printf("FAIL: square root of a small number returned infinity\n");
	exit(1);
    }
}

int main(int argc, char **argv) {
    test();
    printf("PASS\n");
    return 0;
}

Compiled without -ffast-math, the output is correct:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(1e-38) = 1e-19
  sqrt(1e-39) = 3.16228e-20
  PASS

Compiled with -ffast-math on a non-NetBSD system, the denormals are
flushed to zero as expected, and the results are now slightly less
accurate, but still within 1e-19 of the exact result:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(0) = 0
  sqrt(0) = 0
  PASS

Compiled on NetBSD/amd64 with "-O3 -ffast-math", you get the following
output, where the last two square roots are not even close:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(1e-38) = -inf
  sqrt(1e-39) = -inf
  FAIL: square root of a small number returned infinity

>Fix:

Follow-Ups:
- Re: toolchain/50940: cc -ffast-math fails to disable denormals on amd64
  - From: Joerg Sonnenberger

Prev by Date: Re: lib/50936: security/openssh segfaults on NetBSD-current
Next by Date: Re: lib/50698: hypotf() of small numbers yields infinity
Previous by Thread: toolchain/50939: Bug in GCC optionization causing i386 net-snmpd to crash
Next by Thread: Re: toolchain/50940: cc -ffast-math fails to disable denormals on amd64
Indexes:

Home | Main Index | Thread Index | Old Index