NetBSD-Bugs archive
port-arm/52603: arm(v7?) vfp register corruption
>Number: 52603
>Category: port-arm
>Synopsis: arm(v7?) vfp register corruption
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-arm-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 08 16:20:00 +0000 2017
>Originator: Manuel Bouyer
>Release: NetBSD 8.0_BETA
>Organization:
>Environment:
System: NetBSD chartplotter 8.0_BETA NetBSD 8.0_BETA (CHARTPLOTTER) #1: Sat Sep 9 13:55:40 CEST 2017 bouyer%bop.soc.lip6.fr@localhost:/dsk/l1/misc/bouyer/tmp/earmv7hf/obj/dsk/l1/misc/bouyer/netbsd-8/src/sys/arch/evbarm/compile/CHARTPLOTTER evbarm
Architecture: earmv7hf
Machine: evbarm
>Description:
Running pkgsrc/geography/opencpn on an Olimex Lime2 and a Cubieboard2
(both Allwinner A20), I got evidence of occasional
floating-point register corruption (a printf at a strategic point
shows that a variable computed from other values a few lines before
has the wrong value).
I then tried this test program:
cat > fptest.c << EOF
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
#include <sys/time.h>
#define NREGS 32
void do_test(int);
void do_wait(int);
double foo = 0;
void
do_wait(int id) {
	struct timeval start, now;
	if (id != 0)
		return;
	// sleep(1);
	gettimeofday(&start, NULL);
	while (1) {
		if (id == 0) {
			foo = foo * 0.1;
		} else {
			int i;
			for (i = 0; i < 1000000000; i++) {
				if (id == 0)
					;
			}
		}
		gettimeofday(&now, NULL);
		if (now.tv_sec - start.tv_sec > 1)
			return;
	}
}
void
do_test(int id)
{
	double *src = malloc(sizeof(double) * NREGS);
	double *dst = malloc(sizeof(double) * NREGS);
	int i;
	printf("start job %d for %d registers\n", id, NREGS);
	for (i = 0; i < NREGS; i++) {
		src[i] = id * 100 + i * 1.1;
	}
	while (1) {
		foo = foo * 0.1;
		__asm __volatile("vldmia %0, {d0-d15}" :: "r" (src) : "memory");
#if NREGS > 16
		__asm __volatile("vldmia %0, {d16-d31}" :: "r" (&src[16]) : "memory");
#endif
		memset(dst, 0, sizeof(double) * NREGS);
		do_wait(id);
		__asm __volatile("vstmia %0, {d0-d15}" :: "r" (dst) : "memory");
#if NREGS > 16
		__asm __volatile("vstmia\t%0, {d16-d31}" :: "r" (&dst[16]) : "memory");
#endif
		if (id == 0)
			continue;
		for (i = 0; i < NREGS; i++) {
			double v = id * 100 + i * 1.1;
			if (dst[i] != v) {
				printf("%d: %lf %lf %lf\n", i, src[i], dst[i], v);
			}
		}
	}
}
int
main(int argc, const char *argv[])
{
	int n;
	int i;
	if (argc != 2) {
		errx(1, "usage: fptest <n>");
	}
	n = atoi(argv[1]);
	for (i = 1; i < n; i++) {
		switch(fork()) {
		case -1:
			err(1, "fork");
		case 0:
			do_test(i);
			exit(0);
		default:
			break;
		}
	}
	do_test(0);
	exit(0);
}
EOF
compile with
gcc -g -mfpu=neon-vfpv4 -o fptest fptest.c
running
./fptest 2
in parallel with opencpn, after a few hours I got
0: 100.000000 100.000000 28.000000
and then, less than a day later:
0: 100.000000 11.672723 100.000000
1: 101.100000 16.850291 101.100000
2: 102.200000 6.424029 102.200000
3: 103.300000 16.679222 103.300000
4: 104.400000 255.000000 104.400000
5: 105.500000 255.000000 105.500000
6: 106.600000 255.000000 106.600000
7: 107.700000 0.002048 107.700000
8: 108.800000 0.087582 108.800000
9: 109.900000 0.500000 109.900000
so we have rare but obvious vfp register corruption.
I suspect it's related to an interrupt occurring at the wrong
time, but couldn't track it down further than that.
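The three columns after the register index in the output above are, per the
printf in do_test(), src[i], dst[i] and the recomputed v. One possible way to
read a mismatch line is sketched below. This is not part of fptest.c: the enum
names and classify() are hypothetical, and the interpretation assumes that
src[] in memory is intact.

```c
#include <assert.h>

/* Hypothetical labels for one mismatch line; not part of fptest.c. */
enum fp_fault {
	FP_OK,			/* src, dst and v all agree */
	FP_BAD_SAVED,		/* dst != src: the register changed between vldmia and vstmia */
	FP_BAD_EXPECTED		/* dst == src but v differs: the recomputation of v went wrong */
};

/* Classify one "i: src dst v" line using the same exact comparison as fptest.c. */
enum fp_fault
classify(double src, double dst, double v)
{
	/* fptest.c never prints NaN patterns, so treat them as a caller error. */
	assert(src == src);
	if (dst != src)
		return FP_BAD_SAVED;
	if (dst != v)
		return FP_BAD_EXPECTED;
	return FP_OK;
}
```

Under this reading, the first report (100.000000 100.000000 28.000000) would
be FP_BAD_EXPECTED: the stored registers matched what was loaded, and the
value computed for the comparison was off. The later lines would be
FP_BAD_SAVED.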
>How-To-Repeat:
See above. If you're not running opencpn, you may need to run
another heavy FP application, or start more than 2 processes when
invoking the test program.
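When several fptest processes run at once, it may help to check whether a
corrupt value belongs to the pattern of another fptest process, since each
process fills its registers with id * 100 + i * 1.1. The helper below is only
a sketch: pattern_owner() and its tolerance are assumptions, not part of
fptest.c, which compares values exactly.

```c
#include <assert.h>

/* Same formula fptest.c uses to fill src[] for process `id`, register `i`. */
static double
pattern(int id, int i)
{
	return id * 100 + i * 1.1;
}

/*
 * Return the id (0 .. nprocs-1) whose register pattern contains `value`,
 * or -1 if no fptest process would have loaded it.
 * A small tolerance is used here (an assumption of this sketch) because
 * values copied from printed output are rounded to six decimals.
 */
int
pattern_owner(double value, int nprocs, int nregs)
{
	int id, i;

	assert(nprocs >= 0 && nregs >= 0);
	for (id = 0; id < nprocs; id++) {
		for (i = 0; i < nregs; i++) {
			double d = pattern(id, i) - value;
			if (d < 0)
				d = -d;
			if (d < 1e-6)
				return id;
		}
	}
	return -1;
}
```

For example, with 4 processes and 32 registers, 104.4 maps to process 1,
while 255.0 and 0.5 from the second report map to no fptest process, which
would be consistent with those values coming from some other program's FP
state.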
>Fix:
please ...