Subject: bin/10095: systat vmstat displays invalid data on long-running 64-bit systems
To: None <gnats-bugs@gnats.netbsd.org>
From: None <mhitch@montana.edu>
List: netbsd-bugs
Date: 05/10/2000 20:18:13
>Number: 10095
>Category: bin
>Synopsis: systat vmstat displays invalid data on long-running 64-bit systems
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed May 10 20:19:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Michael L. Hitch
>Release: NetBSD-current as of May 6, 2000 <NetBSD-current source date>
>Organization:
Montana State University
>Environment:
System: NetBSD alpha.msu.montana.edu 1.4W NetBSD 1.4W (PC164) #17: Tue Apr 11 18:36:05 MDT 2000 mhitch@alpha.msu.montana.edu:/usr/cvsroot/src/sys/arch/alpha/compile/PC164 alpha
>Description:
After a 64-bit system such as an Alpha has been running long enough
for some counters to exceed 31 bits, the systat vmstat display begins
showing bogus data. The cause is a 32-bit variable used as a temporary
when computing the change since the previous display for a number of
items (the interrupt counters and the time spent in the CPU states,
for example). The variable is named t and is defined as a time_t.
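
The effect can be reproduced outside systat. The following is a minimal
sketch, not the code from vmstat.c: the function names are hypothetical,
and it assumes the temporary is used to carry the current sample over to
the next refresh, which is one way a 32-bit temporary can corrupt a
64-bit difference.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the buggy path: the 64-bit sample is saved
 * through a 32-bit temporary, so only its low 32 bits survive.
 * Converting an out-of-range value to int32_t is implementation-defined
 * in C; on common two's-complement systems it wraps.
 */
static int64_t
save_through_32bit(int64_t sample)
{
	int32_t t = (int32_t)(uint32_t)sample;

	return (int64_t)t;
}

/* The fixed path: the temporary is as wide as the counter. */
static int64_t
save_through_64bit(int64_t sample)
{
	int64_t t = sample;

	return t;
}

/* Change since the previously saved sample. */
static int64_t
counter_delta(int64_t now, int64_t saved_prev)
{
	return now - saved_prev;
}
```

With a previous sample of 4294968296 (2^32 + 1000) and a current sample
512 higher, counter_delta() returns 512 when the previous sample was
saved with save_through_64bit(), and 4294967808 when it was saved with
save_through_32bit().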
There is a second potential for data loss in copyinfo(), which uses
sizeof(int) to compute the size of an array of long items, so only
part of the array is copied where long is wider than int. It did not
cause a problem on my AlphaPC 164 because the interrupt count array
has enough entries beyond the highest active interrupt that the
shortened copy still covers every active counter.
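
The size mismatch can be shown in isolation. This is a minimal sketch
with hypothetical helper names, not the copyinfo() code itself. On an
LP64 system such as the Alpha, sizeof(int) is 4 and sizeof(long) is 8,
so the first helper copies only the first half of the elements.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper showing the bug: the byte count is computed
 * with sizeof(int) although the elements are long.
 */
static void
copy_counts_int_sized(long *dst, const long *src, size_t n)
{
	memmove(dst, src, n * sizeof(int));
}

/*
 * Hypothetical helper showing the fix: the byte count is derived
 * from the element type, so it stays correct if the type changes.
 */
static void
copy_counts_elem_sized(long *dst, const long *src, size_t n)
{
	memmove(dst, src, n * sizeof(*src));
}
```

Where int and long have the same size, both helpers copy the whole
array, which is why the bug goes unnoticed on 32-bit systems.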
>How-To-Repeat:
Run an Alpha system for several weeks with high network and disk
activity. Wonder why the system shows a constant 512 interrupts
per second for both Ethernet interfaces, and a constant 50% system
time, and no disk transfers when it's very obvious the disk is
quite busy.
>Fix:
This patch declares a local variable t of type long, so that the
value differences are calculated in a long instead of a time_t. The
patch also uses sizeof(long) instead of sizeof(int) in copyinfo()
when copying the array of interrupt counts.
It might be better to give the temporary in the macros that calculate
the display values a name other than t, to avoid confusion with the
static variable t.
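
One possible shape for such a macro is sketched below. This is an
illustration only: the macro name, its arguments, and the name of the
temporary are all hypothetical and are not taken from vmstat.c. The
temporary is declared inside the macro with a distinctive name, so it
can no longer pick up whichever t happens to be in scope.

```c
/*
 * Hypothetical macro: replace cur with the change since prev, then
 * remember the new sample in prev. Both arguments must be lvalues of
 * type long and are evaluated more than once.
 */
#define DELTA_AND_SAVE(cur, prev)				\
	do {							\
		long delta_save_tmp_ = (cur);			\
		(cur) -= (prev);				\
		(prev) = delta_save_tmp_;			\
	} while (0)

/* Hypothetical wrapper showing how the macro would be used. */
static long
update_counter(long *cur, long *prev)
{
	DELTA_AND_SAVE(*cur, *prev);
	return *cur;
}
```

After update_counter() returns, *cur holds the difference for display
and *prev holds the sample to compare against on the next refresh.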
--- /opt/src/usr.bin/systat/vmstat.c Sat Jan 22 05:46:43 2000
+++ ./vmstat.c Wed May 10 20:45:17 2000
@@ -349,6 +349,7 @@
 	int psiz, inttotal;
 	int i, l, c;
 	static int failcnt = 0;
+	long t;
 	if (state == TIME)
 		dkswap();
@@ -647,7 +648,7 @@
 	intrcnt = to->intrcnt;
 	*to = *from;
-	memmove(to->intrcnt = intrcnt, from->intrcnt, nintr * sizeof (int));
+	memmove(to->intrcnt = intrcnt, from->intrcnt, nintr * sizeof (long));
 }
 static void
>Release-Note:
>Audit-Trail:
>Unformatted: