Subject: vmware suddenly crashing the system...
To: None <port-i386@netbsd.org>
From: Steve Bellovin <smb@research.att.com>
List: port-i386
Date: 07/31/2001 14:30:13
vmware has suddenly started hanging or crashing my machine. I had been
running 1.5.1b2; in a vain attempt to solve the problem, I upgraded
this morning to the latest kernel in the 1.5 branch, which identifies
itself as 1.5.2_ALPHA.
The symptom is that most user-level programs (including my window
manager) go non-responsive when vmware is trying to sync its redo file
(I use undoable virtual disks). ssh from another host hung, too, but I
could get a response from 'ntpq'. Sometimes -- but not always -- I've
seen TCP connection attempts from outside stay in SYN_SENT state (i.e.,
it never got an answer from an interrupt-level function on the hung
machine), but ping succeeded. I've *never* seen that happen before.
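For anyone trying to reproduce the distinction above from a second host, here is
a minimal sketch that classifies a single TCP connection attempt. It is not part
of the original report: the function name probe_tcp and the 5-second timeout are
illustrative assumptions. A "timeout" result corresponds to a SYN that was never
answered, while "refused" means the remote kernel did respond (with a RST).

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify one TCP connection attempt as 'open', 'refused', or 'timeout'.

    Hypothetical helper for illustration only. Other failures (for example,
    an unreachable network) are not classified and propagate as OSError.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"        # three-way handshake completed
    except socket.timeout:
        return "timeout"     # SYN sent, no answer within the timeout
    except ConnectionRefusedError:
        return "refused"     # remote kernel answered with a RST
    finally:
        sock.close()
```

Running probe_tcp("berkshire", 22) from another machine while vmware is syncing,
alongside a plain ping, would show whether the hung host still answers ICMP
while leaving TCP SYNs unanswered. The host name and port here are only examples.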
At least once, the machine panicked, leaving behind the following
message on reboot:
Jul 31 10:56:45 berkshire savecore: reboot after panic: uvm_pagedeactivate: caller did not check wire count
Jul 31 10:56:45 berkshire savecore: no dump, not enough free space in /var/crash
I moved /var/crash to /usr, but (of course) I haven't seen that failure
since then.
The only substantive thing I changed between when vmware had been
working and when it started failing is that I removed a 128M SIMM. I
had been getting sig11 during gcc compilations, which (according to
many folks) indicates memory errors. I've seen no such failures since
removing the SIMM -- which still leaves me with 256M -- but that could
mean that the bad spot is now somewhere in the kernel. (The IBM
diagnostics haven't found anything wrong...)
Any suggestions would be greatly appreciated.
--Steve Bellovin, http://www.research.att.com/~smb