NetBSD-Users archive
Random lockups on an email server - possibly kern/50168
I have a server farm at my small ISP running NetBSD 7.0 and pf. All
the servers seem to be rock solid except the email server, which has
random lockups. The system is still running, as it responds to pings,
and in fact if I am running screen I can switch between the different
screens. However, none of them will run anything, and even a simple
carriage return will not display a new prompt.
It sounds like kern/50168 (Frequent lockups and panics with NetBSD
7/amd64, may be ipfilter-related) but I run pf, not ipf.
I have a little script that captures memory usage every minute and
stores it in a log. It writes the time followed by MemTotal, MemFree,
MemShared, SwapTotal, SwapFree, Cached and Buffers from /proc/meminfo.
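
For readers who want to collect the same data, here is a minimal sketch
of such a logger in Python. It is not the script used on this server.
It assumes /proc/meminfo provides Linux-style "Name: value kB" lines and
that the logged numbers are whole megabytes (kB divided by 1024); the
names FIELDS, parse_meminfo and format_entry are made up for this
example.

```python
import time

# Column order as described in the post.
FIELDS = ("MemTotal", "MemFree", "MemShared",
          "SwapTotal", "SwapFree", "Cached", "Buffers")


def parse_meminfo(text):
    """Return {name: kilobytes} for every 'Name: <digits> kB' line.

    Summary lines that do not match this shape (for example a
    multi-column 'Mem:' row) are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_entry(text, now=None):
    """Build one log line: timestamp followed by the fields in MB.

    Assumption: values are converted from kB to whole MB; a field that
    is missing from the input is logged as 0.
    """
    values = parse_meminfo(text)
    stamp = time.asctime(time.localtime(now))
    columns = [str(values.get(field, 0) // 1024) for field in FIELDS]
    return " ".join([stamp] + columns)


def log_once(path="/proc/meminfo"):
    """Read the meminfo file and return a single formatted log line."""
    with open(path) as handle:
        return format_entry(handle.read())
```

Calling print(log_once()) from a once-a-minute cron job, with the
output appended to a file, would give a log of the same shape as the
one shown in this post.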
Here's what it looked like when it hung and was rebooted. The gap
between 13:59 and 14:36 is the hang and the reboot.
Wed Mar 16 13:39:00 2016 31806 721 0 32787 32787 27565 25744
Wed Mar 16 13:40:00 2016 31806 718 0 32787 32787 27568 25744
Wed Mar 16 13:41:00 2016 31806 739 0 32787 32787 27549 25746
Wed Mar 16 13:42:00 2016 31806 733 0 32787 32787 27555 25748
Wed Mar 16 13:43:00 2016 31806 763 0 32787 32787 27528 25754
Wed Mar 16 13:44:00 2016 31806 720 0 32787 32787 27568 25754
Wed Mar 16 13:45:00 2016 31806 696 0 32787 32787 27591 25756
Wed Mar 16 13:46:00 2016 31806 718 0 32787 32787 27569 25755
Wed Mar 16 13:47:00 2016 31806 721 0 32787 32787 27566 25752
Wed Mar 16 13:48:01 2016 31806 736 0 32787 32787 27552 25756
Wed Mar 16 13:49:00 2016 31806 794 0 32787 32787 27497 25756
Wed Mar 16 13:50:00 2016 31806 819 0 32787 32787 27471 25755
Wed Mar 16 13:51:00 2016 31806 834 0 32787 32787 27457 25754
Wed Mar 16 13:52:00 2016 31806 830 0 32787 32787 27461 25754
Wed Mar 16 13:53:00 2016 31806 836 0 32787 32787 27456 25754
Wed Mar 16 13:54:00 2016 31806 842 0 32787 32787 27450 25755
Wed Mar 16 13:55:01 2016 31806 827 0 32787 32787 27465 25754
Wed Mar 16 13:56:00 2016 31806 66 0 32787 32787 28227 26540
Wed Mar 16 13:57:00 2016 31806 83 0 32787 32787 28207 26476
Wed Mar 16 13:58:00 2016 31806 48 0 32787 32787 28243 26479
Wed Mar 16 13:59:00 2016 31806 75 0 32787 32787 28215 26480
Wed Mar 16 14:36:01 2016 31806 31067 0 32787 32787 475 98
Wed Mar 16 14:37:00 2016 31806 30733 0 32787 32787 745 135
Wed Mar 16 14:38:00 2016 31806 30644 0 32787 32787 821 163
Wed Mar 16 14:39:00 2016 31806 30542 0 32787 32787 915 187
Wed Mar 16 14:40:00 2016 31806 30450 0 32787 32787 993 211
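
A log like the one above can be scanned automatically for hangs and for
sudden drops in free memory. The following Python sketch is one way to
do that, not part of the original setup. It assumes each line is a
five-token timestamp followed by integer columns with MemFree as the
second column; the names parse_entry, find_gaps and find_free_drops,
the two-minute gap threshold and the 0.5 drop ratio are all
illustrative choices.

```python
from datetime import datetime, timedelta


def parse_entry(line):
    """Split a log line into (timestamp, list of integer columns)."""
    parts = line.split()
    stamp = datetime.strptime(" ".join(parts[:5]), "%a %b %d %H:%M:%S %Y")
    return stamp, [int(p) for p in parts[5:]]


def find_gaps(lines, max_gap=timedelta(minutes=2)):
    """Return (last_before, first_after) pairs where logging stalled."""
    gaps = []
    prev = None
    for line in lines:
        if not line.strip():
            continue
        stamp, _ = parse_entry(line)
        if prev is not None and stamp - prev > max_gap:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps


def find_free_drops(lines, ratio=0.5):
    """Return timestamps where MemFree fell below ratio * previous value."""
    drops = []
    prev_free = None
    for line in lines:
        if not line.strip():
            continue
        stamp, cols = parse_entry(line)
        free = cols[1]  # assumed column order: MemTotal, MemFree, ...
        if prev_free is not None and free < prev_free * ratio:
            drops.append(stamp)
        prev_free = free
    return drops
```

Run against the entries above, find_gaps reports the interval from
13:59:00 to 14:36:01, and find_free_drops flags 13:56:00, where MemFree
falls from 827 to 66.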
I could turn off pf, but it could be weeks before another hang happens.
I am considering rebooting on a regular basis (early Sunday morning is
what I had in mind) to see if that makes it more reliable, but I have
no indication that this is uptime related.
I also have a "top -osize" running in one of the screens. Since I can
still switch screens, I am hoping that it will show me the culprit if
this is a runaway process.
Can anyone suggest any other avenues to investigate?
--
D'Arcy J.M. Cain <darcy%NetBSD.org@localhost>
http://www.NetBSD.org/ IM:darcy%Vex.Net@localhost