Subject: Re: Unpredictable reboots.
To: None <port-i386@NetBSD.org>
From: Peter Seebach <seebs@plethora.net>
List: port-i386
Date: 03/05/2005 08:24:48
In message <d0c1op$oci$1@colwyn.zhadum.de>, Matthias Scheler writes:
>In article <200503041442.j24EgmgM000055@guild.plethora.net>,
> seebs@plethora.net (Peter Seebach) writes:
>> Anyone else seeing anything like this?
>
>No:
Interesting. I put in a new power supply, just for luck, and went back to a
DIAGNOSTIC kernel. This morning, it was wedged at KASSERT(to_ticks >= 0).
The backtrace showed the current process was sendmail, and the call path went
through tcp_output.
I then saw the exact same panic FIVE more times. Always within about ten
seconds of starting sendmail.
So, I did the obvious thing: built a new kernel which prints to_ticks and sets
it to zero.
It's hit that 31 times in 11 minutes.
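For anyone wanting to try the same kind of workaround, here is a minimal
userland sketch of the idea, not the actual kernel change. The function name
clamp_to_ticks and the counter are made up for illustration, and it assumes
the value is only zeroed when it is negative:

```c
#include <stdio.h>

/* Hypothetical counter: how many negative values have been seen. */
unsigned long neg_to_ticks_seen;

/*
 * Return a tick count that is safe to hand to a timer.  Negative
 * values are reported on stderr and replaced with zero; anything
 * else is passed through unchanged.
 */
int
clamp_to_ticks(int to_ticks)
{
	if (to_ticks < 0) {
		neg_to_ticks_seen++;
		fprintf(stderr, "to_ticks: %d\n", to_ticks);
		return 0;
	}
	return to_ticks;
}
```

This only hides the symptom: whatever computed the negative value is still
wrong, the timer just fires immediately instead of tripping the assertion.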
I think it's safe to say I can reproduce this. What this doesn't leave me
with is any clue how to fix it, or get better debugging info. But at least I
have the machine running again in the meantime. Of course, as you'd expect,
this is a production server, and I can't make it happen on anything else.
5 to_ticks: -150
6 to_ticks: -200
6 to_ticks: -250
9 to_ticks: -300
3 to_ticks: -350
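One thing the logged values do show is that they all sit on a fixed step. A
small sketch that computes the step from the five distinct values above (the
helper names are made up for illustration):

```c
#include <stdlib.h>

/* The distinct to_ticks values from the log above. */
const int observed[] = { -150, -200, -250, -300, -350 };

/* Greatest common divisor of two non-negative integers. */
int
gcd_int(int a, int b)
{
	while (b != 0) {
		int t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Common step size of n logged values (signs are ignored). */
int
common_step(const int *vals, int n)
{
	int g = 0;

	for (int i = 0; i < n; i++)
		g = gcd_int(g, abs(vals[i]));
	return g;
}
```

For these values the step comes out as 50, so every value is a small negative
integer (-3 through -7) times 50. If the kernel runs with hz at 100, which is
an assumption here, 50 ticks would be half a second, which might point at a
caller that scales a negative count by a fixed factor.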
Not sure if the distribution of times means anything. I suppose next up would
be adding stack traces and adding debugging code to whatever's calling this.
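Short of full stack traces, one cheap way to find the caller is to record the
call site at the point where the value is checked. A userland sketch of that
idea follows; checked_ticks, CHECKED_TICKS and fake_output_path are all
hypothetical names, and a real kernel version would use its own printf and
wrap whatever routine actually arms the timer:

```c
#include <stdio.h>

/* Last call site that passed a negative tick count. */
const char *last_bad_func;
int last_bad_line;

/*
 * Check a tick count on behalf of a caller.  A negative value is
 * logged together with the caller's function name and line number,
 * then replaced with zero.
 */
int
checked_ticks(int to_ticks, const char *func, int line)
{
	if (to_ticks < 0) {
		last_bad_func = func;
		last_bad_line = line;
		fprintf(stderr, "to_ticks %d from %s:%d\n",
		    to_ticks, func, line);
		return 0;
	}
	return to_ticks;
}

/* Capture the call site automatically at each use. */
#define CHECKED_TICKS(t) checked_ticks((t), __func__, __LINE__)

/* Stand-in for a real caller, such as a protocol output routine. */
int
fake_output_path(int nticks)
{
	return CHECKED_TICKS(nticks);
}
```

Each negative value then shows up in the log next to the function that
produced it, which narrows down where to put further debugging code.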
And yes, I applied the patches indicated in the PR (29134).
-s