Subject: Interesting panics on NetBSD/sparc 2.0 MP kernel.
To: None <port-sparc@netbsd.org>
From: Eric Schnoebelen <eric@cirr.com>
List: port-sparc
Date: 08/12/2004 11:01:57
Greetings all.
I'm trying to get my UUCP and mailing list server upgraded to
2.0_BETA (so I can make use of a pair of Ross 125's in the box.)
I'm currently running GENERIC.MP on a single-processor system (to
shake out bugs and the like).
I'm getting interesting panics each night, when the nightly report
runs. In particular, when find starts walking the filesystem, I
get a huge number of the following messages (on the order of 160
lines):
sd0(esp0:0:1:0): unable to allocate ecb
sd0(esp0:0:1:0): unable to allocate scsipi_xfer
sd0: not queued, error 12
followed by about 1100 lines of the following:
sd0(esp0:0:1:0): adapter resource shortage
sd0(esp0:0:1:0): unable to allocate ecb
and eventually panicking with the following:
dev = 0x700, block = 40500, fs = /
panic: blkfree: freeing free frag
syncing disks... panic: cpu0: stuck on lock@f0f2d960
Frame pointer is at 0xf030ce00
Call traceback:
pc = 0xf0282640 args = (0x1, 0x5, 0x0, 0x0, 0xf030cf20, 0x1, 0xf030ce68) fp = 0xf030ce68
pc = 0xf01a89b0 args = (0x104, 0x0, 0x126e242e, 0x35eb, 0xffff, 0x155830, 0xf030ced8) fp = 0xf030ced8
pc = 0xf000b050 args = (0xf000b058, 0x0, 0xf0f2d960, 0x1e8000e1, 0xf0369000, 0x104, 0xf030cf40) fp = 0xf030cf40
pc = 0xf01cac64 args = (0xf0f2d960, 0xff, 0xffffffff, 0xa1a3b, 0xda, 0x2fef, 0xf030cfa0) fp = 0xf030cfa0
pc = 0xf0262acc args = (0xf0f2d958, 0x27b4fd, 0x1000000, 0xf026bb30, 0xf85, 0x21009, 0xf030d008) fp = 0xf030d008
dumping to dev 7,1 offset 2050090
dump Async registers (mid 8): afsr=0<AFA=0>; afva=0x00
cpu0: NMI: system interrupts: 100c0000<VME=0,SBUS=0,SC,T,M>
memory error:
EFSR: 10002<DW=0,SYNDROME=0,ME>
MBus transaction: fc64d30<VAH=0,TYPE=3,SIZE=5,C,VA=19,S,MID=0>
address: 0x0f028e000
module location: ?
Type 'go' to resume
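For anyone reading the panic line above, here is a minimal sketch of how
"dev = 0x700" could be split into major and minor numbers. It assumes the
value uses the bit layout of NetBSD's major()/minor() macros; the Python
function names simply mirror those macros and are not taken from the log.

```python
# Sketch only: assumes the dev_t bit layout used by NetBSD's
# major()/minor() macros (major in bits 8-19, minor in the rest).

def major(dev):
    # Major number: bits 8 through 19.
    return (dev & 0x000FFF00) >> 8

def minor(dev):
    # Minor number: low 8 bits plus the bits above the major field.
    return ((dev & 0xFFF00000) >> 12) | (dev & 0x000000FF)

print(major(0x700), minor(0x700))  # prints "7 0"
```

Under that assumption the failing filesystem is on major 7, minor 0, the
same major number as the "dumping to dev 7,1" line, which would be
consistent with root and the dump partition being on the same disk.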
Now, I'm having a bit of a problem believing it's a memory error,
as the system was running NetBSD 1.6ZG (GENERIC) without a hiccup
for several weeks.
As I said, this machine is my mailing list and UUCP server. It's
got mimedefang configured and running, and mimedefang makes use of
clamd and spamassassin (installed from pkgsrc). And the mailing
list manager is a hacked version of majordomo, so it's running perl
quite a bit.
I'm going to back down to the GENERIC kernel, but I want to help
get GENERIC.MP fixed too. What else would be useful, and how
should I go about getting it? (My knowledge of OBP is limited.)
I've placed the console log for the system from Monday night at
ftp://ftp.cirr.com/pub/NetBSD/crash/ihnp4.log-20040809. I'm
sure more will show up shortly.. :-(
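Since a log like this holds well over a thousand repeated lines, a small
script can help summarize it. The following is only a sketch: the
function name tally_messages and the prefix pattern are illustrative
assumptions based on the messages quoted earlier, and the sample list
stands in for the real log file.

```python
import re
from collections import Counter

# Assumed prefix shape, e.g. "sd0(esp0:0:1:0): ". Stripping it lets
# identical messages group together regardless of the device path.
PREFIX = re.compile(r"^\w+\([\w:]+\):\s*")

def tally_messages(lines):
    """Count how often each distinct message appears in the given lines."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        counts[PREFIX.sub("", text)] += 1
    return counts

# Sample lines copied from the messages quoted in this report.
sample = [
    "sd0(esp0:0:1:0): unable to allocate ecb",
    "sd0(esp0:0:1:0): unable to allocate scsipi_xfer",
    "sd0: not queued, error 12",
    "sd0(esp0:0:1:0): adapter resource shortage",
    "sd0(esp0:0:1:0): unable to allocate ecb",
]

for message, count in tally_messages(sample).most_common():
    print(count, message)
```

To run it against a saved copy of the console log, pass the open file
object to tally_messages() in place of the sample list.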
Thanks,
Eric
--
Eric Schnoebelen eric@cirr.com http://www.cirr.com
There is this special biologist word we use for 'stable'.
It is 'dead'. -- Jack Cohen