Re: RC3 LOCKDEBUG panic on 6-way E3k.

To: Martin Husemann <martin%duskware.de@localhost>
Subject: Re: RC3 LOCKDEBUG panic on 6-way E3k.
From: Anders Lindgren <ali%df.lth.se@localhost>
Date: Wed, 22 Apr 2009 12:39:08 +0200 (CEST)

On Sun, 22 Mar 2009, Martin Husemann wrote:

On Sun, Mar 22, 2009 at 03:27:40PM +0100, Anders Lindgren wrote:

  I got an RC3 DIAGNOSTICS+DEBUG+LOCKDEBUG kernel running with changes
suggested by Martin. For good measure, I eliminated RAIDframe from the
picture by booting a second install from a different disk and started a
build.sh -j8 release-build on it to kill it. Rather than deadlock hard
within 10 minutes, it now survived 37 minutes -- but then it panicked! But
now I have ddb!


It's running out of mmu contexts on one of the cpus - and something goes
wrong in the code supposed to recover from that (not really a heavily tested
code path). I'll have to read the code again and try to see if I can
reproduce it more quickly on a local machine with an artificially limited
number of contexts.

Are we talking about ASI leakage here? Dunno about USII, but USI has
(iirc) 4k ASIs.. I haven't looked into how they're handled, but if they're
just used round-robin (can't see a reason to do otherwise?), it should be
impossible to run out of them unless there are more than 4k concurrent
processes?
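
To make the round-robin argument concrete, here is a minimal sketch in C.
It is not the NetBSD sparc64 pmap code: the names NCTX, struct ctx_pool,
ctx_alloc and ctx_free are made up for illustration, the 4096 figure is
just the "4k" recalled above, and the sketch ignores details such as a
context reserved for the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCTX 4096            /* assumed context count (the "4k" above) */

/* Hypothetical pool of context numbers, handed out round-robin. */
struct ctx_pool {
    bool   in_use[NCTX];     /* true while some process owns this number */
    size_t next;             /* where the next round-robin search starts */
};

/* Return a free context number, or -1 if every context is in use. */
static int ctx_alloc(struct ctx_pool *p)
{
    for (size_t i = 0; i < NCTX; i++) {
        size_t c = (p->next + i) % NCTX;
        if (!p->in_use[c]) {
            p->in_use[c] = true;
            p->next = (c + 1) % NCTX;
            return (int)c;
        }
    }
    /* Only reachable with NCTX live owners, or if numbers were leaked. */
    return -1;
}

/* Release a context number when its owner goes away. */
static void ctx_free(struct ctx_pool *p, int c)
{
    assert(c >= 0 && c < NCTX && p->in_use[c]);
    p->in_use[c] = false;
}
```

With an allocator of this shape, ctx_alloc() can only fail when all NCTX
numbers are held at once. Running out with far fewer processes would point
at numbers that are never handed back through something like ctx_free(),
which is the kind of leak being asked about.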

I see your kmutex_init patch made it into RC4, and removed my local
modification. However, something fishy appears to have sneaked into RC4
too; I built a new RC4 LOCKDEBUG kernel, but it never gets to the ASI
leakage bug -- it ddb:s trying to read address 0x40 in an openfirmware()
call from OF_read, coming from pcons_poll. This happens at the "filesystem
type (generic)?" question in the boot -a dialogue, right after answering
the root- and dump-device questions.

On a different note: I'm looking into getting a remotely controlled relay
to the power cord of this E3k box so I can remotely power cycle it. Does
anyone know if this could cause damage to the box (as opposed to power
cycling it with the key)? It's no worse than a regular power outage, but
I'm not sure how healthy that really is. Manuals tend to not recommend it.


/ali:)

References:
- 5.0_RC2 sparc64 LOCKDEBUG kernel(s) non-bootable?
  - From: Rafal Boni
- Re: 5.0_RC2 sparc64 LOCKDEBUG kernel(s) non-bootable?
  - From: Martin Husemann
- RC3 LOCKDEBUG panic on 6-way E3k.
  - From: Anders Lindgren
- Re: RC3 LOCKDEBUG panic on 6-way E3k.
  - From: Martin Husemann
