Current-Users archive
Re: panic: pmap_tlb_pendcount > 0 failed
On Thu, Mar 28, 2013 at 09:00:45AM +0100, Thomas Klausner wrote:
> On Thu, Mar 28, 2013 at 12:29:40AM +0100, Thomas Klausner wrote:
> > Hi!
> >
> > I've just run a GENERIC-6.99.18/amd64 kernel from today on a newly
> > acquired machine, and see:
> >
> > ...
> > acpicpu11 at cpu11: ACPI CPU
> > panic: kernel diagnostic assertion "pmap_tlb_pendcount > 0" failed: file
> > "../../../../arch/x86/x86/pmap_tlb.c", line 451
> > fatal breakpoint trap in supervisor mode
> > trap type 1 code 0 rip ffffffff802546ed cs 8 rflags 46 cr2 0 ilevel 0 rsp
> > fffffe813a6369e0
> > curlwp 0xfffffe887556bb20 pid 0.71 lowest kstack 0xfffffe813a633000
> > Stopped in pid 0.71 (system) at netbsd:breakpoint+0x5: leave
> > db{10}> bt
> > breakpoint() at netbsd:breakpoint+0x5
> > vpanic() at netbsd:vpanic+0x136
> > kern_assert() at netbsd:kern_assert+0x48
> > pmap_tlb_intr() at netbsd:pmap_tlb_intr+0xf4
> > DDB lost frame for netbsd:Xintr_lapic_tlb+0x98, trying 0xfffffe813636aa0
> > Xintr_lapic_tlb() at netbsd:Xintr_lapic_tlb+0x98
> > --- interrupt ---
> > 246:
> > db{10}>
> >
> > dmesg is too long to copy by hand, but available on the db prompt.
> > It's a Supermicro X9SRi with a Xeon E5-1650@3.20GHz.
>
> PR 47437 by Taylor R Campbell might be related, he writes:
>
> Sometimes when I boot a many-core machine, during autoconf I
> get a panic after the ACPI CPU devices are configured. I've
> seen the panic several times; last night I caught it on the
> serial console for the first time with ddb and grabbed a stack
> trace. I believe it always happens after all the acpicpuN
> devices are attached, but I'm not sure.
>
> so it's the same place in the boot process; but his panic is
>
> panic: kernel diagnostic assertion "pmap_tlb_pendcount < ncpu" failed: file
> "/home/riastradh/netbsd/current/src/sys/arch/x86/x86/pmap_tlb.c", line 434
I've looked at my panic a bit more.

pmap_tlb_pendcount is a static volatile u_int, and the KASSERT that
triggers is:

    KASSERT(pmap_tlb_pendcount > 0);

Since the counter is unsigned, I assume it must have been zero at that
time.

The KASSERT is in the function pmap_tlb_intr() in
sys/arch/x86/x86/pmap_tlb.c. I have found only one caller for it, in
sys/arch/i386/i386/vector.S:
    /*
     * TLB shootdown handler.
     */
    IDTVEC(intr_lapic_tlb)
        pushl   $0
        pushl   $T_ASTFLT
        INTRENTRY
        movl    $0, _C_LABEL(local_apic)+LAPIC_EOI
        call    _C_LABEL(pmap_tlb_intr)
        INTRFASTEXIT
    IDTVEC_END(intr_lapic_tlb)
On the other hand, there is only one place where it is increased, in
sys/arch/x86/x86/pmap_tlb.c again:

    ...
    local = kcpuset_isset(target, cid) ? 1 : 0;
    rcpucount = kcpuset_countset(target) - local;
    #ifdef MULTIPROCESSOR
    if (rcpucount) {
        ...
        while (atomic_cas_uint(&pmap_tlb_pendcount, 0, rcpucount)) {
            splx(s);
            count = SPINLOCK_BACKOFF_MIN;
            while (pmap_tlb_pendcount) {
                KASSERT(pmap_tlb_pendcount < ncpu);
                SPINLOCK_BACKOFF(count);
            }
            s = splvm();
            /* An interrupt might have done it for us. */
            if (tp->tp_count == 0) {
                splx(s);
                return;
            }
        }
        ...
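
To make my reading of the counter concrete, here is a small user-space
sketch in C11. It is not the kernel code: pendcount, mailbox_try_claim(),
mailbox_ack() and mailbox_pending() are hypothetical names I made up, and
I am assuming that the interrupt handler decrements the counter once per
interrupt, which the excerpts above do not show. Under that assumption,
the "> 0" assertion can only fire if a handler runs while nothing is
pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for pmap_tlb_pendcount: the number of remote
 * CPUs that still have to acknowledge the current shootdown.
 */
static atomic_uint pendcount;

/*
 * Initiator side: claim the mailbox by swinging the counter from 0 to
 * nremote, as the atomic_cas_uint() loop does.  Returns false if an
 * earlier shootdown is still pending (the caller would back off).
 */
static bool
mailbox_try_claim(unsigned nremote)
{
	unsigned expected = 0;

	return atomic_compare_exchange_strong(&pendcount, &expected, nremote);
}

/*
 * Handler side (assumed behaviour): acknowledge one shootdown by
 * decrementing the counter.  Returns false in the situation the
 * "pendcount > 0" assertion guards against: an acknowledgement
 * arriving while nothing is pending.
 */
static bool
mailbox_ack(void)
{
	unsigned cur = atomic_load(&pendcount);

	do {
		if (cur == 0)
			return false;	/* nothing pending: assertion case */
	} while (!atomic_compare_exchange_weak(&pendcount, &cur, cur - 1));
	return true;
}

/* Read the current number of outstanding acknowledgements. */
static unsigned
mailbox_pending(void)
{
	return atomic_load(&pendcount);
}
```

In this model, a claim for two remote CPUs followed by two
acknowledgements brings the counter back to zero, and any further
acknowledgement, or one that arrives before a claim, is the failing case.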
I don't know enough about this to dig further here.
Thomas