tech-net archive

Re: Panic / wedge from bridging tap and vlan



> Date: Tue, 4 Feb 2025 02:11:33 +0000 (UTC)
> From: John Klos <john%klos.com@localhost>
> 
> I may've written about this before, but in case I didn't, I just set up a 
> bridge between a tap interface which is used by tinc (pkgsrc/net/tinc) and 
> a vlan.
> 
> It worked, but the moment I tried to use it, the machine became 
> unresponsive. I couldn't enter the kernel debugger because of USB 
> keyboard, but I got some info after reboot:
> 
> ...
> Starting virecover.
> Checking for core dump...
> savecore: check_kmem:429: kvm_read msgbuf: _kvm_kvatop(ffffa50855d24000)
> Feb 2 23:03:19 sage savecore: reboot after panic: lock error: Mutex:
> mutex_vector_enter,548: locking against myself: lock 0xffffd152aab57080
> cpu 0 lwp 0xffffd152a9aeb480
> savecore: reboot after panic: lock error: Mutex:
> mutex_vector_enter,548: locking against myself: lock 0xffffd152aab57080
> cpu 0 lwp 0xffffd152a9aeb480
> savecore: system went down at Sun Feb 2 22:40:05 2025
> savecore: writing compressed core to /var/crash/netbsd.1.core.gz

Can you get a stack trace out of this?

gunzip -c </var/crash/netbsd.1.core.gz >/var/crash/netbsd.1.core
echo bt | crash -M /var/crash/netbsd.1.core

Or maybe just check whether dmesg still shows the previous boot's
output, which might contain the panic and stack trace.
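
If the message buffer did survive the reboot, something like the
following may help pull the interesting part out of a long dmesg.  It
is only a sketch: show_panic is a made-up helper name, the "panic:"
pattern assumes the message is logged with that prefix, and the default
window of 20 following lines is an arbitrary guess at how long a
backtrace might be.

```shell
# Hypothetical helper, not part of NetBSD: print every line containing
# "panic:" plus the N lines that follow it (default 20, an arbitrary
# choice), which is where a backtrace would be if one was recorded.
show_panic() {
    awk -v n="${1:-20}" '
        /panic:/ { left = n + 1 }      # (re)start the window at each match
        left > 0 { print; left-- }     # print while the window is open
    '
}
```

Usage would be `dmesg | show_panic`, or `dmesg | show_panic 40` for a
longer window.  No output means no panic line was left in the buffer.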

> Either this shouldn't happen, which is to say that bridging to a vlan 
> should work, or this shouldn't be allowed, I'd think.

Obviously yes.  This is a garden-variety unintended lock recursion,
probably in tap(4), whose locking is hokey, but we need the stack trace
to track it down.

Do you have steps to reproduce?

Can you take all of that information -- the panic message, the stack
trace, the steps to reproduce -- and file a PR with it so it doesn't
get lost and we can track pullups and close it when done?

