tech-net archive
relevant panic() when combining lagg(4), vlan(4) and bridge(4)
Happy new year everyone!
In case this rings a bell for someone: I found nothing in the PR database.
I recently updated one of my hosts to 10.1 (from 10.0) and took this
opportunity to move from agr(4) to lagg(4). However, this change
resulted in panic()s shortly after boot, caused by a locking error:
[ 26.1386375] current lwp : 0xfffffe747738d488
[ 26.1386375] owner field : 0xfffffe747738d480 wait/spin: 0/0
[ 26.1386375] panic: lock error: Mutex: mutex_vector_enter,548: locking
against myself: lock 0xfffffe7478379888 cpu 0 lwp 0xfffffe747730d480
[ 26.1386375] cpu0: Begin traceback...
[ 26.1386375] vpanic() at netbsd:vpanic+0x183
[ 26.1486372] panic() at netbsd:panic+0x3c
[ 26.1586367] lockdebug_abort() at netbsd:lockdebug_abort+0x114
[ 26.1586367] mutex_vector_enter() at netbsd:mutex_vector_enter+0x32b
[ 26.1686365] bridge_input() at netbsd:bridge_input+0x946
[ 26.1786362] vlan_input() at netbsd:vlan_input+0x143
[ 26.1786362] ether_input() at netbsd:ether_input+0x4c2
[ 26.1886361] bridge_input() at netbsd:bridge_input+0xa18
[ 26.1986358] lagg_input_ethernet() at netbsd:lagg_input_ethernet+0x2ab
[ 26.2086358] if_percpuq_softint() at netbsd:if_percpuq_softint+0x8d
[ 26.2086358] softint_dispatch() at netbsd:softint_dispatch+0x95
[ 26.2186353] cpu0: End traceback...
[ 26.2186353] dumping to dev 18,17 (offset=8, size=8359657):
[ 26.2186353] dump device bad
[ 26.2186353] rebooting...
I am not yet knowledgeable enough about the network stack to figure out
which locking mistake is at play, but FWIW the traceback shows
bridge_input() being re-entered through vlan_input(), so it looks like a
bad interaction between lagg(4), bridge(4) and vlan(4).
The configuration looks like so:
- lagg0 is an aggregate of two PHYs (wm0 + wm1). It is a trunk carrying
two "networks" (native and tagged, VLAN ID 16);
- lagg0 is part of bridge0 with many tap(4) to provide native
connectivity to VMs running on the host;
- a vlan(4) (vlan16) is bound to lagg0;
- vlan16 is attached to a separate bridge16, where a single tap is
found to provide connectivity to that VLAN specifically for one VM.
# cat /etc/ifconfig.lagg0
create
!ifconfig wm0 up
!ifconfig wm1 up
laggproto lacp laggport wm0 laggport wm1
inet 192.168.1.2/24
inet 192.168.1.3/24 alias
# VM #0 (native network)
# cat ifconfig.tap0
create
!brconfig bridge0 add $int
up
# VM #7 (isolated network)
# cat ifconfig.tap7
create
!brconfig bridge16 add $int
up
# cat ifconfig.vlan16
create
vlan 16 vlanif lagg0
!brconfig bridge16 add $int
!brconfig bridge16 -learn $int
up
As this is a production host I cannot reproduce the panic "at will", but
judging from its configuration I think it can be triggered easily on a
test bed (incoming).
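
For anyone who wants to try before I do, here is a minimal sketch of such a
test bed as a shell script. It only mirrors the topology described above and
is untested: the bridge0 creation and the "add lagg0" step are assumptions
(my ifconfig.bridge0 is not shown in this mail), the port names default to
wm0/wm1, and the run() and setup_testbed() helpers are names made up for the
sketch. By default it only prints the commands; set DRYRUN=0 to execute them
as root. I would expect tagged (VLAN 16) traffic arriving on lagg0 to be
needed before anything happens, but that is a guess.

```shell
#!/bin/sh
# Hypothetical test-bed sketch: lagg0 in bridge0, vlan16 on lagg0 in bridge16.
# Interface names and the bridge0 setup are assumptions, not a verified recipe.

DRYRUN=${DRYRUN:-1}     # 1: only print the commands, 0: really run them
PORT0=${PORT0:-wm0}     # first physical port of the aggregate
PORT1=${PORT1:-wm1}     # second physical port of the aggregate

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRYRUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_testbed() {
    # LACP aggregate over the two ports (same options as ifconfig.lagg0).
    run ifconfig "$PORT0" up
    run ifconfig "$PORT1" up
    run ifconfig lagg0 create
    run ifconfig lagg0 laggproto lacp laggport "$PORT0" laggport "$PORT1"
    run ifconfig lagg0 up

    # Native network: lagg0 is a member of bridge0 (assumed setup).
    run ifconfig bridge0 create
    run brconfig bridge0 add lagg0 up

    # Tagged network: vlan16 on top of lagg0, member of bridge16.
    run ifconfig vlan16 create
    run ifconfig vlan16 vlan 16 vlanif lagg0
    run ifconfig vlan16 up
    run ifconfig bridge16 create
    run brconfig bridge16 add vlan16 up
    run brconfig bridge16 -learn vlan16
}

setup_testbed
```

The tap(4) members are left out on purpose, since the traceback does not go
through them; they can be added with "brconfig bridgeN add tapN" as in the
files above.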
Thanks,
--
jym@