Re: UDP_ENCAP_ESPINUDP_NON_IKE

To: Ryota Ozaki <ozaki-r%iij.ad.jp@localhost>
Subject: Re: UDP_ENCAP_ESPINUDP_NON_IKE
From: Chuck Zmudzinski <frchuckz%gmail.com@localhost>
Date: Wed, 30 May 2018 17:46:11 -0400

Ryota,

I discovered that this crash on NetBSD current kernels is triggered by a rarely encountered routing configuration for IPv6.

Here are the results of decoding the backtrace when it crashes (if_mcast_op is there):


address                            function            File:line number

?() at ffffffff80205845        breakpoint        ??:?
?() at ffffffff804ee4e8        vpanic src/sys/kern/subr_prf.c:343
?() at ffffffff805ec155        kern_assert        ??:?

?() at ffffffff8057ef93        if_mcast_op src/sys/net/if.c:3595 (discriminator 1)
?() at ffffffff802ce97c        in6_addmulti src/sys/netinet6/mld6.c:747 (discriminator 3)

?() at ffffffff802d333f        nd6_rtrequest src/sys/netinet6/nd6.c:1585
?() at ffffffff805af696        rtrequest1 src/sys/net/route.c:1292
?() at ffffffff805b2aac        route_output src/sys/net/rtsock.c:759
?() at ffffffff805b0c8a        route_send src/sys/net/rtsock.c:473
?() at ffffffff80520989        sosend src/sys/kern/uipc_socket.c:1075
?() at ffffffff80505e28        soo_write src/sys/kern/sys_socket.c:122
?() at ffffffff804fa433        dofilewrite src/sys/kern/sys_generic.c:350
?() at ffffffff804fa539        sys_write src/sys/kern/sys_generic.c:320
?() at ffffffff8020f2bc        sy_call src/sys/sys/syscallvar.h:66
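
A table like the one above can be produced by passing every address from the ddb bt output to addr2line -f -e. The following is only a minimal sketch in Python of that step. The helper names and the kernel image path "netbsd.gdb" are placeholders chosen for illustration, not something taken from this thread, and the regular expression assumes bt lines of the form "?() at <hex address>".

```python
import re

# Matches ddb backtrace lines such as "?() at ffffffff80205845".
BT_LINE = re.compile(r"\bat\s+([0-9a-fA-F]{8,16})\b")


def extract_addresses(bt_output):
    """Return the hex addresses found in ddb `bt` output, in order."""
    addresses = []
    for line in bt_output.splitlines():
        match = BT_LINE.search(line)
        if match:
            addresses.append("0x" + match.group(1))
    return addresses


def addr2line_command(kernel_binary, addresses):
    """Build one addr2line invocation covering every address."""
    return ["addr2line", "-f", "-e", kernel_binary] + list(addresses)


if __name__ == "__main__":
    sample = """db{1}> bt
?() at ffffffff80205845
?() at ffffffff804ee4e8
db{1}> sync
"""
    # "netbsd.gdb" is a placeholder path, not taken from the thread.
    command = addr2line_command("netbsd.gdb", extract_addresses(sample))
    print(" ".join(command))
```

The resulting list can be handed to subprocess.run() on a machine that has the matching kernel binary with debug information. Without a matching binary, addr2line cannot resolve the addresses to function names and line numbers.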

After further testing I discovered the crash does not occur unless IPv6 routing to the VPN client is configured a certain way.

After looking at this decoding of the backtrace, which involves routing and IPv6, I learned the crash is triggered by the -proxy modifier to the route command I was using in the ipv6-up script. Keep in mind I am using a NetBSD 7 userland and the NetBSD 7 version of the route command. I do not know if current's version of the route command can also use the -proxy modifier.


More details:

A while ago I discovered that IPv6 connectivity to the VPN client requires that a route be added to the peer in the ipv6-up script of pppd, which is called when ppp0 comes up after phase1 and phase2 are established if the +ipv6 option is set in /etc/ppp/options. So I included this line in my ipv6-up script (in ipv6-up, $4 is the local IPv6 address on the ppp link, $5 is the remote IPv6 address on the ppp link, and $1 is the ppp interface name):


/sbin/route add -inet6 $5%$1 $4%$1 -interface -proxy

This provided connectivity between the peers on the ppp link. I added the -proxy modifier hoping to make the VPN client appear to be on the link-local ethernet network (just as pppd's proxyarp option does this in IPv4, proxy ndp theoretically can do this in IPv6). The -proxy modifier to the route command did not work to provide proxy ndp for IPv6 on NetBSD, nor did using the ndp proxy command. It did not cause a system crash on NetBSD 7 or 8 kernels, but this -proxy modifier is what triggers the crash on NetBSD current kernels. I did find a solution for proxy ndp on NetBSD 7, but it required a patch to the NetBSD 7 kernel and use of the -proxy modifier in the route command.


When I do this instead in ipv6-up I do not see a crash:

/sbin/route add -inet6 $5%$1 $4%$1 -interface

Without the -proxy modifier to the route command, there is no crash and IPv4 connectivity for the VPN client works fine using the proxyarp option in pppd. For IPv6, I only have connectivity on the link-local ppp link, as expected when only using link-local addresses without proxy ndp.
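
To make the mapping from the ipv6-up parameters to the two route commands explicit, here is a small sketch in Python. It only builds the argument list and does not run anything. The function name and the sample values (ppp0, the tty device, the speed, fe80::1, fe80::2) are made up for illustration. Only the use of $1, $4 and $5 comes from the description above.

```python
def ipv6_up_route_command(args, proxy=False):
    """Build the route(8) argument list used in the ipv6-up script.

    `args` holds the script's positional parameters $1..$5; only
    $1 (interface), $4 (local address) and $5 (remote address) are used.
    """
    ifname, local, remote = args[0], args[3], args[4]
    command = [
        "/sbin/route", "add", "-inet6",
        f"{remote}%{ifname}",   # destination: $5%$1
        f"{local}%{ifname}",    # gateway:     $4%$1
        "-interface",
    ]
    if proxy:
        # The modifier reported to trigger the panic on current kernels.
        command.append("-proxy")
    return command


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the thread.
    sample_args = ["ppp0", "/dev/pts/0", "115200", "fe80::1", "fe80::2"]
    print(" ".join(ipv6_up_route_command(sample_args)))
    print(" ".join(ipv6_up_route_command(sample_args, proxy=True)))
```

The first printed line corresponds to the variant that does not crash, the second to the variant with -proxy.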

According to route's man page, the -proxy modifier sets the RTF_ANNOUNCE flag, and as far as I can tell from the web interface for route's man page -proxy is still valid for NetBSD 8.0, although maybe it is not actually available in NetBSD current now, in which case this crash would never be seen in ordinary systems using current's route command. But using NetBSD 7's route command with the -proxy modifier with a current kernel, you will see this crash.


Chuck


On 05/29/2018 09:26 PM, Ryota Ozaki wrote:

On Wed, May 30, 2018 at 7:02 AM Chuck Zmudzinski <frchuckz%gmail.com@localhost> wrote:

Ryota,

Here is what I am getting with the crash. I do not know how to decode
it.

Please do
   addr2line -f -e <kernel_binary> <address>
for each address.

Or
   objdump -d  <kernel_binary>
and search functions containing each address from the output by hand.

Or if you can do, build a kernel with 'makeoptions    DEBUG="-g"'
and use it, then you can get a backtrace with symbols on a panic.

Thanks,
   ozaki-r

I type bt and just get a bunch of hex numbers that I do not know how
to interpret. I try sync and get a message that dumping to dev 142,1
(offset=6291455, size=0): not possible. After reboot, there is no core
dump in /var/crash. Maybe it is somewhere else. I checked that I do have
a dump device configured and I think I am still using the default values
for savecore. What else can I try to decode this? I tried using a
separate larger partition for /var/crash but that didn't make any
difference.

Chuck

Here is the output from bt and sync from the db prompt:

db{1}> bt
?() at ffffffff80205845
?() at ffffffff804ee4e8
?() at ffffffff805ec155
?() at ffffffff8057ef93
?() at ffffffff802ce97c
?() at ffffffff802d333f
?() at ffffffff805af696
?() at ffffffff805b2aac
?() at ffffffff805b0c8a
?() at ffffffff80520989
?() at ffffffff80505e28
?() at ffffffff804fa433
?() at ffffffff804fa539
?() at ffffffff8020f2bc
db{1}> sync

[ 1634.8391410] dumping to dev 142,1 (offset=6291455, size=0): not possible
[ 1634.8391410] rebooting...


On 05/29/2018 04:42 AM, Ryota Ozaki wrote:

On Fri, May 25, 2018 at 5:20 AM Maxime Villard <max%m00nbsd.net@localhost> wrote:

On 24/05/2018 at 21:13, Chuck Zmudzinski wrote:

Well, the crash is repeatable on the one week old daily snapshot current
kernel. Again, here is the current kernel I am using:

NetBSD 8.99.17 (XEN3_DOMU) #0: Wed May 16 21:54:38 UTC 2018
mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/xen/compile/XEN3_DOMU

What is happening is ... crazy.

With the current kernel, when the remote client connects, we get caught in
an endless loop of creating ipsec security associations. The log shows phase1
is created, then the phase2 associations, then we respond to negotiate a new
phase1 and two new phase 2's, and I think this loop just continued until we
ran out of memory. The windows client actually thought we were connected and
showed it was connected in the network control panel, but the racoon log
never reported that a ppp interface was up. When you look at the attached
snippets from the logs, I bet you will agree that many ppp interfaces and
ipsec SAs were created and when we finally ran out of memory to create
another one, we crashed. I say this because the trace indicated the crash
occurred at this branch. [1]. From the console at the start of the crash
report, I got this:

[ 334.5292103] panic: kernel diagnostic assertion "IFNET_LOCKED(ifp)" failed: file "/usr/src/sys/net/if.c", line 3595

I don't understand line 3595 because if.c only has 661 lines, unless there
was a mistake in how I copied it from the log.

You're looking at the wrong revision of if.c, yours seems to be [1].
The main issue here is that we reach this place with ifp unlocked. It's
probably not related to the system running out of memory.
That several entries get created in a loop, appears to be a separate problem.

I know that several changes were made in netbsd-current for MPification. It
may be that you exercise a particular condition that breaks an assumption
somewhere.
Ryota, Kengo, could you have a look?

I'm sorry, I've only looked at the mail now.

Chuck, could you decode the backtrace of the panic? In this case the path
to the assertion (probably in if_mcast_op) is important.

Thanks,
     ozaki-r

Thanks,
Maxime
[1] https://nxr.netbsd.org/xref/src/sys/net/if.c?r=1.423#3595
