tech-net archive

Re: netbsd-7 panic in rn_walknext via sysctl and rtsock



On Fri, Feb 8, 2019 at 11:10 PM Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:
>
> On Mon, 12 Nov 2018, Ryota Ozaki wrote:
> > On Thu, Nov 8, 2018 at 11:39 PM Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:
> >>
> >> With a kernel from 2018-11-01 netbsd-7 sources:
> >>
> >> (gdb) bt
> >> #0  0xffffffff8065d56f in cpu_reboot (howto=howto@entry=260,
> >>      bootstr=bootstr@entry=0x0)
> >>      at /usr/src/7.0/sys/arch/amd64/amd64/machdep.c:671
> >> #1  0xffffffff80894242 in vpanic (fmt=fmt@entry=0xffffffff80d50632 "trap",
> >>      ap=ap@entry=0xfffffe813a780a60) at /usr/src/7.0/sys/kern/subr_prf.c:340
> >> #2  0xffffffff808942fd in panic (fmt=fmt@entry=0xffffffff80d50632 "trap")
> >>      at /usr/src/7.0/sys/kern/subr_prf.c:256
> >> #3  0xffffffff808d7676 in trap (frame=0xfffffe813a780b80)
> >>      at /usr/src/7.0/sys/arch/amd64/amd64/trap.c:304
> >> #4  0xffffffff80100f26 in alltraps ()
> >> #5  0xffffffff807aece4 in rn_walknext (printer=0x0, arg=0x0,
> >>      rn=0x3a77b79e61ff6d35) at /usr/src/7.0/sys/net/radix.c:959
> >> #6  rn_walktree (h=<optimized out>,
> >>      f=f@entry=0xffffffff807f7d70 <rt_walktree_visitor>,
> >>      w=w@entry=0xfffffe813a780ca8) at /usr/src/7.0/sys/net/radix.c:992
> >> #7  0xffffffff807f7ea2 in rt_walktree (family=family@entry=2 '\002',
> >>      f=f@entry=0xffffffff807fe930 <sysctl_dumpentry>,
> >>      v=v@entry=0xfffffe813a780d08) at /usr/src/7.0/sys/net/rtbl.c:204
> >> #8  0xffffffff807fee7f in sysctl_rtable (name=0xfffffe813a780e5c,
> >>      namelen=<optimized out>, oldp=0x1bcad2000, oldlenp=0xfffffe813a780e48,
> >>      newp=<optimized out>, newlen=<optimized out>, oname=0xfffffe813a780e50,
> >>      l=0xfffffe8831a25240, rnode=0xfffffe813a5bf008)
> >>      at /usr/src/7.0/sys/net/rtsock.c:1417
> >> #9  0xffffffff806035ac in sysctl_dispatch (name=name@entry=0xfffffe813a780e50,
> >>      namelen=6, oldp=0x1bcad2000, oldlenp=oldlenp@entry=0xfffffe813a780e48,
> >>      newp=0x0, newlen=0, oname=oname@entry=0xfffffe813a780e50,
> >>      l=l@entry=0xfffffe8831a25240, rnode=0xfffffe813a5bf008, rnode@entry=0x0)
> >>      at /usr/src/7.0/sys/kern/kern_sysctl.c:451
> >> #10 0xffffffff80603724 in sys___sysctl (l=0xfffffe8831a25240,
> >>      uap=0xfffffe813a780f00, retval=<optimized out>)
> >>      at /usr/src/7.0/sys/kern/kern_sysctl.c:307
> >> #11 0xffffffff808af2ca in sy_call (rval=0xfffffe813a780eb8,
> >>      uap=0xfffffe813a780f00, l=0xfffffe8831a25240,
> >>      sy=0xffffffff81043580 <sysent+3232>)
> >>      at /usr/src/7.0/sys/sys/syscallvar.h:61
> >> #12 sy_invoke (code=202, rval=0xfffffe813a780eb8, uap=0xfffffe813a780f00,
> >>      l=0xfffffe8831a25240, sy=0xffffffff81043580 <sysent+3232>)
> >>      at /usr/src/7.0/sys/sys/syscallvar.h:85
> >> #13 syscall (frame=0xfffffe813a780f00)
> >>      at /usr/src/7.0/sys/arch/x86/x86/syscall.c:156
> >> #14 0xffffffff80100691 in Xsyscall ()
> >>
> >>
> >> #5  0xffffffff807aece4 in rn_walknext (printer=0x0, arg=0x0,
> >>      rn=0x3a77b79e61ff6d35) at /usr/src/7.0/sys/net/radix.c:959
> >> 959                     rn = rn->rn_l;
> >>
> >>
> >> #10 0xffffffff80603724 in sys___sysctl (l=0xfffffe8831a25240,
> >>      uap=0xfffffe813a780f00, retval=<optimized out>)
> >>      at /usr/src/7.0/sys/kern/kern_sysctl.c:307
> >> 307             error = sysctl_dispatch(&name[0], SCARG(uap, namelen),
> >> (gdb) print name[0]
> >> $3 = 4
> >>
> >> I have a core, any suggestions on what to get from gdb?
> >>
> >> --
> >> Stephen
> >>
> >
> > Perhaps there were parallel accesses to the routing table and sysctl
> > touched a corrupted entry.
> >
> > sysctl_rtable probably needs softnet_lock and/or KERNEL_LOCK.
> > Adding them around splsoftnet in sysctl_rtable would fix the panic.
>
> I've been running with the following patch on netbsd-7 for a few months
> with success. Is this applicable to HEAD? If so, should it be committed?
> If not, I'll try to work out a way to pull this up to netbsd-7 without a
> HEAD commit.

Sorry for the late reply.

You can pull the diff up to netbsd-7 on its own. HEAD needs the same
fix, but it's better to fix it in a slightly different way there
(HEAD has utility macros for the locks; netbsd-7 does not).

Thanks,
  ozaki-r


>
> Index: sys/net/rtsock.c
> ===================================================================
> RCS file: /cvsroot/src/sys/net/rtsock.c,v
> retrieving revision 1.163.2.1
> diff -u -r1.163.2.1 rtsock.c
> --- sys/net/rtsock.c    28 Nov 2018 16:30:06 -0000      1.163.2.1
> +++ sys/net/rtsock.c    8 Feb 2019 14:06:56 -0000
> @@ -1408,6 +1408,8 @@
>          w.w_needed = 0 - w.w_given;
>          w.w_where = where;
>
> +       mutex_enter(softnet_lock);
> +       KERNEL_LOCK(1, NULL);
>          s = splsoftnet();
>          switch (w.w_op) {
>
> @@ -1434,6 +1436,8 @@
>                  break;
>          }
>          splx(s);
> +       KERNEL_UNLOCK_ONE(NULL);
> +       mutex_exit(softnet_lock);
>
>          /* check to see if we couldn't allocate memory with NOWAIT */
>          if (error == ENOBUFS && w.w_tmem == 0 && w.w_tmemneeded)
>
> --
> Stephen
>
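
For readers who want to see the locking pattern from the patch above
outside the kernel, here is a minimal user-space sketch. It is an analogy
only, not kernel code: two pthread mutexes stand in for softnet_lock and
the kernel lock, a singly linked list stands in for the radix tree, and
every name in it (table_lock, route_add, route_delete, route_walk and the
*_sketch mutexes) is hypothetical. The point it illustrates is that the
walker and every writer take the same locks in the same order and release
them in reverse, so a walk can never follow a pointer to a node that is
being freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for softnet_lock and the kernel lock. */
static pthread_mutex_t softnet_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t kernel_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* A linked list stands in for the routing radix tree. */
struct route {
	struct route *next;
	int id;
};
static struct route *table;

/* Take both locks in a fixed order, as the patch does before the walk. */
static void
table_lock(void)
{
	int rc;

	rc = pthread_mutex_lock(&softnet_lock_sketch);
	assert(rc == 0);
	rc = pthread_mutex_lock(&kernel_lock_sketch);
	assert(rc == 0);
	(void)rc;
}

/* Release in the reverse order of acquisition. */
static void
table_unlock(void)
{
	int rc;

	rc = pthread_mutex_unlock(&kernel_lock_sketch);
	assert(rc == 0);
	rc = pthread_mutex_unlock(&softnet_lock_sketch);
	assert(rc == 0);
	(void)rc;
}

/* Writer: insert a route at the head of the table. Returns 0 or -1. */
int
route_add(int id)
{
	struct route *r = malloc(sizeof(*r));

	if (r == NULL)
		return -1;
	r->id = id;
	table_lock();
	r->next = table;
	table = r;
	table_unlock();
	return 0;
}

/* Writer: unlink and free the first route with this id. Returns 1 if found. */
int
route_delete(int id)
{
	struct route **pp, *r;
	int found = 0;

	table_lock();
	for (pp = &table; (r = *pp) != NULL; pp = &r->next) {
		if (r->id == id) {
			*pp = r->next;
			free(r);
			found = 1;
			break;
		}
	}
	table_unlock();
	return found;
}

/* Reader: visit every route under the locks; returns the number visited. */
size_t
route_walk(void (*visit)(const struct route *, void *), void *arg)
{
	const struct route *r;
	size_t n = 0;

	table_lock();
	for (r = table; r != NULL; r = r->next) {
		if (visit != NULL)
			visit(r, arg);
		n++;
	}
	table_unlock();
	return n;
}
```

If route_walk() skipped table_lock(), a concurrent route_delete() could
free a node while the walker still held a pointer to it, which is the
user-space analogue of the corrupted rn seen in frame #5. Link with
-lpthread (or -pthread) when trying this out.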