Re: Trap 0x34 panic on 5.1 on Blade 100

To: George Harvey <fr30%dial.pipex.com@localhost>
Subject: Re: Trap 0x34 panic on 5.1 on Blade 100
From: Eduardo Horvath <eeh%NetBSD.org@localhost>
Date: Thu, 12 May 2011 18:15:30 +0000 (UTC)

On Thu, 12 May 2011, George Harvey wrote:

> On Sat, 7 May 2011 23:45:52 +0100
> George Harvey <fr30%dial.pipex.com@localhost> wrote:
> 
> > On Sat, 7 May 2011 09:18:26 +0200
> > Martin Husemann <martin%duskware.de@localhost> wrote:
> > 
> > > On Sat, May 07, 2011 at 12:13:02AM +0100, George Harvey wrote:
> > > > Are there any kernel debug options I could set to get a more readable
> > > > traceback?
> > > 
> > > If this is not a production server, you probably want ddb.onpanic = 1
> > > in /etc/sysctl.conf.
> > 
> > Looks like it could be the gem driver, I switched to a 3Com Ethernet
> > card and it stopped crashing.
> 
> After further testing, it appears that I only get panics when using the
> on-board gem interface with a 100Mb half-duplex connection.
> Specifically, when connected to a 3Com SuperStack II Dual Speed Hub 500.
> With a full-duplex switch connection, or even with an old 10Mb hub, I
> don't get any panics. The following backtrace is from a panic caused
> by starting xterm over ssh. FTP and NFS also produce similar panics:
> 
> blade100# trap type 0x34: cpu 0, pc=137b108 npc=137b10c 
> pstate=44800006<PRIV,IE> kernel trap 34: mem address not aligned
> Stopped in pid 451.1 (sshd) at  netbsd:m_xhalf+0x8:
>  ld [%o0 + 0x20], %g2
> db> bt
> bpf_mtap(2ed0e00, c78c0f0, 0, 800, 2, 0) at netbsd:bpf_mtap+0xd4
> gem_rint(c78c000, 80000000, 17d8, 1ff00400000, 4, 4000) at
> netbsd:gem_rint+0x2cc

Hm.  Are you using the packet filter?  Looks like it's not accessing
unaligned data properly.  Lessee... best to get a full disassembly of the
function, but assuming %o0 didn't change, the signature is
m_xhalf(const struct mbuf *m, uint32_t k, int *err)
so it's trying to load something 32 bytes (0x20) into the mbuf.

Here's your mbuf header:

struct m_hdr {
        struct  mbuf *mh_next;          /* next buffer in chain */
        struct  mbuf *mh_nextpkt;       /* next chain in queue/record */
        char   *mh_data;                /* location of data */
        struct  mowner *mh_owner;       /* mbuf owner */
        int     mh_len;                 /* amount of data in this mbuf */
        int     mh_flags;               /* flags; see below */
        paddr_t mh_paddr;               /* physical address of mbuf */
        short   mh_type;                /* type of data in this mbuf */
};

So... that should be the mh_len field.
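
The offset arithmetic can be checked on any machine with a stand-in
struct. This is only a sketch: struct m_hdr_lp64 and mh_len_offset() are
made-up names, and the field widths are written out explicitly on the
assumption that sparc64 pointers and paddr_t are 8 bytes, int is 4 and
short is 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for struct m_hdr with the assumed sparc64 (LP64) field
 * widths spelled out, so the offsets come out the same on any host.
 */
struct m_hdr_lp64 {
        uint64_t mh_next;       /* struct mbuf * */
        uint64_t mh_nextpkt;    /* struct mbuf * */
        uint64_t mh_data;       /* char * */
        uint64_t mh_owner;      /* struct mowner * */
        int32_t  mh_len;
        int32_t  mh_flags;
        uint64_t mh_paddr;      /* paddr_t */
        int16_t  mh_type;
};

/* Four 8-byte pointers come first, so mh_len lands at 4 * 8 = 0x20. */
static_assert(offsetof(struct m_hdr_lp64, mh_len) == 0x20,
    "mh_len should sit 32 bytes into the header");

/* Hypothetical helper: byte offset of mh_len in the stand-in layout. */
static size_t
mh_len_offset(void)
{
        return offsetof(struct m_hdr_lp64, mh_len);
}
```

If the real header on the crashing kernel were laid out differently
(other options, other paddr_t size), the 0x20 load would hit a different
field, so it is worth confirming against the actual build.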

The code does this:

static int
m_xhalf(const struct mbuf *m, uint32_t k, int *err)
{
        int len;
        u_char *cp;
        struct mbuf *m0;

        *err = 1;
        MINDEX(len, m, k);
        cp = mtod(m, u_char *) + k;
        if (len >= k + 2) {
                *err = 0;
                return EXTRACT_SHORT(cp);
        }
        m0 = m->m_next;
        if (m0 == 0)
                return 0;
        *err = 0;
        return (cp[0] << 8) | mtod(m0, u_char *)[0];
}

Looks like it's probably bombing inside MINDEX.  
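
For anyone who hasn't looked at it, here is roughly the kind of chain
walk a macro like MINDEX performs. This is a stand-alone sketch, not the
kernel macro: struct buf and chain_index() are hypothetical names, and
the exact 5.1 source should be checked before relying on any detail.
The point is that the very first thing such a walk does is read the
length field of the buffer it was handed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for an mbuf: just the fields the walk touches. */
struct buf {
        struct buf *next;               /* next buffer in chain */
        const unsigned char *data;      /* start of this buffer's data */
        int len;                        /* bytes of data in this buffer */
};

/*
 * Hypothetical helper: advance *mp to the buffer holding byte offset
 * *kp, reduce *kp to an offset inside that buffer, and return that
 * buffer's length, or -1 if the chain is too short. Every step reads
 * b->len, so a bad buffer pointer faults here before any data access.
 */
static int
chain_index(const struct buf **mp, uint32_t *kp)
{
        const struct buf *b = *mp;
        uint32_t k = *kp;

        while (b != NULL && k >= (uint32_t)b->len) {
                k -= (uint32_t)b->len;
                b = b->next;
        }
        if (b == NULL)
                return -1;
        *mp = b;
        *kp = k;
        return b->len;
}
```

With a 4-byte buffer chained to a 2-byte buffer, offset 5 resolves to
offset 1 in the second buffer, and offset 6 runs off the end.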

Hm.  I don't see how this could possibly be happening since the mbuf is 
manipulated just before the call to bpf_mtap:

                m = rxs->rxs_mbuf;
                if (gem_add_rxbuf(sc, i) != 0) {
                        GEM_COUNTER_INCR(sc, sc_ev_rxnobuf);
                        ifp->if_ierrors++;
                        GEM_INIT_RXDESC(sc, i);
                        bus_dmamap_sync(sc->sc_dmatag, rxs->rxs_dmamap, 0,
                            rxs->rxs_dmamap->dm_mapsize, BUS_DMASYNC_PREREAD);
                        continue;
                }
                m->m_data += 2; /* We're already off by two */

                m->m_pkthdr.rcvif = ifp;
                m->m_pkthdr.len = m->m_len = len;

                /*
                 * Pass this up to any BPF listeners, but only
                 * pass it up the stack if it's for us.
                 */
                bpf_mtap(ifp, m);


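That code would itself take an alignment fault if the mbuf pointer were
misaligned, since it reads and writes fields of the same mbuf.  So the
pointer looks fine when it leaves gem_rint, and I think it's probably an
issue with bpf.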
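
A small sketch of the alignment reasoning, assuming the faulting ld is a
32-bit load that needs a 4-byte-aligned address. Since 0x20 is a
multiple of 4, base + 0x20 can only be misaligned if the base pointer
is. The helper names and the addresses in the usage note are made up for
illustration and are not taken from the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of align (align must be a power of two). */
static int
is_aligned(uintptr_t addr, uintptr_t align)
{
        return (addr & (align - 1)) == 0;
}

/*
 * Read a big-endian 16-bit value one byte at a time. Byte loads have
 * no alignment requirement, so this works at any address, which is the
 * kind of access packet data at odd offsets needs on strict-alignment
 * CPUs.
 */
static unsigned int
load_be16(const unsigned char *p)
{
        return ((unsigned int)p[0] << 8) | p[1];
}
```

For example, is_aligned(0x1000 + 0x20, 4) holds while
is_aligned(0x1002 + 0x20, 4) does not, and load_be16() can be pointed at
an odd offset in a packet buffer without trapping.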



Eduardo
