tech-net archive


Re: mbuf cluster leak?



On Mon, Mar 03, 2025 at 12:03:46PM -0000, Michael van Elst wrote:
> stix%stix.id.au@localhost (Paul Ripke) writes:
> 
> >while rsync'ing 1TiB+ to a local machine over 1Gbit/s ethernet.
> >I've doubled nmbclusters, but it'd be nice to know why the clusters
> >appear to be leaking.
> 
> >I assume MBUFTRACE might aid in hunting this down?
> 
> 
> Yes, I've fixed many places in the code recently where MBUFTRACE
> wasn't supported, it's still not complete.
> 
> 
> >I kicked off the rsync again after the doubling, and ran out again a
> >few hours later. Local interface is alc0, which happens to be in a
> >bridge with a bunch of tap interfaces. I also have npf, and altq
> >configured on a pppoe0 interface.
> 
> MBUFTRACE support is still missing in alc(4).
> 
> This patch might help (untested):
> 
> Index: if_alc.c
> ===================================================================
> RCS file: /cvsroot/src/sys/dev/pci/if_alc.c,v
> retrieving revision 1.55
> diff -p -u -r1.55 if_alc.c
> --- if_alc.c    5 Jul 2024 04:31:51 -0000       1.55
> +++ if_alc.c    3 Mar 2025 12:03:10 -0000
> @@ -2433,6 +2433,7 @@ alc_newbuf(struct alc_softc *sc, struct 
>         MGETHDR(m, init ? M_WAITOK : M_DONTWAIT, MT_DATA);
>         if (m == NULL)
>                 return (ENOBUFS);
> +       MCLAIM(m, &sc->sc_ec.ec_rx_mowner);
>         MCLGET(m, init ? M_WAITOK : M_DONTWAIT);
>         if (!(m->m_flags & M_EXT)) {
>                 m_freem(m);

Thanks!

I tried this patch first, since it is the simpler one and is targeted at alc.

After a bunch of rsync runs and 24h of uptime, the stats are looking much
better:

ksh$ netstat -ms
2366 mbufs in use:
        1646 mbufs allocated to data
        685 mbufs allocated to packet headers
        35 mbufs allocated to socket names and addresses
18 calls to protocol drain routines
ksh$ vmstat -mCW | egrep '^Name|^mb|^mcl'
Name               Size     Requests    Fail     Releases    InUse    Avail      Pgreq      Pgrel    Npage PageSz   Hiwat Minpg    Maxpg    Idle   Flags   Util
mbpl                512        22513       0        16212     6301     5315       1557        105     1452   4096    1454     0      inf     663 0x10000  54.2%
mclpl              2048        13136       0         9714     3422     2196       3400        591     2809   4096    2813     0   261333    1098 0x10000  60.9%
Name          Spin GrpSz Full Emty PoolLayer CacheLayer  Hit%    CpuLayer  Hit%
mbpl            11    15  185    0     22513    3262288  99.3   271637917  98.8
mclpl           80    15  124    0     13136     994864  98.7    82293880  98.8

I don't really have a good baseline to compare against, but I think
that's good enough to commit?
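
For building a baseline to compare against, a small sketch along these
lines could be used to sample the pool counters periodically. It assumes
the `vmstat -mCW` column layout shown above (Requests in column 3,
Releases in column 5, InUse in column 6); `pool_inuse` is a hypothetical
helper name, not an existing tool.

```shell
# pool_inuse (hypothetical helper): read `vmstat -mCW` output on stdin and
# print, for the mbuf pools only, the InUse count next to
# Requests - Releases.  A steadily growing InUse under a steady workload
# would hint at a leak.
pool_inuse() {
    awk '
        # A header line starts each table; only the first table (the one
        # whose second column is "Size") carries Requests/Releases/InUse.
        $1 == "Name" { in_pool = ($2 == "Size"); next }
        in_pool && ($1 == "mbpl" || $1 == "mclpl") {
            printf "%s inuse=%d requests-releases=%d\n", $1, $6, $3 - $5
        }'
}
```

Something like `while :; do date; vmstat -mCW | pool_inuse; sleep 600; done`
would then log one sample every ten minutes; the interval is arbitrary.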

-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.

