tech-kern archive


Re: continued zfs-related lockups



Further to Greg and Simon's emails, I've started seeing issues with a formerly stable NetBSD 9 Xen domU ZFS system with 12GB RAM and 10TB of storage (5x 2TB). The change is that I've started using snapshots and zfs send.

The symptoms are that around 5am both of its CPUs go to 100% and the system stops responding. I can still hit return at the login prompt on the console, but any attempt at disk access hangs.

I can get into ddb and here is some output:

db{0}> show uvmexp
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12, ncolors=16
  3047796 VM pages: 8691 active, 4732 inactive, 1 wired, 2209566 free
  pages  3759 anon, 7211 file, 2454 exec
  freemin=512, free-target=682, wired-max=1015932
  resv-pg=1, resv-kernel=10, zeropages=0
  bootpages=89297, poolpages=609524
  cpu0:
    faults=12818533, traps=12909847, intrs=64192588, ctxswitch=97109178
    softint=19555687, syscalls=81007705
  cpu1:
    faults=7568098, traps=7624976, intrs=38860184, ctxswitch=83953114
    softint=3189879, syscalls=32186958
  fault counts:
    noram=0, noanon=0, pgwait=0, pgrele=0
    ok relocks(total)=20498(20498), anget(retrys)=5235698(0), amapcopy=2920774
    neighbor anon/obj pg=3410106/31552107, gets(lock/unlock)=11087963/20498
    cases: anon=3032728, anoncow=2202960, obj=9084808, prcopy=2003178, przero=4057301
  daemon and swap counts:
    woke=1, revs=60, scans=0, obscans=0, anscans=0
    busy=0, freed=0, reactivate=0, deactivate=4732
    pageouts=0, pending=0, nswget=0
    nswapdev=1, swpgavail=262143
    swpages=262143, swpginuse=0, swpgonly=0, paging=0
db{0}> ps /l
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
12320    1 2   0         0   ffffd5001df1b1a0                 sh
888      1 2   0         0   ffffd5002bf084c0                 sh
776      1 3   1        80   ffffd5001ab29b00                 sh wait
27334    1 3   1        80   ffffd50019a84aa0                 sh pipe_rd
5001     1 3   1         0   ffffd5001d9a40c0               find &arc_reclaim_wai
27852    1 3   1        80   ffffd5001d132340           postdrop netio
19403    1 3   0        80   ffffd5001d159040           sendmail pipe_rd
22908    1 3   1        80   ffffd5002bf05780                tee pipe_rd
10405    1 3   1        80   ffffd5001d09f700                 sh wait
18520    1 3   1        80   ffffd5009e5d1960                 sh wait
22037    1 3   1        80   ffffd5001d48b4a0               cron pipe_rd
2334     1 3   0        80   ffffd500b7480980             pickup kqueue
697      1 3   1        80   ffffd50018e1a580              getty ttyraw
698      6 5   0         0   ffffd5001f79e2c0            chronyd
698      1 3   0        80   ffffd5001f886740            chronyd select
534      1 3   0        80   ffffd5001f7b22e0               cron nanoslp
656      1 3   0        80   ffffd5001e8316a0              inetd kqueue
583      1 3   1        80   ffffd5001f79eb40               qmgr kqueue
605      1 3   0        80   ffffd5001f79e700             master kqueue
367      1 3   1        80   ffffd5001f7d26e0               sshd select
371      1 3   0        80   ffffd5001ebf86c0             powerd kqueue
306      1 3   1        80   ffffd5001ebf8b00                 sh wait
183      1 3   1        80   ffffd5001ebf8280            syslogd kqueue
1        1 3   1        80   ffffd5001975f1e0               init wait
0     1207 3   0       200   ffffd5001d9a4940    poolthread/-1@1 poolthrd
0     1205 3   0       200   ffffd5001d132780    poolthread/-1@1 poolthrd
0     1199 3   0       200   ffffd5001ab136a0    poolthread/-1@1 poolthrd
0     1187 3   0       200   ffffd5001ab13ae0    poolthread/-1@1 poolthrd
0     1186 3   0       200   ffffd50029cde320    metaslab_group_ &tq->tq_cv
0     1172 3   0       200   ffffd5002bf06040    zio_read_intr_2 &tq->tq_cv
0     1170 3   0       200   ffffd5001d9bb140    zio_read_intr_2 &tq->tq_cv
0     1169 3   0       200   ffffd5001e7ff680    zio_read_intr_2 &tq->tq_cv
0     1166 3   0       200   ffffd5001d9bc160    zio_read_intr_2 &tq->tq_cv
0     1163 3   0       200   ffffd5001ab2e2a0    zio_read_intr_2 &tq->tq_cv
0     1162 3   0       200   ffffd5001d9bc5a0    metaslab_group_ &tq->tq_cv
0     1160 3   0       200   ffffd5001d132bc0    metaslab_group_ &tq->tq_cv
0     1158 3   0       200   ffffd5001df1ca40    metaslab_group_ &tq->tq_cv
0     1153 3   1       200   ffffd5001dd0f660    zio_write_issue tstile
0     1152 3   0       200   ffffd501479865a0    zio_read_intr_0 &tq->tq_cv
0     1151 3   0       200   ffffd501479869e0    metaslab_group_ &tq->tq_cv
0     1150 5   1       200   ffffd5001d9a4500           (zombie)
0     1143 3   0       200   ffffd5002bf08900    zio_read_intr_6 &tq->tq_cv
0     1142 3   0       200   ffffd50019a29a80    zio_read_intr_6 &tq->tq_cv
0     1141 3   0       200   ffffd5001d9a8100    zio_read_intr_6 &tq->tq_cv
0     1140 3   0       200   ffffd5001d9a70e0    zio_read_intr_5 &tq->tq_cv
0     1139 3   0       200   ffffd5009801c0c0    zio_read_intr_5 &tq->tq_cv
0     1138 3   0       200   ffffd5001d48b8e0    zio_read_intr_5 &tq->tq_cv
0     1136 3   0       200   ffffd5001f886b80    zio_read_intr_1 &tq->tq_cv
0     1135 3   0       200   ffffd5001d9a10a0    zio_read_intr_1 &tq->tq_cv
0     1134 3   0       200   ffffd5001f7d2b20    zio_read_intr_1 &tq->tq_cv
0     1133 3   0       200   ffffd5001f886300    zio_read_intr_1 &tq->tq_cv
0     1132 3   0       200   ffffd5001f7b2b60    zio_read_intr_1 &tq->tq_cv
0     1131 3   0       200   ffffd5001d9a04c0    zio_read_intr_1 &tq->tq_cv
0      139 3   1       200   ffffd5001e5bb200    txg_sync_thread &zio->io_cv
0      138 3   1       200   ffffd5001df6ca60    txg_quiesce_thr &tx->tx_quiesce_
0       83 3   0       200   ffffd5001d0a7740           vdevsync vdevsync
0       82 3   1       200   ffffd5001d0a7b80           vdevsync vdevsync
0       81 3   1       200   ffffd5001d0a3720           vdevsync vdevsync
0       80 3   0       200   ffffd5001ab296c0           vdevsync vdevsync
0       77 3   0       200   ffffd5001d0a32e0           vdevsync vdevsync
0       76 3   0       200   ffffd5001d0a7300    pooloverseer/-1 poolover
0       75 3   1       200   ffffd5001d0a3b60    pooloverseer/-1 poolover
0       56 3   1       200   ffffd5001ab13260        spa_deadman spa_deadman
0       55 3   0       200   ffffd50019ab3ac0    l2arc_feed_thre &l2arc_feed_thr_
0       54 3   1       200   ffffd50019ab3680    dbuf_evict_thre &dbuf_evict_cv
0       53 3   1       200   ffffd50019a84660    pooloverseer/-1 poolover
0       52 3   1       200   ffffd50019a84220    arc_reclaim_thr xclocv
0       51 3   0       200   ffffd50019ab3240    pooloverseer/-1 poolover
0       50 3   0       200   ffffd50018e1a140            physiod physiod
0       49 3   1       200   ffffd50018e19120          pooldrain xclocv
0       48 3   1       200   ffffd50018e19560            ioflush &arc_reclaim_wai
0    >  47 7   0       200   ffffd50018e199a0           pgdaemon
0       44 3   0       200   ffffd50018e16540           aiodoned aiodoned
0       43 3   1       200   ffffd50018e1a9c0        xen_balloon xen_balloon
0       42 3   1       200   ffffd5001975f620             npfgc0 xchicv
0       41 3   1       200   ffffd50018e3aa40            rt_free rt_free
0       40 3   1       200   ffffd50018e3a600              unpgc unpgc
0       39 3   1       200   ffffd50018e3a1c0    key_timehandler key_timehandler
0       38 3   1       200   ffffd50018e35a20    icmp6_wqinput/1 icmp6_wqinput
0       37 3   0       200   ffffd50018e355e0    icmp6_wqinput/0 icmp6_wqinput
0       36 3   0       200   ffffd50018e351a0          nd6_timer nd6_timer
0       35 3   1       200   ffffd50018e2ba00     icmp_wqinput/1 icmp_wqinput
0       34 3   0       200   ffffd50018e2b5c0     icmp_wqinput/0 icmp_wqinput
0       33 3   1       200   ffffd50018e2b180           rt_timer rt_timer
0       32 3   1       200   ffffd50018e1b160        vmem_rehash vmem_rehash
0       31 3   1       200   ffffd50018e1b9e0             xenbus rdst
0       30 3   0       200   ffffd50018e1b5a0           xenwatch evtsq
0       20 3   1       200   ffffd50018e16100            xcall/1 xcall
0       19 1   1       200   ffffd50018e15960          softser/1
0       18 1   1       200   ffffd50018e15520          softclk/1
0       17 1   1       200   ffffd50018e150e0          softbio/1
0       16 1   1       200   ffffd50018e04940          softnet/1
0    >  15 7   1       201   ffffd50018e04500             idle/1
0       14 3   0       200   ffffd50018e040c0         pmfsuspend pmfsuspend
0       13 3   0       200   ffffd50018dfe920           pmfevent pmfevent
0       12 3   0       200   ffffd50018dfe4e0         sopendfree sopendfr
0       11 3   0       200   ffffd50018dfe0a0           nfssilly nfssilly
0       10 3   0       200   ffffd50015d26900            cachegc cachegc
0        9 3   1       200   ffffd50015d264c0             vdrain vdrain
0        8 3   0       200   ffffd50015d26080          modunload mod_unld
0        7 2   0       200   ffffd50015d1c8e0            xcall/0
0        6 1   0       200   ffffd50015d1c4a0          softser/0
0        5 1   0       200   ffffd50015d1c060          softclk/0
0        4 1   0       200   ffffd50015d198c0          softbio/0
0        3 1   0       200   ffffd50015d19480          softnet/0
0        2 1   0       201   ffffd50015d19040             idle/0
0        1 3   1       200   ffffffff8065f800            swapper uvm
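
For scale, the page counts in the show uvmexp output can be converted to sizes using the reported pagesize of 4096. The following is only a minimal sketch of that arithmetic; the dictionary keys and the helper name pages_to_gib are made up for illustration and are not part of any NetBSD tool. On these numbers, roughly 8.4 GiB of the roughly 11.6 GiB of managed pages is free, and poolpages accounts for roughly 2.3 GiB.

```python
PAGE_SIZE = 4096  # pagesize reported by "show uvmexp"

# Page counts copied from the "show uvmexp" output above.
# The key names are illustrative labels, not kernel field names.
pages = {
    "total": 3047796,
    "free": 2209566,
    "poolpages": 609524,
    "active": 8691,
    "inactive": 4732,
}


def pages_to_gib(count, page_size=PAGE_SIZE):
    """Convert a page count to GiB (hypothetical helper)."""
    return count * page_size / 2**30


sizes_gib = {name: pages_to_gib(count) for name, count in pages.items()}

for name, gib in sizes_gib.items():
    print(f"{name:>10}: {pages[name]:>8} pages = {gib:6.2f} GiB")
```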

Stephen


