NetBSD-Bugs archive
Re: kern/53096: netbsd-8 crash on heavy disk I/O
On Sun, Mar 18, 2018 at 04:50:01PM +0000, J. Hannken-Illjes wrote:
> The following reply was made to PR kern/53096; it has been noted by GNATS.
>
> From: "J. Hannken-Illjes" <hannken%eis.cs.tu-bs.de@localhost>
> To: gnats-bugs%NetBSD.org@localhost
> Cc:
> Subject: Re: kern/53096: netbsd-8 crash on heavy disk I/O
> Date: Sun, 18 Mar 2018 17:45:41 +0100
>
> The backtrace is a bit misleading; the actual call chain is:
>
> sys_chdir() -> vrele() -> vrelel() -> vstate_assert_change() -> vnpanic()
>
> This matches the panic from dmesg:
>
> ...
> cpu 0: ucode 0x1a->0x29
> cpu 1: ucode 0x1a->0x29
> cpu 2: ucode 0x1a->0x29
> cpu 3: ucode 0x1a->0x29
> vnode 0xfffffe82137bde70 flags 0x30<MPSAFE,LOCKSWORK>
> tag VT_UFS(1) type VDIR(2) mount 0xfffffe823dbb2008 typedata 0x0
> usecount 1 writecount 0 holdcount 1
> size 200 writesize 200 numoutput 0
> data 0xfffffe8213cce900 lock 0xfffffe82137bdfa0
> state BLOCKED key(0xfffffe823dbb2008 8) b1 c8 3a 00 00 00 00 00
> lrulisthd 0xffffffff814c6400
> tag VT_UFS, ino 3852465, on dev 0, 0 flags 0x0, nlink 3
> mode 040755, owner 1001, group 0, size 512
> panic: BLOCKED to LOADED with usecount 2 at vrelel:783
>
> Here vrelel() is:
>
> 767 VSTATE_CHANGE(vp, VS_LOADED, VS_BLOCKED);
> 768 mutex_exit(vp->v_interlock);
> ...
> 778 recycle = false;
> 779 VOP_INACTIVE(vp, &recycle);
> 780 if (!recycle)
> 781         VOP_UNLOCK(vp);
> 782 mutex_enter(vp->v_interlock);
> 783 VSTATE_CHANGE(vp, VS_BLOCKED, VS_LOADED);
>
> and VSTATE_CHANGE() expands to vstate_assert_change(), which is:
>
> 315 KASSERTMSG(mutex_owned(vp->v_interlock), "at %s:%d", func, line);
>
> 328 if ((from == VS_BLOCKED || to == VS_BLOCKED) && vp->v_usecount != 1)
> 329         vnpanic(vp, "%s to %s with usecount %d at %s:%d",
>
> So the usecount of a blocked vnode changed away from 1 while the
> interlock was held: it is "2" on the call to vnpanic() and "1" when
> vnpanic() prints the vnode.
>
> Since vcache_vget() and vcache_tryvget() either error out or wait if the
> current state is BLOCKED, the cause could be a vref() without a prior
> reference.
>
> Please try the attached patch to see if one of these assertions fires.
>
> diff -r 13173af16202 -r 0a76936d2ed0 sys/kern/vfs_vnode.c
> --- sys/kern/vfs_vnode.c
> +++ sys/kern/vfs_vnode.c
> @@ -670,11 +670,22 @@ static inline bool
>  vtryrele(vnode_t *vp)
>  {
>  	u_int use, next;
> +	vnode_impl_t *vip = VNODE_TO_VIMPL(vp);
>
>  	for (use = vp->v_usecount;; use = next) {
>  		if (use == 1) {
>  			return false;
>  		}
> +
> +		membar_enter();
> +		if (vip->vi_state == VS_BLOCKED) {
> +			mutex_enter(vp->v_interlock);
> +			if (vip->vi_state == VS_BLOCKED) {
> +				vnpanic(vp, "vtryrele on BLOCKED vnode");
> +			}
> +			mutex_exit(vp->v_interlock);
> +		}
> +
>  		KASSERT(use > 1);
>  		next = atomic_cas_uint(&vp->v_usecount, use, use - 1);
>  		if (__predict_true(next == use)) {
> @@ -865,6 +876,16 @@ vrele_async(vnode_t *vp)
>  void
>  vref(vnode_t *vp)
>  {
> +	vnode_impl_t *vip = VNODE_TO_VIMPL(vp);
> +
> +	membar_enter();
> +	if (vip->vi_state == VS_BLOCKED) {
> +		mutex_enter(vp->v_interlock);
> +		if (vip->vi_state == VS_BLOCKED) {
> +			vnpanic(vp, "vref on BLOCKED vnode");
> +		}
> +		mutex_exit(vp->v_interlock);
> +	}
>
>  	KASSERT(vp->v_usecount != 0);
>
Should I apply the patch to the current netbsd-8 or to the version on
which I could reproduce the crashes? I ask because I've updated a
couple of times since my report and haven't seen the crashes since
then.
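
For anyone following the thread, the rule that fired can be modelled in
user space roughly as below. This is only a sketch: struct fake_vnode,
state_change_ok() and state_change() are hypothetical names, not kernel
code. The check that the current state matches "from" is an assumption,
since the quoted excerpt does not show it. There is no locking, and the
model returns false where the kernel would call vnpanic().

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the two states in the excerpt. */
enum vstate { VS_LOADED, VS_BLOCKED };

/* Hypothetical, much-reduced vnode: only the fields the rule looks at. */
struct fake_vnode {
	enum vstate state;
	unsigned int usecount;
};

/*
 * Mirrors the condition quoted from vstate_assert_change(): a transition
 * into or out of VS_BLOCKED is only legal while usecount == 1.
 */
static bool
state_change_ok(const struct fake_vnode *vp, enum vstate from, enum vstate to)
{
	/* Assumption: the current state must match "from". */
	if (vp->state != from)
		return false;
	if ((from == VS_BLOCKED || to == VS_BLOCKED) && vp->usecount != 1)
		return false;
	return true;
}

/* Apply the transition if the rule allows it; report whether it was applied. */
static bool
state_change(struct fake_vnode *vp, enum vstate from, enum vstate to)
{
	if (!state_change_ok(vp, from, to))
		return false;
	vp->state = to;
	return true;
}
```

In this model, a stray extra reference taken while the vnode is BLOCKED
makes the BLOCKED to LOADED transition fail, which corresponds to the
"BLOCKED to LOADED with usecount 2" panic at vrelel:783.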
--
Roy Bixler <rcbixler%nyx.net@localhost>
"The fundamental principle of science, the definition almost, is this: the
sole test of the validity of any idea is experiment."
-- Richard P. Feynman