On Jan 7, 2014, at 5:29 AM, David Holland <dholland-tech%netbsd.org@localhost> wrote: > On Tue, Dec 31, 2013 at 11:34:27AM +0100, J. Hannken-Illjes wrote: >>>> The layered file systems hashlists currently have to work on locked >>>> vnodes as the vnode operations returning vnodes return them locked. >>>> >>>> This leads to some very dirty hacks -- see layer_node_find() from >>>> sys/miscfs/genfs/layer_subr.c for example. >>>> >>>> The best solution is to change these vnode operations to return >>>> unlocked (and referenced) vnodes. >>>> >>>> The attached diff implements this change for operations create, mknod, >>>> mkdir and symlink. It passes the atf test suite and some file system >>>> stress tests I use here. >>>> >>>> Comments or objections anyone? >>> >>> My first thought is that I really don't think it's a good idea to >>> change important semantic properties (particularly locking) of calls >>> from what's effectively a public API without changing the signatures >>> too so old code won't compile. >> >> Ok. Will add an API version argument to vnode_if.src so >> >> vop_mkdir { >> + VERSION 1 >> IN LOCKED=YES WILLPUT struct vnode *dvp; >> >> will rename vop_mkdir_desc to vop_mkdir1_desc and then change all >> file systems to use the new desc. > > While on the whole I suppose this is probably a good thing, it feels > inelegant. Also, the ops table is typically a long way from the code > that actually needs to be changed. > > Wouldn't it be better to attach the version number to the name of the > args structure? Then it will fail to compile in the VOP function that > needs attention. That is, > > int > ufs_mkdir(void *v) > { > struct vop_mkdir_args /* { > struct vnode *a_dvp; > struct vnode **a_vpp; > struct componentname *a_cnp; > struct vattr *a_vap; > } */ *ap = v; > : > : > } > > would need to be changed to e.g. > > int > ufs_mkdir(void *v) > { > struct vop_mkdir_args_v2 /* { > struct vnode *a_dvp; > struct vnode **a_vpp; > struct componentname *a_cnp; > struct vattr *a_vap; > } */ *ap = v; > : > : > } > > (also, while this is minor I think I'd rather have vop_mkdir_args_v2 > and/or vop_mkdir_desc_v2 rather than vop_mkdir2_args and > vop_mkdir2_desc. The version doesn't need to be part of the operation > name.) Looks very good -- changed vnode_if.sh to create xxx_v2_args. Diff attached. >>> Also, since what you're doing is basically unlocking before return in >>> every implementation, and then relocking after return at every call >>> site, it's possible to do one op at a time; it seems like that would >>> be a good idea in case unexpected problems crop up. >> >> This is not what I'm doing. > > It still seems like a good idea to do one op at a time, or at least do > one op first. This is missing any arguments for the second time and you ignored my arguments after "This is not what I'm doing.". >>> Unlocking and relocking should not itself be a problem since AFAIK >>> there isn't any reason for those locks to be held in the first place; >>> however, I think I'd better go review all the call sites just in case. >>> >>> Plus I think it would be advisable to make this change first, so that >>> in case we do find a site where it matters we can lock the returned >>> object before anyone else can see it: >>> >>> # >>> -#% mkdir dvp L U U >>> +#% mkdir dvp L L L >> >> Where should it matter? VOP_MKDIR is only used from: >> >> external/cddl/osnet/dist/uts/common/fs/zfs/zfs_replay.c:zfs_replay_create() >> sys/kern/vfs_syscalls.c:do_sys_mkdirat() >> sys/nfs/nfs_serv.c:nfsrv_mkdir() >> sys/fs/union/union_vnops.c:union_mkdir() >> sys/fs/union/union_subr.c:union_mkshadow() >> sys/rump/librump/rumpvfs/rumpvnode_if.c:RUMP_VOP_MKDIR() >> >> and I don't see any problems here. > > I have been putting off answering this because I ought to examine all > these, and I haven't managed to do so yet but figured I ought to > answer the rest. An updated diff at http://www.netbsd.org/~hannken/vnode-pass2a-2.diff Diffs to vnode_if.{sh,src} attached. -- J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)
Attachment:
vnode-pass2a-new.diff
Description: Binary data