tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: VOP_GETATTR: locking protocol change proposal
hi,
> On Nov 22, 2011, at 2:17 AM, YAMAMOTO Takashi wrote:
>
>> hi,
>>
>>> On Fri, Nov 18, 2011 at 06:31:21AM +0000, YAMAMOTO Takashi wrote:
>>>>>> postgresql assumes instant lseek(SEEK_END) to get the size of
>>>>>> their heap files.
>>>>>>
>>>>>> http://rhaas.blogspot.com/2011/11/linux-lseek-scalability.html
>>>>>>
>>>>>> as fsync etc keeps the vnode lock during i/o, it might cause severe
>>>>>> performance regression.
>>>>>
>>>>> I wonder if it is worth having a separate VOP for that, which would
>>>>> retrieve a subset of vattr without lock held. There are potentially
>>>>> more uses in the tree.
>>>>
>>>> while it's good to have VOP_GETATTR take a mask to specify a set of
>>>> requested attributes, i don't think it's worth to have a serparete VOP
>>>> for this. (PR/30250 is related)
>>>>
>>>> IMO we should make unlocked VOP calls safe (against revoke and umount -f)
>>>> and put VOP_GETATTR locking back into filesystem internal.
>>>
>>> ad looked into that and concluded it was prohibitively expensive; too
>>> much contention on every entry/exit.
>>
>> what's "it"?
>>
>> i agree naive reference counting on each VOP calls can be too expensive.
>
> If "it" looked like the attached sketch, are two atomic adds per call too
> expensive? Could even make it cheaper if VOP_LOCK() only takes the pre-op
> block, VOP_UNLOCK() only takes the post-op block and VOPs with a locked vnode
> take none of them making VOP_LOCK, VOP_XXX ... VOP_UNLOCK sequences atomic
> with regard to revoke.
it doesn't seem to be much cheaper than a normal rwlock, which would involve
the same number of atomic ops in uncontended cases.
YAMAMOTO Takashi
>
> --
> Juergen Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig
> (Germany)
>
> #define VWANTREVOKE 0x40000000
>
> revoke()
> ...
> oc = atomic_add_32_nv(&vp->v_opcnt, VWANTREVOKE);
> while (oc != VWANTREVOKE) {
> cv_wait(&vp->v_cv, vp->v_interlock);
> oc = vp->v_opcnt;
> }
> ...
> atomic_add_32(&vp->v_opcnt, -VWANTREVOKE);
> cv_broadcast(&vp->v_cv);
>
>
> vop_xxx()
> ...
> /* pre-op */
> for (;;) {
> oc = atomic_add_32_nv(&vp->v_opcnt, 1);
> if (predict_true((oc & VWANTREVOKE) == 0)
> break;
> oc = atomic_add_32_nv(&vp->v_opcnt, -1);
> mutex_enter(&vp->v_interlock);
> while (oc & VWANTREVOKE) {
> cv_wait(&vp->v_cv, vp->v_interlock);
> oc = vp->v_opcnt;
> }
> mutex_exit(&vp->v_interlock);
> }
> ...
> VCALL(...)
> ...
> /* post-op */
> oc = atomic_add_32_nv(&vp->v_opcnt, -1);
> if (predict_false(oc == VWANTREVOKE)) {
> mutex_enter(&vp->v_interlock);
> cv_broadcast(&vp->v_cv);
> mutex_exit(&vp->v_interlock);
> }
Home |
Main Index |
Thread Index |
Old Index