Some time ago I wrote about performance problems when doing high -j
build.sh and made few remarks about mutex implementation.
TL;DR for that one was mostly that cas returns the found value, so
explicit re-reads can be avoided. There are also avoidable explicit
barriers (yes, I know about the old Opteron errata).
I had another look and have a remark definitely worth looking at (and
easy to fix). Unfortunately I lost my test jig so can't benchmark.
That said, here it is:
After it is is concluded that lock owner sleeps:
ts = turnstile_lookup(mtx);
/*
* Once we have the turnstile chain interlock, mark the
* mutex has having waiters. If that fails, spin again:
* chances are that the mutex has been released.
*/
if (!MUTEX_SET_WAITERS(mtx, owner)) {
turnstile_exit(mtx);
owner = mtx->mtx_owner;
continue;
}
Note that the lock apart from being free, can be:
1. owned by the same owner, which is now running
In this case the bit is set spuriously and triggers slow path
unlock.
2. owned by a different owner, which is NOT running
Assuming the owner still has the lock and is not running after
turnstile_exit, this causes extra trip.
That said, proposed change is trivial and follows what FreeBSD is doing:
re-read and abort sleep unless the lock is owned by an off-cpu thread.
Note there is a minor optimisation of not setting the bit if already
seen.
That said, the minimal fix would look like this (untested):
ts = turnstile_lookup(mtx);
owner = mtx->mtx_owner;
if (mutex_oncpu(owner) || !MUTEX_SET_WAITERS(mtx, owner))
turnstile_exit(mtx);
owner = mtx->mtx_owner;
continue;
}
If value from cas was to be preserved it would definitely make sense
to re-test failed MUTEX_SET_WAITERS while still holding the turnstile lock.
It looks like rw locks can use a similar treatment.
Fun fact is that implementation on Illumos behaves worse in this regard:
it sets the waiters bit regardless *who* owns the lock (or whether they
are running), but then only goes to sleep if the *original* owner has
the lock.
--