Thread (31 messages) 31 messages, 7 authors, 2011-03-30

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

From: Andrey Kuzmin <hidden>
Date: 2011-03-25 11:13:38
Also in: lkml

On Fri, Mar 25, 2011 at 6:39 AM, Steven Rostedt [off-list ref] w=
rote:
On Thu, Mar 24, 2011 at 10:41:51AM +0100, Tejun Heo wrote:
quoted
Adaptive owner spinning used to be applied only to mutex_lock(). =C2=
=A0This
quoted
patch applies it also to mutex_trylock().

btrfs has developed custom locking to avoid excessive context switch=
es
quoted
in its btree implementation. =C2=A0Generally, doing away with the cu=
stom
quoted
implementation and just using the mutex shows better behavior;
however, there's an interesting distinction in the custom implementi=
on
quoted
of trylock. =C2=A0It distinguishes between simple trylock and tryspi=
n,
quoted
where the former just tries once and then fail while the latter does
some spinning before giving up.

Currently, mutex_trylock() doesn't use adaptive spinning. =C2=A0It t=
ries
quoted
just once. =C2=A0I got curious whether using adaptive spinning on
mutex_trylock() would be beneficial and it seems so, for btrfs anywa=
y.
quoted
The following results are from "dbench 50" run on an opteron two
socket eight core machine with 4GiB of memory and an OCZ vertex SSD.
During the run, disk stays mostly idle and all CPUs are fully occupi=
ed
quoted
and the difference in locking performance becomes quite visible.

SIMPLE is with the locking simplification patch[1] applied. =C2=A0i.=
e. it
quoted
basically just uses mutex. =C2=A0SPIN is with this patch applied on =
top -
quoted
mutex_trylock() uses adaptive spinning.

=C2=A0 =C2=A0 =C2=A0 =C2=A0 USER =C2=A0 SYSTEM =C2=A0 SIRQ =C2=A0 =C2=
=A0CXTSW =C2=A0THROUGHPUT
quoted
=C2=A0SIMPLE 61107 =C2=A0354977 =C2=A0 =C2=A0217 =C2=A08099529 =C2=A0=
845.100 MB/sec
quoted
=C2=A0SPIN =C2=A0 63140 =C2=A0364888 =C2=A0 =C2=A0214 =C2=A06840527 =
=C2=A0879.077 MB/sec
quoted
On various runs, the adaptive spinning trylock consistently posts
higher throughput. =C2=A0The amount of difference varies but it outp=
erforms
quoted
consistently.

In general, using adaptive spinning on trylock makes sense as tryloc=
k
quoted
failure usually leads to costly unlock-relock sequence.

[1] http://article.gmane.org/gmane.comp.file-systems.btrfs/9658

Signed-off-by: Tejun Heo <tj@kernel.org>
I'm curious about the effects that this has on those places that do:

again:
=C2=A0 =C2=A0 =C2=A0 =C2=A0mutex_lock(A);
=C2=A0 =C2=A0 =C2=A0 =C2=A0if (mutex_trylock(B)) {
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0mutex_unlock(A=
);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0goto again;


Where the normal locking order is:
=C2=A0B -> A

If another location does:

=C2=A0 =C2=A0 =C2=A0 =C2=A0mutex_lock(B);
=C2=A0 =C2=A0 =C2=A0 =C2=A0[...]
=C2=A0 =C2=A0 =C2=A0 =C2=A0mutex_lock(A);

But another process has A already, and is running, it may spin waitin=
g
for A as A's owner is still running.

But now, mutex_trylock(B) becomes a spinner too, and since the B's ow=
ner
is running (spinning on A) it will spin as well waiting for A's owner=
 to
release it. Unfortunately, A's owner is also spinning waiting for B t=
o
release it.

If both A and B's owners are real time tasks, then boom! deadlock.
Turning try_lock into indefinitely spinning one breaks its semantics,
so deadlock is to be expected. But what's wrong in this scenario if
try_lock spins a bit before giving up?

Regards,
Andrey
-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs=
" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.ht=
ml
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help