Thread (11 messages) 11 messages, 5 authors, 2015-02-10

Re: [PATCH] llist: Fix missing lockless_dereference()

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: 2015-02-10 03:42:10
Also in: lkml, stable

----- Original Message -----
From: "Huang Ying" <redacted>
To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
Cc: "Michael Cree" <redacted>, "Greg KH" <gregkh@linuxfoundation.org>, linux-alpha@vger.kernel.org,
"Richard Henderson" [off-list ref], "Ivan Kokshaysky" [off-list ref], "Matt Turner"
[off-list ref], linux-kernel@vger.kernel.org, "Paul McKenney" [off-list ref], "David Howells"
[off-list ref], "Pranith Kumar" [off-list ref], stable@vger.kernel.org
Sent: Monday, February 9, 2015 8:52:28 PM
Subject: Re: [PATCH] llist: Fix missing lockless_dereference()

Hi, Mathieu,

On Sun, 2015-02-08 at 04:25 +0000, Mathieu Desnoyers wrote:
quoted
----- Original Message -----
quoted
From: "Michael Cree" <redacted>
To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
Cc: "Greg KH" <gregkh@linuxfoundation.org>, linux-alpha@vger.kernel.org,
"Richard Henderson" [off-list ref], "Ivan
Kokshaysky" [off-list ref], "Matt Turner"
[off-list ref], "Huang Ying" [off-list ref],
linux-kernel@vger.kernel.org, "Paul McKenney"
[off-list ref], "David Howells" [off-list ref],
"Pranith Kumar" [off-list ref], stable@vger.kernel.org
Sent: Saturday, February 7, 2015 7:47:29 PM
Subject: Re: [PATCH] llist: Fix missing lockless_dereference()

On Sat, Feb 07, 2015 at 10:30:44PM +0000, Mathieu Desnoyers wrote:
quoted
quoted
On Fri, Feb 06, 2015 at 09:08:21PM -0500, Mathieu Desnoyers wrote:
quoted
A lockless_dereference() appears to be missing in
llist_del_first().
It should only matter for Alpha in practice.
What could one anticipate to be the symptoms of such a missing
lockless_dereference()?
This can trigger corruption of the lockless linked-list, which is
used across a few subsystems. AFAIU, the scenario is as follows.
Please bear with me, because it's been a while since I've read on
the Alpha multi-cache-banks behavior.

The list here would be initially non-empty. Initial state of
new_last->next is unset (newly allocated); IOW: garbage. CPU A
adds a node into the list while CPU B removes a node from the
head of the list.

CPU A                                      CPU B
llist_add_batch()
- Stores to new_last->next
- implicit full mb before cmpxchg makes the
  update to CPU A's cache bank containing
  new_last->next visible to other CPUs
  before CPU A's cache bank update making
  head->first visible to other CPUs.
- cmpxchg updates head->first = new_first
                                           llist_del_first()
                                           - entry = load head->first
                                           -> here, lack of barrier on
                                           Alpha creates a window where
                                              CPU B's cache bank can see
                                              the updated "head->first",
                                              but the cache bank holding
                                              the next value did not
                                              receive the update yet, since
                                              each cache bank have
                                              their own channel, which can
                                              be independently
                                              saturated.
                                           - next = load entry->next
                                           (dereference entry pointer)
                                           - cmpxchg updates head->first =
                                           next
                                             -> can store unset "next"
                                             value into head->first, thus
                                                corrupting the linked list.
If my understanding were correct, cmpxchg will imply a full mb before
and after it, so that there is a mb between load head->first in cmpxchg
and load entry->next.  If so, the memory barrier is only needed before
the loop.
Yes, indeed, and by using lockless_dereference(), this is
what we end up doing.

FWIW, the reason why I moved smp_read_barrier_depends() into
the loop was to issue it after the check for NULL pointer,
assuming that getting a NULL pointer was a relatively
frequent case compared to a failing cmpxchg. But we're
talking about very minor optimisations compared to the
upside of lockless_dereference() making the code easier
to understand.

Thanks,

Mathieu
Best Regards,
Huang, Ying
quoted
quoted
The Alpha kernel is behaving pretty well provided one builds a machine
specific kernel and UP.  When running an SMP kernel some packages
(most notably the java runtime, but there are a few others) occasionally
lock up in a pthread call --- could be a problem in libc rather then the
kernel.
Are those lockups always occasional, or you have ways to reproduce them
frequently with stress-tests ?

Thanks,

Mathieu
quoted
quoted
quoted
Meta-comment, do we really care about Alpha anymore?  Is it still
consered an "active" arch we support?
There are a few of us still running recent kernels on Alpha.  I am
maintaining the unofficial Debian alpha port at debian-ports, and the
Debian popcon shows about 10 installations of Debian Alpha.

Cheers
Michael.
-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help