Thread (19 messages) 19 messages, 6 authors, 2016-03-29

Re: [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

From: Lazaros Koromilas <hidden>
Date: 2016-03-28 15:48:08

Hi Olivier,

We could have two threads (running on different cores in the general
case) that both succeed the cmpset operation. In the dequeue path,
when n == 0, then cons_next == cons_head, and cmpset will always
succeed. Now, if they both see an old r->cons.tail value from a
previous dequeue, they can get stuck in the while
(unlikely(r->cons.tail != cons_head)) loop. I tried, however, to
reproduce (without the patch) and it seems that there is still a
window for deadlock.

I'm pasting some debug output below that shows two processes' state.
It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and
one sender doing mp_enqueue(32) on the same ring.

gdb --args ./ring-test -l0 --proc-type=primary
gdb --args ./ring-test -l1 --proc-type=secondary
gdb --args ./ring-test -l2 --proc-type=secondary -- tx

This is what I would usually see, process 0 and 1 both stuck at the same state:

663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 0
(gdb) p r->cons.tail
$2 = 576416
(gdb) p cons_head
$3 = 576448
(gdb) p cons_next
$4 = 576448

But now I managed to get the two processes stuck at this state too.

process 0:
663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 32
(gdb) p r->cons.tail
$2 = 254348832
(gdb) p cons_head
$3 = 254348864
(gdb) p cons_next
$4 = 254348896

proccess 1:
663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 32
(gdb) p r->cons.tail
$2 = 254348832
(gdb) p cons_head
$3 = 254348896
(gdb) p cons_next
$4 = 254348928

I haven't been able to trigger this with the patch so far, but it
should be possible.

Lazaros.

On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz [off-list ref] wrote:
Hi Lazaros,

On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
quoted
Issuing a zero objects dequeue with a single consumer has no effect.
Doing so with multiple consumers, can get more than one thread to succeed
the compare-and-set operation and observe starvation or even deadlock in
the while loop that checks for preceding dequeues.  The problematic piece
of code when n = 0:

    cons_next = cons_head + n;
    success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);

The same is possible on the enqueue path.
Just a question about this patch (that has been applied). Thomas
retitled the commit from your log message:

  ring: fix deadlock in zero object multi enqueue or dequeue
  http://dpdk.org/browse/dpdk/commit/?id=d0979646166e

I think this patch does not fix a deadlock, or did I miss something?

As explained in the following links, the ring may not perform well
if several threads running on the same cpu use it:

  http://dpdk.org/ml/archives/dev/2013-November/000714.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001070.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001162.html
  http://dpdk.org/ml/archives/dev/2015-July/020659.html

A deadlock could occur if the threads running on the same core
have different priority.

Regards,
Olivier
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help