Thread (22 messages) 22 messages, 9 authors, 2021-03-24

Re: [Linuxarm] Re: [RFC v2] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc

From: Yunsheng Lin <hidden>
Date: 2021-03-22 00:56:08
Also in: lkml, netdev

On 2021/3/20 2:15, Cong Wang wrote:
On Thu, Mar 18, 2021 at 12:33 AM Yunsheng Lin [off-list ref] wrote:
quoted
On 2021/3/17 21:45, Jason A. Donenfeld wrote:
quoted
On 3/17/21, Toke Høiland-Jørgensen [off-list ref] wrote:
quoted
Cong Wang [off-list ref] writes:
quoted
On Mon, Mar 15, 2021 at 2:07 PM Jakub Kicinski [off-list ref] wrote:
quoted
I thought pfifo was supposed to be "lockless" and this change
re-introduces a lock between producer and consumer, no?
It has never been truly lockless, it uses two spinlocks in the ring
buffer
implementation, and it introduced a q->seqlock recently, with this patch
now we have priv->lock, 4 locks in total. So our "lockless" qdisc ends
up having more locks than others. ;) I don't think we are going to a
right direction...
Just a thought, have you guys considered adopting the lockless MSPC ring
buffer recently introduced into Wireguard in commit:

8b5553ace83c ("wireguard: queueing: get rid of per-peer ring buffers")

Jason indicated he was willing to work on generalising it into a
reusable library if there was a use case for it. I haven't quite though
through the details of whether this would be such a use case, but
figured I'd at least mention it :)
That offer definitely still stands. Generalization sounds like a lot of fun.

Keep in mind though that it's an eventually consistent queue, not an
immediately consistent one, so that might not match all use cases. It
works with wg because we always trigger the reader thread anew when it
finishes, but that doesn't apply to everyone's queueing setup.
Thanks for mentioning this.

"multi-producer, single-consumer" seems to match the lockless qdisc's
paradigm too, for now concurrent enqueuing/dequeuing to the pfifo_fast's
queues() is not allowed, it is protected by producer_lock or consumer_lock.

So it would be good to has lockless concurrent enqueuing, while dequeuing
can be protected by qdisc_lock() or q->seqlock, which meets the "multi-producer,
single-consumer" paradigm.
I don't think so. Usually we have one queue for each CPU so we can expect
each CPU has a lockless qdisc assigned, but we can not assume this in
the code, so we still have to deal with multiple CPU's sharing a lockless qdisc,
and we usually enqueue and dequeue in process context, so it means we could
have multiple producers and multiple consumers.
For lockless qdisc, dequeuing is always within the qdisc_run_begin() and
qdisc_run_end(), so multiple consumers is protected with each other by
q->seqlock .

For enqueuing, multiple consumers is protected by producer_lock, see
pfifo_fast_enqueue() -> skb_array_produce() -> ptr_ring_produce().
I am not sure if lockless MSPC can work with the process context, but
even if not, the enqueuing is also protected by rcu_read_lock_bh(),
which provides some kind of atomicity, so that producer_lock can be
reomved when lockless MSPC is used.
On the other hand, I don't think the problems we have been fixing are the ring
buffer implementation itself, they are about the high-level qdisc
state transitions.

Thanks.

.
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help