Thread (8 messages) 8 messages, 3 authors, 2022-08-07

Re: [PATCH] net: skb content must be visible for lockless skb_peek() and its variations

From: Kirill Tkhai <hidden>
Date: 2022-08-07 20:18:07

On 01.08.2022 21:45, Kirill Tkhai wrote:
On 01.08.2022 10:39, Eric Dumazet wrote:
quoted
On Mon, Aug 1, 2022 at 9:00 AM Kirill Tkhai [off-list ref] wrote:
quoted
On 01.08.2022 09:52, Paolo Abeni wrote:
quoted
On Sun, 2022-07-31 at 23:39 +0300, Kirill Tkhai wrote:
quoted
From: Kirill Tkhai <redacted>

Currently, there are no barriers, and skb->xxx update may become invisible on cpu2.
In the below example var2 may point to intial_val0 instead of expected var1:

[cpu1]                                       [cpu2]
skb->xxx = initial_val0;
...
skb->xxx = var1;                     skb = READ_ONCE(prev_skb->next);
<no barrier>                         <no barrier>
WRITE_ONCE(prev_skb->next, skb);     var2 = skb->xxx;

This patch adds barriers and fixes the problem. Note, that __skb_peek() is not patched,
since it's a lowlevel function, and a caller has to understand the things it does (and
also __skb_peek() is used under queue lock in some places).

Signed-off-by: Kirill Tkhai <redacted>
---
Hi, David, Eric and other developers,

picking unix sockets code I found this problem,
Could you please report exactly how/where the problem maifests (e.g.
the involved call paths/time sequence)?
I didn't get why call paths in the patch description are not enough for you. Please, explain
what you want.
quoted
quoted
and for me it looks like it exists. If there
are arguments that everything is OK and it's expected, please, explain.
I don't see why such barriers are needed for the locked peek/tail
variants, as the spin_lock pair implies a full memory barrier.
This is for lockless skb_peek() calls and the patch is called in that way :). For locked skb_peek()
this is not needed. I'm not sure we need separate skb_peek() and skb_peek_lockless(). Do we?
We prefer explicit _lockless variants to document the precise points
they are needed.

A new helper (and its initial usage) will clearly point to the problem
you saw in af_unix.
The problem is:

unix_stream_sendmsg()	unix_stream_read_generic()
  skb->len = size;	  skb = skb_peek();
  skb_queue_tail(skb);	  unix_skb_len(skb); <- here we read wrong len
Oh, there are unix_state_lock(). Please, ignore this patch...
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help