Thread (20 messages) 20 messages, 6 authors, 2024-01-18

Re: [RFC PATCH net-next v5 2/2] net: add netmem to skb_frag_t

From: Mina Almasry <hidden>
Date: 2024-01-15 23:23:48
Also in: lkml

On Mon, Jan 15, 2024 at 1:37 AM Yunsheng Lin [off-list ref] wrote:
On 2024/1/12 23:35, Mina Almasry wrote:
quoted
On Fri, Jan 12, 2024 at 3:51 AM Yunsheng Lin [off-list ref] wrote:
quoted
On 2024/1/12 8:34, Mina Almasry wrote:
quoted
On Thu, Jan 11, 2024 at 4:45 AM Yunsheng Lin [off-list ref] wrote:
quoted
On 2024/1/9 9:14, Mina Almasry wrote:

...
quoted
+             if (WARN_ON_ONCE(!skb_frag_page(&skb_shinfo(skb)->frags[0]))) {
I am really hate to bring it up again.
If you are not willing to introduce a new helper,
I'm actually more than happy to add a new helper like:

static inline netmem_ref skb_frag_netmem();

For future callers to obtain frag->netmem to use the netmem_ref directly.

What I'm hung up on is really your follow up request:

"Is it possible to introduce something like skb_frag_netmem() for
netmem? so that we can keep most existing users of skb_frag_page()
unchanged and avoid adding additional checking overhead for existing
users."

With this patchseries, skb_frag_t no longer has a page pointer inside
of it, it only has a netmem_ref. The netmem_ref is currently always a
page, but in the future may not be a page. Can you clarify how we keep
skb_frag_page() unchanged and without checks? What do you expect
skb_frag_page() and its callers to do? We can not assume netmem_ref is
always a struct page. I'm happy to implement a change but I need to
understand it a bit better.

You did not answer my question that I asked here, and ignoring this
question is preventing us from making any forward progress on this
discussion. What do you expect or want skb_frag_page() to do when
there is no page in the frag?
I would expect it to do nothing.
I don't understand. skb_frag_page() with an empty implementation just
results in a compiler error as the function needs to return a page
pointer. Do you actually expect skb_frag_page() to unconditionally
cast frag->netmem to a page pointer? That was explained as
unacceptable over and over again by Jason and Christian as it risks
casting devmem to page; completely unacceptable and will get nacked.
Do you have a suggestion of what skb_frag_page() should do that will
not get nacked by mm?
IMHO, the caller is expected to only call skb_frag_page() for the frag
with normal page, for example, doing some 'readable' checking before
callingskb_frag_page(), or not doing any checking at all if it is called
in some existing driver not support devmem.
You did not specify what you actually want skb_frag_page() to do, but
IMO leaving it up to the programmer to guess/decide if it's safe to
use with no checking in the function is very error prone and hacky, as
the programmer is obviously liable to assume the wrong thing.

All skb_frag_page() callers (which are mainly core networking
actually) would need to do a check anyway to ensure the compiler type
safety, set as a hard requirement by mm & dmabuf maintainers, and at
that point having the check in the function makes sense.
quoted
quoted
There are still many existing places still not expecting or handling
skb_frag_page() returning NULL, mostly those are in the drivers not
supporting devmem,
As of this series skb_frag_page() cannot return NULL.

In the devmem series, all core networking stack places where
skb_frag_page() may return NULL are audited.

skb_frag_page() returning NULL in a driver that doesn't support devmem
is not possible. The driver is the one that creates the devmem frags
Is it possible a netdev supporting devmen and a netdev not supporting
devmen are put into the same bridge, and a rx skb from netdev supporting
devmen is forwarded to netdev not supporting devmem?

br_forward() or ip_forward() may be the place that might do this kind
of forwarding?
I'd need to check, but even at that point it sounds to me like this
forwarding code would need to check handle devmem or disabled for
devmem, not a global audit of drivers.
quoted
in the first place. When the driver author adds devmem support, they
should also add support for skb_frag_page() returning NULL.
quoted
what's the point of adding the extral overhead for
those driver?
There is no overhead with static branches. The checks are no-op unless
the user enables devmem, at which point the devmem connections see no
no-op is still some cpu instruction that will be replaced by some jumping
instruction when devmem is enabled, right?
No, there is no overhead with static branches. There seems to be a
misunderstanding of how they work causing this massive overestimation
of the 'overhead' of the checks. Branches in code are free or almost
free when they are correctly predicted by the compiler:

if (unlikely(check()))
   do_work();
continue();

Is free or almost free when check() is false, as the compiler assumes
check() is false and starts executing continue() immediately. There is
a minor perf hit when the branch is mispredicted as the CPU needs to
start executing do_work() after the check() result is available.
Static branches enable us to at runtime decide which branch the
compiler predicts. We could put devmem processing to always be in the
unlikely path without static branches, but that's not necessary.
Optimizing devmem processing with static branches when devmem is
enabled seems the right choice here. That's my understanding at least.
Maybe Alexander had better words for those kinds of overhead:
"The overhead may not seem like much, but when you are having to deal
with it per page and you are processing pages at millions per second it
can quickly start to add up."
This IMO would be relevant if we're adding (significant, or
measureable) overhead to page processing, but with static branches or
putting devmem in the unlikely path we are not.
quoted
overhead and non-devmem connections will see minimal overhead that I
suspect will not reproduce any perf issue. If the user is not fine
with that they can simply not enable devmem and continue to not
experience any overhead.
quoted
The networking stack should forbid skb with devmem frag being forwarded
to drivers not supporting devmem yet. I am sure if this is done properly
in your patchset yet? if not, I think you might to audit every existing
drivers handling skb_frag_page() returning NULL correctly.
There is no audit required. The devmem frags are generated by the
driver and forwarded to the TCP stack, not the other way around.
quoted
quoted

-- 
Thanks,
Mina
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help