Thread (32 messages) 32 messages, 3 authors, 2024-07-11

Re: [PATCH net-next v16 04/13] netdev: netdevice devmem allocator

From: Mina Almasry <hidden>
Date: 2024-07-10 22:45:35
Also in: bpf, dri-devel, linux-alpha, linux-arch, linux-doc, linux-kselftest, linux-media, linux-mips, lkml, netdev, sparclinux

On Wed, Jul 10, 2024 at 12:55 PM Jakub Kicinski [off-list ref] wrote:
On Wed, 10 Jul 2024 12:29:58 -0700 Mina Almasry wrote:
quoted
On Wed, Jul 10, 2024 at 9:37 AM Jakub Kicinski [off-list ref] wrote:
quoted
On Wed, 10 Jul 2024 00:17:37 +0000 Mina Almasry wrote:
quoted
+     net_devmem_dmabuf_binding_get(binding);
Why does every iov need to hold a ref? pp holds a ref and does its own
accounting, so it won't disappear unless all the pages are returned.
I guess it doesn't really need to, but this is the design/approach I
went with, and I actually prefer it a bit. The design is borrowed from
how struct dev_pagemap does this, IIRC. Every page allocated from the
pgmap holds a reference to the pgmap to ensure the pgmap doesn't go
away while some page that originated from it is out in the wild, and
similarly I did so in the binding here.
Oh, you napi_pp_put_page() on the other end! I can see how that could
be fine.
quoted
We could assume that the page_pool is accounting iovs for us, but that
is not always true, right? page_pool_return_page() disconnects a
netmem from the page_pool and AFAIU the page_pool can go away while
there is such a netmem still in use in the net stack. Currently this
can't happen with iovs because I currently don't support non-pp
refcounting for iovs (so they're always recyclable), but you have a
comment on the other patch asking why that works; depending on how we
converge on that conversation, the details of how the pp refcounting
could change.
Even then - we could take the ref as the page "leaks" out of the pool,
rather than doing it on the fast path, right? Or just BUG_ON() 'cause
that reference ain't coming back ;)
OK, I'll see how the conversation on the other thread converges
vis-a-vis net_iov refcounting happens, and then look at if I can avoid
the binding_get/put per page in that framework.
quoted
It's nice to know that the binding refcounting will work regardless of
the details of how the pp refcounting works. IMHO having the binding
rely on the pp refcounting to ensure all the iovs are freed introduces
some fragility.

Additionally IMO the net_devmem_dmabuf_binding_get/put aren't so
expensive to want to optimize out, right? The allocation is a slow
path anyway and the fast path recycles netmem.
Yes, I should have read patch 10. I think it's avoidable :) but with
recycling it can indeed perform just fine (do you happen to have
recycling rate stats from prod runs?)
I don't to be honest. For a couple of reasons, one is that gcloud VMs
where we mainly use this, these stats are private to the VM and is not
something I can query widly. I only get access to the data when shared
with bug reports on specific issues.

In our internal test runs, I do not monitor the recycling rate to be
honest, as that is fine as long as the recycling is fast enough to
find available memory for incoming data. What I do look at very
closely is the allocation failure rate. That is when GVE tries to
alloc a new devmem but it's out of devmem (which would likely be due
to recycling not happening fast enough). The stat is `page_alloc_fail`
in ethtool -S for us and it's one of the first things I check when
things go wrong. It hasn't been the root cause for any of our issues
in reality.

--
Thanks,
Mina
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help