Re: [PATCH net-next v19 06/13] memory-provider: dmabuf devmem memory provider

From: Mina Almasry <hidden>
Date: 2024-08-16 12:20:58
Also in: bpf, dri-devel, linux-alpha, linux-arch, linux-doc, linux-kselftest, linux-media, linux-mips, linux-trace-kernel, lkml, sparclinux

On Thu, Aug 15, 2024 at 9:22 PM Jakub Kicinski [off-list ref] wrote:

On Wed, 14 Aug 2024 17:32:53 +0100 Pavel Begunkov wrote:

quoted

This is where I get a bit confused. Jakub did mention that it is
desirable for core to verify that the driver did the right thing,
instead of trusting that a driver did the right thing without
verifying. Relying on a flag from the driver opens the door for the
driver to say "I support this" but actually not create the mp
page_pool. In my mind the explicit check is superior to getting
feedback from the driver.

You can apply the same argument to anything, but not like
after each for example ->ndo_start_xmit we dig into the
interface's pending queue to make sure it was actually queued.

And even if you check that there is a page pool, the driver
can just create an empty pool that it'll never use. There
are always ways to make it wrong.

Yes, there is a difference, and I'm not against it as a
WARN_ON_ONCE after failing it in a more explicit way.

Jakub might have a different opinion on how it should look
like, and we can clarify on that, but I do believe it's a
confusing interface that can be easily made better.

My queue API RFC patches had configuration arguments, not sure if this
is the right version but you'll get the idea:
https://github.com/kuba-moo/linux/blob/qcfg/include/net/netdev_cfg.h#L43-L50
This way we can _tell_ the driver what the config should be. That part
got lost somewhere along the way, because perhaps in its embryonic form
it doesn't make sense.

We can bring it back, add HDS with threshold of 0, to it, and a bit for
non-readable memory. On top of that "capability bits" in struct
netdev_queue_mgmt_ops to mark that the driver pays attention to particular
fields of the config.

Not sure if it should block the series, but that'd be the way I'd do it
(for now?)

I'm not sure I want to go into a rabbit hole of adding configuration
via the queue API, blocking this series . We had discussed this months
back and figured that it's a significant undertaking on its own. I'm
not sure GVE has HDS threshold capability for example, and I'm also
not sure how to coexist header split negotiability via the queue API
when an ethtool API exists alongside it. I think this is worthy of
separating in its own follow up series.

For now detecting that the driver was able to create the page_pool
with the correct memory provider in core should be sufficient. Also
asking the driver to set a
netdev_rx_queue->unreadable_netmem_supported flag should also be
sufficient. I've implemented both locally and they work well.

I'd keep the current check with a WARN_ON_ONCE(), tho.
Given the absence of tests driver developers can use.
Especially those who _aren't_ supporting the feature.

Yes what I have locally is the driver setting
netdev_rx_queue->unreadable_netmem_supported when header split is
turned on, and additionally a WARN_ON_ONCE around the check in core. I
was about to send that when I read your email. I'm hoping we don't
have to go through the scope creep of adding configuration via the
queue API, which I think is a very significant undertaking.

quoted

and cons to each approach; I don't see a showstopping reason to go
with one over the other.

quoted

And page_pool_check_memory_provider() is not that straightforward,
it doesn't walk through pools of a queue.

Right, we don't save the pp of a queue, only a netdev. The outer loop
checks all the pps of the netdev to find one with the correct binding,
and the inner loop checks that this binding is attached to the correct
queue.

That's the thing, I doubt about the second part.

net_devmem_bind_dmabuf_to_queue() {
      err = xa_alloc(&binding->bound_rxqs, &xa_idx, rxq);
      if (err)
              return err;

      netdev_rx_queue_restart();

      // page_pool_check_memory_provider
      ...
      xa_for_each(&binding->bound_rxqs, xa_idx, binding_rxq) {
              if (rxq == binding_rxq)
                      return success;
}

Can't b4 the patches for some reason, but that's the highlight
from the patchset, correct me if I'm wrong. That xa_for_each
check is always true because you put the queue in there right
before it, and I don't that anyone could've erased it.

The problem here is that it seems the ->bound_rxqs state doesn't
depend on what page pools were actually created and with what mp.

FWIW I don't understand the point of walking the xa either.
Just check the queue number of the pp you found matches,
page pool params are saved in the page pool. No?

Yes, I changed this check to check pool->p.queue, and it works fine.

-- 
Thanks,
Mina

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help