Thread: 39 messages, 7 authors, 2025-02-20

Re: [PATCH net-next v3 0/6] Device memory TCP TX

From: Stanislav Fomichev <hidden>
Date: 2025-02-04 19:41:14
Also in: kvm, linux-doc, linux-kselftest, lkml, virtualization

On 02/04, Mina Almasry wrote:
On Tue, Feb 4, 2025 at 10:32 AM Paolo Abeni [off-list ref] wrote:
quoted
On 2/4/25 7:06 PM, Stanislav Fomichev wrote:
quoted
On 02/04, Mina Almasry wrote:
quoted
On Tue, Feb 4, 2025 at 4:32 AM Paolo Abeni [off-list ref] wrote:
quoted
On 2/3/25 11:39 PM, Mina Almasry wrote:
quoted
The TX path had been dropped from the Device Memory TCP patch series
post RFCv1 [1], to make that series slightly easier to review. This
series rebases the implementation of the TX path on top of the
net_iov/netmem framework agreed upon and merged. The motivation for
the feature is thoroughly described in the docs & cover letter of the
original proposal, so I don't repeat the lengthy descriptions here, but
they are available in [1].

Sending this series as RFC as the window closure is imminent. I plan on
reposting as non-RFC once the tree re-opens, addressing any feedback
I receive in the meantime.
I guess you should drop this paragraph.
quoted
Full outline on usage of the TX path is detailed in the documentation
added in the first patch.

Test example is available via the kselftest included in the series as well.

The series is relatively small, as the TX path for this feature largely
piggybacks on the existing MSG_ZEROCOPY implementation.
It looks like no additional device level support is required. That is
IMHO so good up to suspicious level :)
It is correct that no additional device-level support is required. I don't
have any local changes to my driver to make this work. I think Stan
on-list was able to run the TX path (he commented on fixes to the test
but didn't say it doesn't work :D) and one other person was able to
run it offlist.
For BRCM I had shared this: https://lore.kernel.org/netdev/ZxAfWHk3aRWl-F31@mini-arch/
I have a similar internal patch for mlx5 (will share after the RX part gets
in). I agree that it seems like gve_unmap_packet needs some work to be more
careful not to unmap NIOVs (if you were testing against gve).
What happens if a user tries to use devmem TX on a device not really
supporting it? Silent data corruption?
So the tx dma-buf binding netlink API will bind the dma-buf to the
netdevice. If that fails, the uapi will return failure and devmem tx
will not be enabled.

If the dma-binding succeeds, then the device can indeed DMA into the
dma-addrs in the device. The TX path will dma from the dma-addrs in
the device just fine and it need not be aware that the dma-addrs are
coming from a device and not from host memory.

The only issue that Stan's patches point to is that the driver
will likely be passing these dma-buf addresses into dma-mapping APIs
like dma_unmap_*() and dma_sync_*() functions. Those, AFAIU, will be
no-ops with dma-buf addresses in most setups, but it's not 100% safe
to pass those dma-buf addresses to these dma-mapping APIs, so we
should avoid these calls entirely.
quoted
Don't we need some way for the device to opt-in (or opt-out) and avoid
such issues?
Yeah, I think likely the driver needs to declare support (i.e. it's
not using dma-mapping API with dma-buf addresses).
netif_skb_features/ndo_features_check seems like a good fit?
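
To make the two ideas above concrete (skip dma-mapping calls for dma-buf
backed fragments, and let the driver opt in before devmem TX packets reach
it), here is a minimal userspace sketch. It is only a model: struct tx_frag,
struct tx_packet, struct tx_device, tx_device_can_send() and
tx_unmap_packet() are hypothetical names made up for illustration, not
kernel types or helpers, and the real check would presumably live somewhere
around netif_skb_features/ndo_features_check as suggested.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel types or helpers. */
struct tx_frag {
	bool is_net_iov;	/* backed by a bound dma-buf, not host pages */
	bool dma_mapped;	/* mapped by the driver via the dma-mapping API */
};

struct tx_packet {
	struct tx_frag *frags;
	size_t nr_frags;
};

struct tx_device {
	bool devmem_tx_ok;	/* driver declared it never hands dma-buf
				 * addresses to dma_unmap_*()/dma_sync_*() */
	unsigned int unmap_calls;	/* counts modeled unmap operations */
};

static bool packet_has_net_iov(const struct tx_packet *pkt)
{
	for (size_t i = 0; i < pkt->nr_frags; i++)
		if (pkt->frags[i].is_net_iov)
			return true;
	return false;
}

/* Opt-in check: refuse devmem packets on devices that did not declare support. */
static bool tx_device_can_send(const struct tx_device *dev,
			       const struct tx_packet *pkt)
{
	return dev->devmem_tx_ok || !packet_has_net_iov(pkt);
}

/* Completion path: only host-memory fragments go through the unmap step. */
static void tx_unmap_packet(struct tx_device *dev, struct tx_packet *pkt)
{
	for (size_t i = 0; i < pkt->nr_frags; i++) {
		struct tx_frag *frag = &pkt->frags[i];

		if (frag->is_net_iov)
			continue;	/* dma-buf address: leave it alone */
		if (frag->dma_mapped) {
			dev->unmap_calls++;	/* stands in for an unmap call */
			frag->dma_mapped = false;
		}
	}
}
```

With a packet carrying one host fragment and one dma-buf fragment, a device
that has not opted in rejects the packet up front, while an opted-in device
sends it and unmaps only the host fragment on completion.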