Thread: 24 messages, 6 authors, 2026-01-12

Re: [PATCH v4 4/9] vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page

From: Will Deacon <will@kernel.org>
Date: 2026-01-12 15:23:10
Also in: lkml, virtualization

On Mon, Jan 12, 2026 at 03:48:11PM +0100, Stefano Garzarella wrote:
On Thu, Jan 08, 2026 at 05:33:42PM +0100, David Woodhouse wrote:
On Thu, 2025-07-17 at 10:01 +0100, Will Deacon wrote:
-#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
+/* Dimension the RX SKB so that the entire thing fits exactly into
+ * a single 4KiB page. This avoids wasting memory due to alloc_skb()
+ * rounding up to the next page order and also means that we
+ * don't leave higher-order pages sitting around in the RX queue.
+ */
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	SKB_WITH_OVERHEAD(1024 * 4)
Should this be SKB_WITH_OVERHEAD()?
Ehm, that is what the patch is doing, no?
Or should it subtract VIRTIO_VSOCK_SKB_HEADROOM instead?
Why?

IIRC the goal of the patch was to have an SKB that fit entirely on one page,
to avoid wasting memory, so yes, we are reducing the payload a little bit
(4K vs 4K - VIRTIO_VSOCK_SKB_HEADROOM - SKB_OVERHEAD), but we are also
reducing segmentation.
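
[Editorial sketch, not part of the original message] The payload arithmetic above can be modelled in userspace roughly as follows. The three constants are assumptions standing in for SMP_CACHE_BYTES, sizeof(struct skb_shared_info) and VIRTIO_VSOCK_SKB_HEADROOM; the real values depend on the kernel configuration and architecture, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed values, not taken from the thread: they vary with kernel
 * configuration and architecture. */
#define ASSUMED_CACHE_LINE	64u	/* stand-in for SMP_CACHE_BYTES */
#define ASSUMED_SHINFO_SIZE	320u	/* stand-in for sizeof(struct skb_shared_info) */
#define ASSUMED_VSOCK_HDR	44u	/* stand-in for VIRTIO_VSOCK_SKB_HEADROOM */

/* Round x up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Userspace model of SKB_WITH_OVERHEAD(): bytes left in the allocation
 * after reserving the cache-aligned skb_shared_info at the end. */
static size_t skb_with_overhead(size_t alloc)
{
	return alloc - align_up(ASSUMED_SHINFO_SIZE, ASSUMED_CACHE_LINE);
}

/* Payload bytes left once the vsock packet header is also accounted for. */
static size_t rx_payload(size_t alloc)
{
	return skb_with_overhead(alloc) - ASSUMED_VSOCK_HDR;
}
```

Under these assumed sizes, a 4096-byte allocation leaves 3776 bytes for the buffer and 3732 bytes of payload, which is the "4K - VIRTIO_VSOCK_SKB_HEADROOM - SKB_OVERHEAD" figure mentioned above.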
(And also, I have use cases where I want to expand this to 64KiB. Can I
make it controllable with a sockopt? module param?)
What page size are you using? At some point I had this as PAGE_SIZE but
it wasn't popular:

https://lore.kernel.org/all/20250701201400.52442b0e@pumpkin/
I'm not sure about sockopt, because this is really device specific and can't
be linked to a specific socket, since the device will pre-fill the queue
with buffers that can be assigned to different sockets.

But yeah, perhaps a module parameter would suffice, provided that it can
only be modified at load time, otherwise we would have to do something
similar to NIC and ethtool, but I feel that would be too complicated for
this use case.
FWIW, we carried something similar in Android for a while on the
transmit side and it was a bit of a pain to maintain; we ended up in
situations where the guest and the host had to be configured similarly
for things to work, although the non-linear support should solve those
issues now. I'm not against the idea, I just wouldn't wish that pain on
anybody else!

Anyway, if we wanted to support something similar upstream for the rx
buffers, I'd suggest specifying it as a page-order for the entire
SKB allocation and clamping it to PAGE_ALLOC_COSTLY_ORDER.

Will
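
[Editorial sketch, not part of the original message] One way to read the page-order suggestion is modelled below in userspace. It assumes 4KiB pages and a 320-byte aligned skb_shared_info, and the function names are hypothetical; PAGE_ALLOC_COSTLY_ORDER is 3 in current kernels. An in-kernel version would presumably take the order from a load-time module parameter, which is not shown here.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE		4096u	/* assumed 4KiB pages */
#define MODEL_COSTLY_ORDER	3u	/* value of PAGE_ALLOC_COSTLY_ORDER */
#define MODEL_SKB_OVERHEAD	320u	/* assumed aligned skb_shared_info size */

/* Clamp a requested page order for the whole SKB allocation. */
static unsigned int clamp_rx_order(unsigned int requested)
{
	return requested > MODEL_COSTLY_ORDER ? MODEL_COSTLY_ORDER : requested;
}

/* RX buffer size such that buffer plus SKB overhead fills exactly
 * 2^order pages, mirroring SKB_WITH_OVERHEAD(PAGE_SIZE << order). */
static size_t rx_buf_size_for_order(unsigned int requested)
{
	unsigned int order = clamp_rx_order(requested);

	return ((size_t)MODEL_PAGE_SIZE << order) - MODEL_SKB_OVERHEAD;
}
```

With these assumptions, a request for order 4 (64KiB with 4KiB pages) would be clamped to order 3, so the whole SKB allocation would top out at 32KiB.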