Re: [PATCH v4 4/9] vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page
From: Will Deacon <will@kernel.org>
Date: 2026-01-12 15:23:10
Also in: lkml, virtualization

On Mon, Jan 12, 2026 at 03:48:11PM +0100, Stefano Garzarella wrote:
> On Thu, Jan 08, 2026 at 05:33:42PM +0100, David Woodhouse wrote:
> > On Thu, 2025-07-17 at 10:01 +0100, Will Deacon wrote:
> > > -#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
> > > +/* Dimension the RX SKB so that the entire thing fits exactly into
> > > + * a single 4KiB page. This avoids wasting memory due to alloc_skb()
> > > + * rounding up to the next page order and also means that we
> > > + * don't leave higher-order pages sitting around in the RX queue.
> > > + */
> > > +#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	SKB_WITH_OVERHEAD(1024 * 4)
> >
> > Should this be SKB_WITH_OVERHEAD()?
>
> ehm, is what the patch is doing, no?
>
> > Or should it subtract VIRTIO_VSOCK_SKB_HEADROOM instead?
>
> Why? IIRC the goal of the patch was to have an SKB that fit entirely
> on one page, to avoid wasting memory, so yes, we are reducing the
> payload a little bit (4K vs 4K - VIRTIO_VSOCK_SKB_HEADROOM -
> SKB_OVERHEAD), but we are also reducing segmentation.
>
> > (And also, I have use cases where I want to expand this to 64KiB.
> > Can I make it controllable with a sockopt? module param?)
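
For reference, the "4K vs 4K - VIRTIO_VSOCK_SKB_HEADROOM - SKB_OVERHEAD" arithmetic quoted above can be sketched in userspace C. This is only an illustration: the three constants below are assumptions (values typical of an x86_64 build, standing in for SMP_CACHE_BYTES, sizeof(struct skb_shared_info) and VIRTIO_VSOCK_SKB_HEADROOM), and the helper names are hypothetical, not kernel functions. The real numbers depend on architecture and kernel configuration.

```c
#include <stddef.h>

/* Assumed values, typical of x86_64; not taken from the patch. */
#define ASSUMED_CACHE_BYTES     64u   /* stand-in for SMP_CACHE_BYTES */
#define ASSUMED_SHINFO_SIZE     320u  /* stand-in for sizeof(struct skb_shared_info) */
#define ASSUMED_VSOCK_HDR_SIZE  44u   /* stand-in for VIRTIO_VSOCK_SKB_HEADROOM */

/* Round x up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Userspace mirror of SKB_WITH_OVERHEAD(): the bytes left in an
 * allocation of x bytes once the aligned shared-info trailer is removed.
 */
static size_t skb_with_overhead(size_t x)
{
	return x - align_up(ASSUMED_SHINFO_SIZE, ASSUMED_CACHE_BYTES);
}

/* Payload left after also reserving the vsock header at the front. */
static size_t rx_payload(size_t alloc)
{
	return skb_with_overhead(alloc) - ASSUMED_VSOCK_HDR_SIZE;
}
```

Under these assumptions, skb_with_overhead(4096) is 3776 bytes and rx_payload(4096) is 3732 bytes, so the whole SKB fits in one 4KiB page at the cost of a few hundred bytes of payload per buffer.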

What page size are you using? At some point I had this as PAGE_SIZE but
it wasn't popular:

https://lore.kernel.org/all/20250701201400.52442b0e@pumpkin/

> I'm not sure about sockopt, because this is really device specific and
> can't be linked to a specific socket, since the device will pre-fill
> the queue with buffers that can be assigned to different sockets. But
> yeah, perhaps a module parameter would suffice, provided that it can
> only be modified at load time, otherwise we would have to do something
> similar to NIC and ethtool, but I feel that would be too complicated
> for this use case.

FWIW, we carried something similar in Android for a while on the
transmit side and it was a bit of a pain to maintain; we ended up in
situations where the guest and the host had to be configured similarly
for things to work, although the non-linear support should solve those
issues now. I'm not against the idea, I just wouldn't wish that pain on
anybody else!

Anyway, if we wanted to support something similar upstream for the rx
buffers, I'd suggest specifying it as a page-order for the entire SKB
allocation and clamping it to PAGE_ALLOC_COSTLY_ORDER.

Will
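
A minimal userspace sketch of the page-order idea, for illustration only. It assumes 4KiB pages, a costly order of 3 (the mainline value of PAGE_ALLOC_COSTLY_ORDER) and a 320-byte SKB overhead; clamp_rx_order() and rx_buf_size_for_order() are hypothetical names, not existing kernel helpers, and nothing here reflects an actual upstream interface.

```c
#include <stddef.h>

/* Assumed values; the kernel equivalents depend on arch and config. */
#define ASSUMED_PAGE_SIZE     4096u
#define ASSUMED_COSTLY_ORDER  3u    /* stand-in for PAGE_ALLOC_COSTLY_ORDER */
#define ASSUMED_SKB_OVERHEAD  320u  /* stand-in for the SKB_WITH_OVERHEAD() trailer */

/* Clamp a requested page order so the allocation never exceeds the
 * assumed costly order.
 */
static unsigned int clamp_rx_order(unsigned int requested)
{
	return requested > ASSUMED_COSTLY_ORDER ? ASSUMED_COSTLY_ORDER
						: requested;
}

/* The order describes the entire SKB allocation, so the usable buffer
 * is the clamped allocation size minus the assumed SKB overhead.
 */
static size_t rx_buf_size_for_order(unsigned int requested)
{
	size_t alloc = (size_t)ASSUMED_PAGE_SIZE << clamp_rx_order(requested);

	return alloc - ASSUMED_SKB_OVERHEAD;
}
```

With these assumptions, order 0 gives 3776 bytes. A 64KiB allocation would be order 4 on 4KiB pages, so the request would be clamped to order 3 (32KiB), giving 32448 bytes.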