Re: [PATCH v5 net-next 01/36] net: Introduce direct data placement tcp offload
From: Boris Pismenny <hidden>
Date: 2021-07-22 13:24:20
Also in:
linux-nvme
On 22/07/2021 14:26, Eric Dumazet wrote:
On Thu, Jul 22, 2021 at 1:04 PM Boris Pismenny [off-list ref] wrote:quoted
From: Boris Pismenny <redacted>...quoted
}; const chardiff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index e6ca5a1f3b59..4a7160bba09b 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c@@ -5149,6 +5149,9 @@ tcp_collapse(struct sock *sk, struct sk_buff_head *list, struct rb_root *root, memcpy(nskb->cb, skb->cb, sizeof(skb->cb)); #ifdef CONFIG_TLS_DEVICE nskb->decrypted = skb->decrypted; +#endif +#ifdef CONFIG_ULP_DDP + nskb->ddp_crc = skb->ddp_crc;Probably you do not want to attempt any collapse if skb->ddp_crc is set right there.
Right.
quoted
#endif TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(nskb)->end_seq = start; if (list)@@ -5182,6 +5185,11 @@ tcp_collapse(struct sock *sk, struct sk_buff_head *list, struct rb_root *root, #ifdef CONFIG_TLS_DEVICE if (skb->decrypted != nskb->decrypted) goto end; +#endif +#ifdef CONFIG_ULP_DDP + + if (skb->ddp_crc != nskb->ddp_crc)This checks only the second, third, and remaining skbs.
Right, as we handle the head skb above. Could you clarify?
quoted
+ goto end; #endif } }tcp_collapse() is copying data from small skbs to pack it to bigger skb (one page of payload), in case of memory emergency/pressure (socket queues are full) If your changes are trying to avoid 'needless' copies, maybe you should reconsider and let the emergency packing be done. If the copy is not _possible_, you should rephrase your changelog to clearly state the kernel _cannot_ access this memory in any way.
The issue is that skb_condense also gets called on many skbs in tcp_add_backlog and it will identify skbs that went through DDP as ideal for packing, even though they are not small and packing is counter-productive as data already resides in its destination. As mentioned above, it is possible to copy, but it is counter-productive in this case. If there was a real need to access this memory, then it is allowed.