Re: UDP implementation and the MSG_MORE flag
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2021-01-26 23:17:35
On Tue, Jan 26, 2021 at 9:58 AM Oliver Graute [off-list ref] wrote:
Hello,
we observe some unexpected behavior in the UDP implementation of the
Linux kernel.
Some UDP packets sent via the loopback interface are dropped in the
kernel on the receive side when using sendto with the MSG_MORE flag.
Every drop increases the InCsumErrors in /proc/self/net/snmp. Some
example code to reproduce it is appended below.
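To watch the counter mentioned above while running a test, something like the following could be used. It is only a sketch: the helper name udp_snmp_field() is made up for illustration, and it assumes the usual two-row layout of the snmp file (a "Udp:" row of field names followed by a "Udp:" row of values).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: look up one counter (e.g. "InCsumErrors") in the
 * "Udp:" rows of snmp-style text. Returns the value, or -1 if the rows
 * or the field are missing.
 */
long udp_snmp_field(const char *text, const char *field)
{
    const char *hdr = NULL, *val = NULL;
    const char *p = text;

    /* The first "Udp:" line holds the names, the second holds the values.
     * "UdpLite:" lines do not match because the fourth character differs. */
    while (p && *p) {
        if (strncmp(p, "Udp:", 4) == 0) {
            if (!hdr) {
                hdr = p + 4;
            } else {
                val = p + 4;
                break;
            }
        }
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    if (!hdr || !val)
        return -1;

    /* Walk both rows in lockstep, one column at a time. */
    for (;;) {
        hdr += strspn(hdr, " ");
        val += strspn(val, " ");
        if (*hdr == '\0' || *hdr == '\n' || *val == '\0' || *val == '\n')
            return -1;

        size_t len = strcspn(hdr, " \n");
        char *end;
        long v = strtol(val, &end, 10);
        if (end == val)
            return -1;

        if (len == strlen(field) && strncmp(hdr, field, len) == 0)
            return v;

        hdr += len;
        val = end;
    }
}
```

Reading the file into a buffer before and after a run and comparing udp_snmp_field(buf, "InCsumErrors") would show whether a given send was counted as a checksum error.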
In the code we tracked it down to this code section. (Even a little
further, but it's unclear to me why the csum() is wrong in the bad case.)
udpv6_recvmsg()
...
	if (checksum_valid || udp_skb_csum_unnecessary(skb)) {
		if (udp_skb_is_linear(skb))
			err = copy_linear_skb(skb, copied, off, &msg->msg_iter);
		else
			err = skb_copy_datagram_msg(skb, off, msg, copied);
	} else {
		err = skb_copy_and_csum_datagram_msg(skb, off, msg);
		if (err == -EINVAL) {
			goto csum_copy_err;
		}
	}
...
Thanks for the report with a full reproducer.
I don't have a full answer yet, but can reproduce this easily.
The third program, without MSG_MORE, builds an skb with
CHECKSUM_PARTIAL in __ip_append_data. When looped to the receive path,
that ip_summed value means no additional validation is needed, as
encoded in skb_csum_unnecessary.
The first and second programs are essentially the same, apart from a
slight difference in length. In both cases packet length is very short
compared to the loopback device MTU. Because of MSG_MORE, these
packets have CHECKSUM_NONE.
On receive in

  __udp4_lib_rcv()
    udp4_csum_init()
      err = skb_checksum_init_zero_check()

The second program validates and sets ip_summed = CHECKSUM_COMPLETE
and csum_valid = 1.
The first does not, though err == 0.
Validation appears to succeed consistently for packets with <= 68B of
payload and to fail consistently otherwise. It is not clear to me yet
what causes this distinction.
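As background for the validation step, the arithmetic involved is the 16-bit one's-complement Internet checksum. The sketch below is a plain userspace version for illustration only: it is not the kernel's csum_partial(), the name inet_checksum() is made up, and it leaves out the UDP pseudo-header that the real check also covers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative one's-complement checksum over a byte buffer, treating
 * the data as big-endian 16-bit words. Intended for buffers up to 64 KiB,
 * so the 32-bit accumulator cannot overflow.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add up whole 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is padded with a zero low byte. */
    if (len)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A buffer with its own checksum appended sums to zero, which is the property a receiver relies on when it validates a packet.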