Thread (15 messages) 15 messages, 4 authors, 2015-10-05

Re: [RFC PATCH] Fix false positives in can_checksum_protocol()

From: Tom Herbert <hidden>
Date: 2015-09-28 17:03:15

On Fri, Sep 25, 2015 at 5:55 AM, David Woodhouse [off-list ref] wrote:
quoted hunk ↗ jump to hunk
The check in harmonize_features() is supposed to match the skb against
the features of the device in question, and prevent us from handing a
skb to a device which can't handle it.

It doesn't work correctly. A device with NETIF_F_IP_CSUM or
NETIF_F_IPV6_CSUM capabilities is only required to checksum TCP or UDP,
on Legacy IP and IPv6 respectively. But the existing check will allow
us to pass it *any* ETH_P_IP/ETH_P_IPV6 packets for hardware checksum
offload.

Depending on the driver in use, this leads to a BUG, a WARNING, or just
silent data corruption.

This is one approach to fixing that, and my test program at
http://bombadil.infradead.org/~dwmw2/raw.c can no longer trivially
reproduce the problem.

The test does now have false *negatives*, but those shouldn't happen
for locally-generated packets; only packets injected from af_packet,
tun, virtio_net and other places that allow us to inject
CHECKSUM_PARTIAL packets in order to make use of hardware offload
features. And false negatives aren't anywhere near as much of a problem
as false positives are — we just finish the checksum in software and
send the packet anyway.

It would be possible to fix those false negatives, if we really wanted
to — perhaps by adding an additional bit in the skbuff which indicates
that it *is* a TCP or UDP packet, rather than using ->sk->sk_protocol.
Then that bit could be set if appropriate in skb_partial_csum_set(), as
well as the places where we locally generate such packets. And the
check in can_checksum_protocol() would just check for that bit.

Signed-off-by: David Woodhouse <redacted>
---
Since can_checksum_protocol is inline, the compiler ought to know that
it doesn't even need to dereference skb->sk in the case where the
device has the NETIF_F_GEN_CSUM feature. So the additional check should
not slow down the (hopefully) common case in the fast path.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2d15e38..76c8330 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3628,15 +3628,23 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, netdev_features_t features)
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth);

 static inline bool can_checksum_protocol(netdev_features_t features,
-                                        __be16 protocol)
-{
-       return ((features & NETIF_F_GEN_CSUM) ||
-               ((features & NETIF_F_V4_CSUM) &&
-                protocol == htons(ETH_P_IP)) ||
-               ((features & NETIF_F_V6_CSUM) &&
-                protocol == htons(ETH_P_IPV6)) ||
-               ((features & NETIF_F_FCOE_CRC) &&
-                protocol == htons(ETH_P_FCOE)));
+                                        __be16 protocol, u8 sk_protocol)
+{
+       if ((features & NETIF_F_GEN_CSUM) ||
+           ((features & NETIF_F_FCOE_CRC) && protocol == htons(ETH_P_FCOE)))
+               return 1;
+
+       /* NETIF_F_V[46]_CSUM are defined to work only on TCP and UDP.
+        * That is, when it needs to start checksumming at the transport
+        * header, and place the result at an offset of either 6 (for UDP)
+        * or 16 (for TCP).
+        */
+       if ((((features & NETIF_F_V4_CSUM) && protocol == htons(ETH_P_IP)) ||
+            ((features & NETIF_F_V6_CSUM) && protocol == htons(ETH_P_IPV6))) &&
+           (sk_protocol == IPPROTO_TCP || sk_protocol == IPPROTO_UDP))
+               return 1;
+
Relying on skb->sk->sk_protocol is problematic. This is making the
assumption that the checksum being offloaded for the packet is the
same as that of the protocol for the socket-- this may not be the case
when we are offloading an outer checksum in encapsulation. Currently
this wouldn't a be problem since we're probably only offloading outer
UDP checksums, but if we ever start trying to offload other outer
checksums (e.g. GRE) then this probably doesn't work so well. Also,
this doesn't help those drivers that that can offload TCP and UDP for
IPv6 but only if there are no extension headers, in those case the
driver needs to look at the packet to see if it is a "simple" UDP/TCP
packet.

AFAIK, the only non UDP/TCP transport IP checksum in the stack is GRE
checksum which as I pointed out we don't attempt to offload. So the
only way to trip the bug that you are seeing is probably through a
userspace packet interface like in the test code. I think this
actually might expose a much more serious issue. Looking at tun.c, I
don't see anything that validates that the csum_start and csum_offset
provided by userspace actually refers to a sane checksum offset. Not
only is this a way to ask the stack to perform checksums for non
TCP/UDP, but it actually seems like the interface could be used by a
malicious application to have a device arbitrarily overwrite two bytes
anywhere in the packet with it's own data far below the stack,
netfilter, routing. To really fix this we should probably be doing
validation in tun, if the checksum isn't for TCP or UDP then call
skb_checksum_help before sending the packet into the stack.

Tom
quoted hunk ↗ jump to hunk
+       return 0;
 }

 #ifdef CONFIG_BUG
diff --git a/net/core/dev.c b/net/core/dev.c
index 6bb6470..3c40957 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2613,7 +2613,8 @@ static netdev_features_t harmonize_features(struct sk_buff *skb,
        features = net_mpls_features(skb, features, type);

        if (skb->ip_summed != CHECKSUM_NONE &&
-           !can_checksum_protocol(features, type)) {
+           !can_checksum_protocol(features, type,
+                                  skb->sk ? skb->sk->sk_protocol : 0)) {
                features &= ~NETIF_F_ALL_CSUM;
        } else if (illegal_highdma(skb->dev, skb)) {
                features &= ~NETIF_F_SG;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index dad4dd3..9126c6f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3004,7 +3004,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
                return ERR_PTR(-EINVAL);

        csum = !head_skb->encap_hdr_csum &&
-           !!can_checksum_protocol(features, proto);
+               !!can_checksum_protocol(features, proto,
+                               head_skb->sk ? head_skb->sk->sk_protocol : 0);

        headroom = skb_headroom(head_skb);
        pos = skb_headlen(head_skb);
--
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help