Thread (11 messages) 11 messages, 5 authors, 2019-02-06

Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames

From: Eric Dumazet <edumazet@google.com>
Date: 2019-01-31 19:08:01

On Thu, Jan 31, 2019 at 11:04 AM Saeed Mahameed [off-list ref] wrote:
On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
quoted
On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan [off-list ref]
wrote:
quoted
From: Saeed Mahameed <redacted>

When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zero's, which don't
affect
checksum. However, it is not guaranteed. For example, switches
might
choose to make other use of these octets.
This repeatedly causes kernel hardware checksum fault.

Prior to the cited commit below, skb checksum was forced to be
CHECKSUM_NONE when padding is detected. After it, we need to keep
skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
verify and parse IP headers, it does not worth the effort as the
packets
are so small that CHECKSUM_COMPLETE has no significant advantage.

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Saeed Mahameed <redacted>
Signed-off-by: Tariq Toukan <redacted>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++-
-
 1 file changed, 17 insertions(+), 2 deletions(-)

Hi Dave,
Please queue for -stable >= v4.18.

Thanks,
Tariq
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 9a0881cb7f51..fc8a11444e1a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
hw_checksum, struct sk_buff *skb,
 }
 #endif

+#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
+
 /* We reach this function only after checking that any of
  * the (IPv4 | IPv6) bits are set in cqe->status.
  */
@@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe,
struct sk_buff *skb, void *va,
                      netdev_features_t dev_features)
 {
        __wsum hw_checksum = 0;
+       void *hdr;
+
+       /* CQE csum doesn't cover padding octets in short ethernet
+        * frames. And the pad field is appended prior to
calculating
+        * and appending the FCS field.
+        *
+        * Detecting these padded frames requires to verify and
parse
+        * IP headers, so we simply force all those small frames to
be
+        * CHECKSUM_NONE even if they are not padded.
+        * TODO: better if we report CHECKSUM_UNNECESSARY but this
+        * demands some refactroing.
This TODO: looks bogus to me.

mlx4 driver first tries to use CHECKSUM_UNNECESSARY.

if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
                                                  MLX4_CQE_STATUS_UDP
)) &&
     (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
      cqe->checksum == cpu_to_be16(0xffff)) {
     ...
      ip_summed = CHECKSUM_UNNECESSARY;
      ...
}

Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
calls this check_sum() function)

So there is no refactoring to be done for mlx4 : short
IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already
not exactly, considering mlx4 weird condition to decide to do
CHECKSUM_UNNECESSARY:

if ((cqe csum fields are valid) && cqe->checksum ==
cpu_to_be16(0xffff))
     { do CHECKSUM_UNNECESSARY }
else { do CHECKSUM_COMPLETE }

CHECKSUM_UNNECESSARY will be skipped if csum complete field is valid
i.e (cqe->checksum != 0xffff)

but if checksum complete is to be skipped due to short_frame we want to
go back to CHECKSUM_UNNECESSARY, this is the refactoring i am talking
about :).


I hope now it is clear..
Not really, since in case of short frame, cqe->checksum will be
0xffff, since mlx4 does not include the padding bytes in the checksum
in the first place.

(For native IPv4/IPV6 TCP/UDP frames that is)
quoted
quoted
+        */
+       if (short_frame(skb->len))
+               return -EINVAL;

-       void *hdr = (u8 *)va + sizeof(struct ethhdr);
-
+       hdr = (u8 *)va + sizeof(struct ethhdr);
        hw_checksum = csum_unfold((__force __sum16)cqe->checksum);

        if (cqe->vlan_my_qpn &
cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
--
1.8.3.1
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help