Thread: 10 messages, 4 authors, 2018-11-30

Re: Invalid transport_offset with AF_PACKET socket

From: Saeed Mahameed <hidden>
Date: 2018-11-30 12:10:49

On Wed, Nov 28, 2018 at 3:10 AM Maxim Mikityanskiy [off-list ref] wrote:
Hi Saeed,
quoted
Can you elaborate more: what NIC? What configuration? What do you mean
by confusion? Anyway, please see below.
ConnectX-4, after running `mlnx_qos -i eth1 --trust dscp`, which sets inline
mode 2 (MLX5_INLINE_MODE_IP). I'll explain what I mean by confusion below.
quoted
In mlx5 with ConnectX-4 or ConnectX-4 Lx there is a requirement to copy at
least the Ethernet header to the TX descriptor; otherwise the packet might
be dropped. For RAW sockets the skb header offsets are not set, but the
latest upstream mlx5 driver knows how to handle this and copies the
minimum amount required.
Please see:

static inline u16 mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
                                      struct sk_buff *skb)
Yes, I know that, and what I do is debugging an issue with this function.
quoted
it should default to:


      case MLX5_INLINE_MODE_L2:
      default:
              hlen = mlx5e_skb_l2_header_offset(skb);
The issue appears in MLX5_INLINE_MODE_IP. I haven't tested
MLX5_INLINE_MODE_TCP_UDP yet, though.
quoted
So it should return at least 18 and not 14.
Yes, the function does its best to return at least 18, but it silently expects
skb_transport_offset to exceed 18. In normal conditions, it will be more than
18, because it will be at least 14 + 20. But in my case, when I send a packet
via an AF_PACKET socket, skb_transport_offset returns 14 (which is nonsense),
and the driver uses this value, causing the hardware to fail, because it's less
than 18.
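The arithmetic described above can be modeled in a few lines of userspace C. This is a minimal sketch, not the driver's code: `struct pkt_offsets`, `inline_len_trusting`, and `inline_len_clamped` are hypothetical names, and the clamp is only one possible guard, shown as an assumption rather than the upstream fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN   14                      /* Ethernet header length */
#define VLAN_HLEN  4                       /* 802.1Q tag length */
#define MIN_INLINE (ETH_HLEN + VLAN_HLEN)  /* the 18-byte floor discussed above */

/* Hypothetical stand-in for the offsets an skb would report. */
struct pkt_offsets {
    uint16_t transport_offset; /* value skb_transport_offset() would return */
    uint16_t headlen;          /* bytes available in the linear part */
};

static uint16_t min_u16(uint16_t a, uint16_t b) { return a < b ? a : b; }
static uint16_t max_u16(uint16_t a, uint16_t b) { return a > b ? a : b; }

/* Models the reported behavior: the transport offset is used as-is. */
static uint16_t inline_len_trusting(const struct pkt_offsets *p)
{
    assert(p != NULL);
    return min_u16(p->transport_offset, p->headlen);
}

/* Assumed guard (not the upstream fix): never inline fewer than MIN_INLINE bytes. */
static uint16_t inline_len_clamped(const struct pkt_offsets *p)
{
    assert(p != NULL);
    return min_u16(max_u16(p->transport_offset, MIN_INLINE), p->headlen);
}
```

With a transport offset of 34 (14 + 20) both helpers return 34. With the AF_PACKET value of 14, the trusting version returns 14 and the clamped one returns 18. As the reply that follows points out, 18 bytes still does not cover the IP header in MLX5_INLINE_MODE_IP, so a clamp alone would not be enough.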
Got it, so even if you copy 18 bytes it is not sufficient! If the packet is
IPv4 or IPv6 and the inline mode is set to MLX5_INLINE_MODE_IP in the vport
context, you must copy the IP headers as well!

But what do you expect from an AF_PACKET socket? To parse each and every
packet and set skb_transport_offset?
quoted
We had some issues with this in old drivers, such as kernels 4.14/15, and
it depends on the use case, so I need some information first:
No, it's not an old kernel. We actually have this bug in our internal bug
tracking system, and I'm trying to resolve it.
quoted
1. What cards do you have? (lspci)
03:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
03:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
81:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

Testing with ConnectX-4.
quoted
2. What kernel/driver version are you using?
I'm on net-next-mlx5, commit 66a4b5ef638a (the latest when I started the
investigation).
quoted
3. What is the current enum mlx5_inline_modes seen in
mlx5e_calc_min_inline or sq->min_inline_mode?
MLX5_INLINE_MODE_IP, as I said above.
quoted
4. Firmware version? (ethtool -i)
12.22.0238 (MT_2190110032)
quoted
Can you share the format of the packet you are sending and seeing the bad
behavior with?
Here is the hexdump of the simplest packet that causes the problem when it's
sent through AF_PACKET after `mlnx_qos -i eth1 --trust dscp`:

00000000: 11 22 33 44 55 66 77 88 99 aa bb cc 08 00 45 00
00000010: 00 20 00 00 40 00 40 11 ae a5 c6 12 00 01 c6 12
00000020: 00 02 00 00 4a 38 00 0c 29 82 61 62 63 64

(Please ignore the wrong UDP checksum and non-existing MACs, it doesn't matter
at all, I tested it with completely valid packets as well. The wrong UDP
checksum is due to a bug in our internal pypacket utility).
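
The offsets in the hexdump above can be checked by hand or with a small userspace parser. The sketch below is not kernel code: `ipv4_transport_offset` and `ETHERTYPE_IPV4` are hypothetical names, and it assumes an untagged Ethernet II frame carrying IPv4. It shows where the transport header really starts in this packet, which is the value skb_transport_offset would be expected to report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN       14      /* Ethernet header length */
#define ETHERTYPE_IPV4 0x0800  /* hypothetical local name for the IPv4 EtherType */

/* The 46 bytes from the hexdump above. */
static const uint8_t pkt[] = {
    0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88,
    0x99, 0xaa, 0xbb, 0xcc, 0x08, 0x00, 0x45, 0x00,
    0x00, 0x20, 0x00, 0x00, 0x40, 0x00, 0x40, 0x11,
    0xae, 0xa5, 0xc6, 0x12, 0x00, 0x01, 0xc6, 0x12,
    0x00, 0x02, 0x00, 0x00, 0x4a, 0x38, 0x00, 0x0c,
    0x29, 0x82, 0x61, 0x62, 0x63, 0x64,
};

/*
 * Returns the byte offset of the L4 header for an untagged IPv4 frame,
 * or 0 if the buffer is too short or is not IPv4.
 */
static size_t ipv4_transport_offset(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    if (len < ETH_HLEN + 20)            /* need Ethernet + minimal IPv4 header */
        return 0;

    uint16_t ethertype = (uint16_t)((buf[12] << 8) | buf[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return 0;

    if ((buf[ETH_HLEN] >> 4) != 4)      /* IP version field */
        return 0;

    size_t ihl = (size_t)(buf[ETH_HLEN] & 0x0f) * 4;  /* IHL is in 32-bit words */
    if (ihl < 20 || ETH_HLEN + ihl > len)
        return 0;

    return ETH_HLEN + ihl;
}
```

For this packet, `ipv4_transport_offset(pkt, sizeof(pkt))` yields 34: the 14-byte Ethernet header plus a 20-byte IPv4 header (IHL = 5), compared with the 14 that skb_transport_offset returned in the failing case.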

Thanks,
Max