Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames
From: Saeed Mahameed <hidden>
Date: 2019-02-06 00:06:56
On Mon, 2019-02-04 at 19:13 -0800, David Ahern wrote:
On 2/4/19 3:53 AM, Jesper Dangaard Brouer wrote:
On Sat, 2 Feb 2019 14:27:26 -0700 David Ahern [off-list ref] wrote:
On 1/31/19 1:15 PM, Jesper Dangaard Brouer wrote:
David, Jesper, care to chime in where we ended up in that last thread discussing this?

IMHO packets RX and TX on a device need to be accounted, in standard counters, regardless of XDP. For XDP RX the packet is counted as RX, regardless of whether XDP chooses XDP_DROP. On XDP TX, which is via XDP_REDIRECT or XDP_TX, the driver that transmits the packet needs to account the packet in a TX counter (this is often delayed to DMA TX completion handling). We cannot break the expectation that RX and TX counters are visible to userspace stats tools. XDP should not make these packets invisible.

Agreed. What I was pushing on that last thread was Rx, Tx and dropped are all accounted by the driver in standard stats. Basically if the driver touched it, the driver's counters should indicate that.

Sounds like we all agree (except with the dropped counter, see below). Do notice that the mlx5 driver doesn't do this. It is actually rather confusing to use XDP on mlx5, as when XDP "consumes" the packet, which includes XDP_DROP, XDP_REDIRECT or XDP_TX, the driver standard stats are not incremented... the packet is invisible to "ifconfig" stat based tools.

mlx5 needs some work. As I recall it still has the bug/panic removing xdp programs - at least I don't recall seeing a patch for it.
Only when xdp_redirect targets mlx5 and the program is removed while a redirect is happening. This is actually due to a lack of synchronization means between different drivers. We have some ideas to overcome this using a standard XDP API, or we could just use a hack in the mlx5 driver, which I don't like:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c
I will be working on this towards the end of this week.
The push back was on dropped packets and whether that counter should be bumped on XDP_DROP.

My opinion is the XDP_DROP action should NOT increment the driver's drop counter. First of all, the "dropped" counter is also used for other things, which would confuse what this counter expresses. Second, choosing XDP_DROP is a policy choice; it still means the packet was RX-ed at the driver level.

Understood. Hopefully in March I will get some time to come back to this and propose an idea on what I would like to see - namely, the admin has a config option at load time to enable driver counters versus custom map counters (meaning the operator of the node chooses standard stats over strict performance). But of course that means the drivers have the code to collect those stats.
So bottom line:

1) The driver will count rx packets as rx-ed packets regardless of the XDP decision.

2) The driver should keep track of XDP decision statistics and report them in ethtool and in the new API suggested by David. Should it track even XDP_PASS?

Maybe instead of having all drivers track the statistics on their own, we should move the responsibility to an upper layer. Idea: since we already have an rxq_info structure per XDP ring (no false sharing), and it is available per xdp_buff, we can do:
+++ b/include/linux/filter.h
@@ -651,7 +651,9 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
 	 * already takes rcu_read_lock() when fetching the program, so
 	 * it's not necessary here anymore.
 	 */
-	return BPF_PROG_RUN(prog, xdp);
+	u32 ret = BPF_PROG_RUN(prog, xdp);
+	xdp->xdp_rxq_info.stats[ret]++;
+	return ret;
 }
Still, we need a way (API) to report the rxq_info to whoever needs to
read the current XDP stats.
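
To make the idea concrete, here is a minimal userspace sketch, not kernel code. Everything named *_sketch, run_xdp_counted() and xdp_stat_sum() is hypothetical and only stands in for the per-ring rxq_info, bpf_prog_run_xdp() and the missing reporting API. One detail the sketch assumes is needed: a BPF program can return any u32, so the verdict is clamped into an "unknown" slot before it is used as an array index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as the kernel UAPI enum xdp_action. */
enum xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

#define XDP_ACT_MAX (XDP_REDIRECT + 1)

/* Hypothetical: per-ring counters, one slot per verdict plus "unknown". */
struct rxq_info_sketch {
	uint64_t stats[XDP_ACT_MAX + 1];
};

/* Hypothetical stand-in for an attached XDP program. */
typedef uint32_t (*xdp_prog_fn)(const void *pkt);

/* Run the program and count its verdict on the ring it ran on. */
static uint32_t run_xdp_counted(xdp_prog_fn prog, const void *pkt,
				struct rxq_info_sketch *rxq)
{
	uint32_t ret = prog(pkt);
	/* Out-of-range verdicts land in the last slot instead of overflowing. */
	uint32_t idx = ret < XDP_ACT_MAX ? ret : XDP_ACT_MAX;

	rxq->stats[idx]++;
	return ret;
}

/* Hypothetical reader: sum one verdict over all rings of a device. */
static uint64_t xdp_stat_sum(const struct rxq_info_sketch *rxqs, int nrxq,
			     uint32_t act)
{
	uint64_t sum = 0;

	assert(act <= XDP_ACT_MAX);
	for (int i = 0; i < nrxq; i++)
		sum += rxqs[i].stats[act];
	return sum;
}

/* Two toy programs: one always drops, one returns an invalid verdict. */
static uint32_t prog_drop(const void *pkt)
{
	(void)pkt;
	return XDP_DROP;
}

static uint32_t prog_bogus(const void *pkt)
{
	(void)pkt;
	return 1234;
}
```

Because each ring only touches its own counters, the hot path stays free of shared cache lines, and the reader pays the cost of summing across rings.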
3) Unrelated: in the non-XDP case, if skb allocation fails or the driver
fails to pass the skb up to the stack for some reason, should the driver
increase rx packets? IMHO the answer should be yes if we want to have
similar behavior between the XDP and non-XDP cases.
But this could result in netdev->stats.rx_packets +
netdev->stats.rx_dropped being more than the actual number of rx-ed
packets. Is this acceptable?
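
A small sketch of the arithmetic behind that question, using a made-up rx_stats_sketch struct in place of netdev->stats. It assumes the rule from 3): rx_packets is bumped for every frame the driver saw, and rx_dropped is bumped in addition when the frame never reached the stack.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two netdev->stats fields discussed here. */
struct rx_stats_sketch {
	uint64_t rx_packets;
	uint64_t rx_dropped;
};

/* Account one received frame; delivered == 0 models a failed skb alloc. */
static void account_rx(struct rx_stats_sketch *s, int delivered)
{
	s->rx_packets++;
	if (!delivered)
		s->rx_dropped++;
}

/* Under this rule drops are a subset of rx_packets, so subtract them. */
static uint64_t rx_delivered(const struct rx_stats_sketch *s)
{
	assert(s->rx_dropped <= s->rx_packets);
	return s->rx_packets - s->rx_dropped;
}
```

With 10 frames on the wire and 3 failed allocations, the counters read rx_packets = 10 and rx_dropped = 3. A tool that adds the two would report 13, while the number that reached the stack is rx_packets - rx_dropped = 7.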