Thread (27 messages), 8 authors, 2019-06-21

Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames

From: Saeed Mahameed <hidden>
Date: 2019-02-06 00:06:56

On Mon, 2019-02-04 at 19:13 -0800, David Ahern wrote:
On 2/4/19 3:53 AM, Jesper Dangaard Brouer wrote:
On Sat, 2 Feb 2019 14:27:26 -0700
David Ahern [off-list ref] wrote:
On 1/31/19 1:15 PM, Jesper Dangaard Brouer wrote:
David, Jesper, care to chime in on where we ended up in that last
thread discussing this?
IMHO packets RX and TX on a device need to be accounted in the standard
counters, regardless of XDP.  For XDP RX the packet is counted as RX,
regardless of whether XDP chooses XDP_DROP.  On XDP TX, which is via
XDP_REDIRECT or XDP_TX, the driver that transmits the packet needs to
account the packet in a TX counter (this is often delayed to DMA TX
completion handling).  We cannot break the expectation that RX and TX
counters are visible to userspace stats tools.  XDP should not make
these packets invisible.
Agreed. What I was pushing on that last thread was Rx, Tx and dropped
are all accounted by the driver in standard stats. Basically if the
driver touched it, the driver's counters should indicate that.
Sounds like we all agree (except on the dropped counter, see below).

Do notice that the mlx5 driver doesn't do this.  It is actually rather
confusing to use XDP on mlx5, as when XDP "consumes" the packet, which
includes XDP_DROP, XDP_REDIRECT or XDP_TX, the driver's standard stats
are not incremented... the packet is invisible to "ifconfig" stat based
tools.
mlx5 needs some work. As I recall it still has the bug/panic removing
xdp programs - at least I don't recall seeing a patch for it.
That only happens when redirecting (XDP_REDIRECT) to mlx5 and removing
the program while the redirect is in flight. It is actually due to the
lack of a synchronization mechanism between different drivers. We have
some ideas to overcome this using a standard XDP API, or we could just
use a hack in the mlx5 driver, which I don't like:

https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c

I will be working on this towards the end of this week.
The push back was on dropped packets and whether that counter should be
bumped on XDP_DROP.
My opinion is the XDP_DROP action should NOT increment the driver's
drop counter.  First of all, the "dropped" counter is also used for
other stuff, which would confuse what this counter expresses.  Second,
choosing XDP_DROP is a policy choice; it still means the packet was
RX-ed at the driver level.
Understood. Hopefully in March I will get some time to come back to
this and propose an idea on what I would like to see - namely, the
admin has a config option at load time to enable driver counters versus
custom map counters. (meaning the operator of the node chooses standard
stats over strict performance.) But of course that means the drivers
have the code to collect those stats.
So bottom line:
1) Driver will count rx packets as rx-ed packets regardless of XDP
decision.

2) Driver should keep track of XDP decision statistics and report them
in ethtool and in the new API suggested by David. Should it track even
XDP_PASS?

Maybe instead of having all drivers track the statistics on their own,
we should move the responsibility to an upper layer.

Idea: since we already have an rxq_info structure per XDP ring (so no
false sharing), and it is reachable from every xdp_buff, we can do:
+++ b/include/linux/filter.h
@@ -651,7 +651,9 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
         * already takes rcu_read_lock() when fetching the program, so
         * it's not necessary here anymore.
         */
-       return BPF_PROG_RUN(prog, xdp);
+       u32 ret = BPF_PROG_RUN(prog, xdp);
+       xdp->xdp_rxq_info.stats[ret]++;
+       return ret;
 }

We still need a way (an API) to expose the rxq_info stats to whoever
needs to read the current XDP stats.
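
A minimal user-space sketch of the idea above, for illustration only.
Everything in it is an assumption: `struct rxq_model` stands in for a
per-ring rxq_info with a hypothetical `stats` array (no such field
exists in the kernel structure today), `prog_fn` stands in for a BPF
program, and `sum_verdict` is one possible shape for the missing read
side. Unlike the diff, it clamps the return value before indexing,
since a program can return any u32.

```c
#include <stddef.h>
#include <stdint.h>

/* Verdicts in the same order as enum xdp_action; V_UNKNOWN is a
 * hypothetical extra slot for any other return value. */
enum verdict { V_ABORTED, V_DROP, V_PASS, V_TX, V_REDIRECT, V_UNKNOWN, V_NR };

/* Hypothetical stand-in for a per-ring rxq_info carrying counters. */
struct rxq_model {
	uint64_t stats[V_NR];
};

/* Hypothetical stand-in for an XDP program. */
typedef uint32_t (*prog_fn)(const void *pkt, size_t len);

/* Map a raw return code to a valid stats slot. */
uint32_t verdict_slot(uint32_t ret)
{
	return ret < (uint32_t)V_UNKNOWN ? ret : (uint32_t)V_UNKNOWN;
}

/* Bump the per-ring counter for this verdict and pass the verdict on. */
uint32_t count_verdict(struct rxq_model *rxq, uint32_t ret)
{
	rxq->stats[verdict_slot(ret)]++;
	return ret;
}

/* Counterpart of the patched bpf_prog_run_xdp(): run, count, return. */
uint32_t run_and_count(struct rxq_model *rxq, prog_fn prog,
		       const void *pkt, size_t len)
{
	return count_verdict(rxq, prog(pkt, len));
}

/* Read side: sum one verdict over all rings of a device. */
uint64_t sum_verdict(const struct rxq_model *rxqs, size_t nr, uint32_t verdict)
{
	uint64_t total = 0;

	for (size_t i = 0; i < nr; i++)
		total += rxqs[i].stats[verdict_slot(verdict)];
	return total;
}

/* Example program: drop frames shorter than an Ethernet header. */
uint32_t drop_runts(const void *pkt, size_t len)
{
	(void)pkt;
	return len < 14 ? V_DROP : V_PASS;
}
```

Because each ring owns its counters, the hot path needs no atomics; the
reader pays the cost of summing across rings instead.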

3) Unrelated: in the non-XDP case, if skb allocation fails or the
driver fails to pass the skb up to the stack for some reason, should
the driver increase rx packets? IMHO the answer should be yes if we
want to have similar behavior between the XDP and non-XDP cases.

But this could result in netdev->stats.rx_packets +
netdev->stats.rx_dropped being more than the actual number of rx-ed
packets. Is this acceptable?
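
To make the arithmetic concrete, here is a small sketch comparing the
two accounting policies. The struct and the names `account_inclusive`
and `account_exclusive` are hypothetical and only model the two
counters; this is not driver code.

```c
#include <stdint.h>

/* Model of the two netdev counters under discussion. */
struct rx_counters {
	uint64_t rx_packets;
	uint64_t rx_dropped;
};

/* Policy from point 3: every received frame bumps rx_packets, and a
 * frame the driver failed to deliver also bumps rx_dropped. */
void account_inclusive(struct rx_counters *c, int delivered)
{
	c->rx_packets++;
	if (!delivered)
		c->rx_dropped++;
}

/* Alternative: rx_packets counts only frames that reached the stack. */
void account_exclusive(struct rx_counters *c, int delivered)
{
	if (delivered)
		c->rx_packets++;
	else
		c->rx_dropped++;
}
```

With 10 frames received and 3 failed skb allocations, the inclusive
policy gives rx_packets = 10 and rx_dropped = 3, so the sum is 13 for
10 frames on the wire. The exclusive policy gives 7 + 3 = 10, but then
rx_packets no longer matches what XDP-mode accounting would report.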



