Re: [PATCH v2 net-next 00/26] net: introduce and use generic XDP stats
From: Toke Høiland-Jørgensen <hidden>
Date: 2021-11-30 16:18:14
Also in:
bpf, linux-doc, lkml, netdev, virtualization
Alexander Lobakin [off-list ref] writes:
From: Alexander Lobakin <redacted> Date: Tue, 23 Nov 2021 17:39:29 +0100 Ok, open questions: 1. Channels vs queues vs global. Jakub: no per-channel. David (Ahern): it's worth it to separate as Rx/Tx. Toke is fine with globals at the end I think?
Well, I don't like throwing data away, so in that sense I do like per-queue stats, but it's not a very strong preference (i.e., I can live with either)...
My point was that for most of the systems we have 1:1 Rx:Tx (usually num_online_cpus()), so asking drivers separately for the number of RQs and then SQs would end up asking for the same number twice. But the main reason TBH was that most of the drivers store stats on a per-channel basis and I didn't want them to regress in functionality. I'm fine with reporting only netdev-wide if everyone are. In case if we keep per-channel: report per-channel only by request and cumulative globals by default to not flood the output?
... however if we do go with per-channel stats I do agree that they shouldn't be in the default output. I guess netlink could still split them out and iproute2 could just sum them before display?
2. Count all errors as "drops" vs separately.
Daniel: account everything as drops, plus errors should be
reported as exceptions for tracing sub.
Jesper: we shouldn't mix drops and errors.
My point: we shouldn't, that's why there are patches for 2 drivers
to give errors a separate counter.
I provided an option either to report all errors together ('errors'
in stats structure) or to provide individual counters for each of
them (sonamed ctrs), but personally prefer detailed errors. However,
they might "go detailed" under trace_xdp_exception() only, sound
fine (OTOH in RTNL stats we have both "general" errors and detailed
error counters).I agree it would be nice to have a separate error counter, but a single counter is enough when combined with the tracepoints.
3. XDP and XSK ctrs separately or not.
My PoV is that those are two quite different worlds.
However, stats for actions on XSK really make a little sense since
99% of time we have xskmap redirect. So I think it'd be fine to just
expand stats structure with xsk_{rx,tx}_{packets,bytes} and count
the rest (actions, errors) together with XDP.A whole set of separate counters for XSK is certainly overkill. No strong preference as to whether they need a separate counter at all...
Rest: - don't create a separate `ip` command and report under `-s`; - save some RTNL skb space by skipping zeroed counters. Also, regarding that I count all on the stack and then add to the storage once in a polling cycle -- most drivers don't do that and just increment the values in the storage directly, but this can be less performant for frequently updated stats (or it's just my embedded past). Re u64 vs u64_stats_t -- the latter is more universal and architecture-friendly, the former is used directly in most of the drivers primarily because those drivers and the corresponding HW are being run on 64-bit systems in the vast majority of cases, and Ethtools stats themselves are not so critical to guard them with anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess?
I'm generally a fan of correctness first, so since you're touching all the drivers anyway why I'd say go for u64_stats_t :) -Toke