Re: Let's do P4
From: Jakub Kicinski <hidden>
Date: 2016-10-30 18:44:54
On Sun, 30 Oct 2016 19:01:03 +0100, Jiri Pirko wrote:
> Sun, Oct 30, 2016 at 06:45:26PM CET, kubakici@wp.pl wrote:
>> On Sun, 30 Oct 2016 17:38:36 +0100, Jiri Pirko wrote:
>>> Sun, Oct 30, 2016 at 11:26:49AM CET, tgraf@suug.ch wrote:
>>>> [...]
>>> Agreed.
>>
>> Just to clarify, my intention here was not to suggest the use of eBPF as the IR. I was merely cautioning against bundling the new API with P4, for multiple reasons. As John mentioned, the P4 spec was evolving in the past. The spec is designed for HW more capable than the switch ASICs we have today. As vendors move to provide more configurability we may need to extend the API beyond P4. We may want to extend this API for SW hand-offs (as suggested by Thomas), which are not part of the P4 spec. Also, John showed examples of matchd software which already uses P4 at the frontend today and translates it to different targets (eBPF, u32, HW). It may just be about the naming, but I feel like calling the new API something more generic, switch AST or some such, may help to avoid unnecessary ties and confusion.
>
> Well, that basically means to create "something" that could be used to translate p4 source to. Not sure what exactly this "something" should look like and how different it would be from p4. I thought it might be good to benefit from the p4 definition and use it directly. Not sure.

We have to translate the P4 into "something" already; that something is the AST we will load into the kernel. Or were you planning to use some official P4 AST? I'm not suggesting we add our own high-level language. I agree that P4 is a good starting point, and perhaps a good high-level language. I'm just cautious of creating an equivalence between a high-level language (P4) and the kernel ABI. Perhaps I'm just wasting everyone's time with this.
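
To illustrate what I mean by keeping the kernel ABI separate from the high-level language, here is a rough user-space sketch of a generic "switch AST" blob check. Everything in it is an assumption made up for illustration: the `sw_ast_*` names, the node kinds and the TLV-style layout are not from P4 or from any existing kernel code. The only point is that a node kind such as a SW hand-off, which P4 cannot express, can be added without touching the frontend language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node kinds for a generic "switch AST" (names are illustrative). */
enum sw_ast_kind {
	SW_AST_PARSER = 1,
	SW_AST_TABLE,
	SW_AST_ACTION,
	SW_AST_SW_HANDOFF,	/* an extension with no P4 counterpart */
	SW_AST_KIND_MAX,
};

/* Hypothetical TLV-style node header; a payload of len - 4 bytes follows it. */
struct sw_ast_node {
	uint16_t kind;
	uint16_t len;		/* total node length in bytes, header included */
};

/* Walk the blob and return the number of nodes, or -1 if it is malformed. */
static int sw_ast_validate(const uint8_t *buf, size_t size)
{
	size_t off = 0;
	int count = 0;

	assert(buf != NULL || size == 0);

	while (off < size) {
		struct sw_ast_node hdr;

		if (size - off < sizeof(hdr))
			return -1;	/* truncated header */
		memcpy(&hdr, buf + off, sizeof(hdr));

		if (hdr.kind == 0 || hdr.kind >= SW_AST_KIND_MAX)
			return -1;	/* unknown node kind */
		if (hdr.len < sizeof(hdr) || hdr.len > size - off)
			return -1;	/* length does not fit the blob */

		off += hdr.len;
		count++;
	}
	return count;
}
```

A P4 compiler would be just one possible producer of such a blob; the validator neither knows nor cares which frontend generated it.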

>>> Exactly. Following drawing shows p4 pipeline setup for SW and Hw:
>>>
>>>                                  |
>>>                                  |               +--> ebpf engine
>>>                                  |               |
>>>                                  |               |
>>>                                  |           compilerB
>>>                                  |               ^
>>>                                  |               |
>>> p4src --> compilerA --> p4ast --TCNL--> cls_p4 --+-> driver -> compilerC -> HW
>>>                                  |
>>>                        userspace | kernel
>>>                                  |
>>>
>>> Now please consider runtime API for rule insertion/removal/stats/etc.
>>> Also, the single API is cls_p4 here:
>>>
>>>                                  |
>>>                                  |
>>>                                  |
>>>                                  |
>>>                                  |        ebpf map fillup
>>>                                  |               ^
>>>                                  |               |
>>>                       p4 rule --TCNL--> cls_p4 --+-> driver -> HW table fillup
>>>                                  |
>>>                        userspace | kernel
>>
>> My understanding was that the main purpose of SW eBPF translation would be to piggy back on eBPF userspace map API. This seems not to be the case here? Is "P4 rule" being added via some new API? From performance
>
> cls_p4 TC classifier.

Oh, so cls_p4 is just a proxy forwarding the requests to drivers or the eBPF backend. Got it. Sorry for being slow. And do the requests come down via the change() op or something new? I wonder how such a scheme compares to eBPF maps performance-wise (updates/sec).
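
To check that I read the proxy idea correctly, here is a minimal user-space sketch of a classifier that only fans a rule out to its backends. All names (`struct p4_rule`, `struct p4_backend`, `cls_p4_rule_add`) are hypothetical and none of this is existing kernel code. It also assumes a rule is offered to every backend; whether it should go to both or to only one of them is exactly the kind of policy still open in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule as it might arrive over TC netlink (fields are illustrative). */
struct p4_rule {
	unsigned int table_id;
	unsigned int key;
	unsigned int action;
};

/* Hypothetical backend: one instance for the eBPF engine, one for the driver. */
struct p4_backend {
	const char *name;
	int (*rule_add)(struct p4_backend *be, const struct p4_rule *rule);
	unsigned int nrules;	/* stands in for real table state */
};

/* SW path: stands in for "ebpf map fillup" in the drawing. */
static int sw_rule_add(struct p4_backend *be, const struct p4_rule *rule)
{
	(void)rule;
	be->nrules++;
	return 0;
}

/* HW path: stands in for "HW table fillup"; pretend only tables 0..3 exist. */
static int hw_rule_add(struct p4_backend *be, const struct p4_rule *rule)
{
	if (rule->table_id > 3)
		return -1;
	be->nrules++;
	return 0;
}

/*
 * The proxy itself: offer the rule to every backend.
 * Returns the number of backends that accepted it, or -1 if none did.
 */
static int cls_p4_rule_add(struct p4_backend **backends, size_t n,
			   const struct p4_rule *rule)
{
	int accepted = 0;

	for (size_t i = 0; i < n; i++) {
		assert(backends[i]->rule_add != NULL);
		if (backends[i]->rule_add(backends[i], rule) == 0)
			accepted++;
	}
	return accepted ? accepted : -1;
}
```

In this shape the updates/sec question comes down to how much work each `rule_add` callback does per call, which is what I would like to compare against a plain eBPF map update.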

>> perspective the SW AST implementation would probably not be any slower than u32, so I don't think we need eBPF for performance. I must be misreading this, if we want eBPF fallback we must extend eBPF with all the map types anyway... so we could just use eBPF map API? I believe John has already done some work in this space (see his GitHub :))
>
> I don't think you can use existing BPF maps kernel API. You would still have to have another API just for the offloaded datapath. And that is a bypass. I strongly believe we need a single kernel API for both SW and HW datapath setup and runtime configuration.

Agreed, single API is a must. What is the HW characteristic which doesn't fit with eBPF map API, though? For eBPF offload I was planning on adding offload hooks on eBPF map lookup/update paths and a way of associating the map with a netdev. This should be enough to forward updates to the driver and intercept reads to return the right statistics.
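
Roughly what I have in mind, as a user-space sketch only. The names (`struct sketch_map`, `struct map_offload_ops`, the toy driver) are made up for illustration and this is not the actual eBPF map code: the map carries an optional device pointer, updates are mirrored to the driver, and lookups are answered by the device when one is attached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SLOTS 8

struct netdev;	/* stand-in for struct net_device */

/* Hypothetical per-device hooks called from the map update/lookup paths. */
struct map_offload_ops {
	int (*update)(struct netdev *dev, uint32_t key, uint64_t value);
	int (*lookup)(struct netdev *dev, uint32_t key, uint64_t *value);
};

struct netdev {
	const struct map_offload_ops *ops;
	uint64_t hw_table[MAP_SLOTS];	/* pretend device table */
};

/* Toy array map; dev is NULL when the map is not associated with a device. */
struct sketch_map {
	uint64_t sw_table[MAP_SLOTS];
	struct netdev *dev;
};

static int sketch_map_update(struct sketch_map *map, uint32_t key, uint64_t value)
{
	if (key >= MAP_SLOTS)
		return -1;
	map->sw_table[key] = value;
	/* Hook on the update path: forward the write to the driver. */
	if (map->dev && map->dev->ops && map->dev->ops->update)
		return map->dev->ops->update(map->dev, key, value);
	return 0;
}

static int sketch_map_lookup(struct sketch_map *map, uint32_t key, uint64_t *value)
{
	assert(value != NULL);
	if (key >= MAP_SLOTS)
		return -1;
	/* Hook on the lookup path: let the device return its own counters. */
	if (map->dev && map->dev->ops && map->dev->ops->lookup)
		return map->dev->ops->lookup(map->dev, key, value);
	*value = map->sw_table[key];
	return 0;
}

/* Toy driver that simply stores and returns values from its table. */
static int toy_update(struct netdev *dev, uint32_t key, uint64_t value)
{
	dev->hw_table[key] = value;
	return 0;
}

static int toy_lookup(struct netdev *dev, uint32_t key, uint64_t *value)
{
	*value = dev->hw_table[key];
	return 0;
}

static const struct map_offload_ops toy_ops = { toy_update, toy_lookup };
```

With the map tied to a netdev like this, userspace keeps using the one map API, and a map without a device behaves exactly as it does today.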