Re: [PATCH net-next] docs: netlink: clarify the historical baggage of Netlink flags
From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: 2022-10-02 13:59:56
On Fri, Sep 30, 2022 at 2:19 PM Nikolay Aleksandrov [off-list ref] wrote:
> On 30/09/2022 19:36, Jamal Hadi Salim wrote:
>> On Fri, Sep 30, 2022 at 10:34 AM Nikolay Aleksandrov [off-list ref] wrote:
[..]
>> You only have one object type per netlink request though, i.e. you don't have fdb and mdb objects in the same message?
> Yep, it is object-type and family-specific, as is the call itself.
Ok, so that makes it easier. [..]
>> Isn't it sufficient to indicate which objects need to be deleted based on the presence of TLVs or the service header for that object?
> That was my initial proposal for the fdbs. :) When the flush attribute was present it would act on it (and filter based on embedded filters). The only non-intuitive part was that it happened through SETLINK (changelink), which is a bit strange for a delete op.
>>>> Really NLM_F_ROOT and _MATCH are sufficient. The filtering expression is the challenge.
>>> NLM_F_ROOT isn't usable for a DEL expression because its bit is already used by NLM_F_NONREC and it wouldn't be nice to change the meaning of the bit based on the subsystem. NLM_F_MATCH's bit actually matches NLM_F_BULK :)
>> Ouch. Ok, it got messy over time, I guess. We probably should have spent more time discussing NLM_F_NONREC since it has a single user with a very specific need and it got imposed on all. I get your point - I am still not sure if a global flag is the right answer.
> Personally, I prefer the complete netlink approach (TLVs describing the operation and filters). In the end the flag was close enough: I kept all of the family-specific code the same, just the entry point was different, and other families could use it as a modifier to their del commands.
BTW, it seems that nftables is an outlier. You should still be able to use the NLM_F_ROOT acronym for DELETE. act_api uses NLM_F_ROOT on delete to flush the whole table of actions. My git-archaeology-foo says since 2005. NLM_F_NONREC was added in 2017. So you really should just be able to use NLM_F_ROOT to check for a delete of the whole table, and a TLV specific to the service to filter further.
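To make the bit overlap discussed above concrete, here is a small sketch (not kernel code). The numeric values are the ones in include/uapi/linux/netlink.h; the decode_flags() helper and the "get"/"delete" keys are made up for illustration only.

```python
# Flag values as defined in include/uapi/linux/netlink.h.
NLM_F_ROOT = 0x100    # GET modifier: specify tree root
NLM_F_MATCH = 0x200   # GET modifier: return all matching
NLM_F_NONREC = 0x100  # DELETE modifier: do not delete recursively
NLM_F_BULK = 0x200    # DELETE modifier: delete multiple objects

# Hypothetical lookup tables: which names a bit carries per request kind.
FLAG_NAMES = {
    "get": {NLM_F_ROOT: "NLM_F_ROOT", NLM_F_MATCH: "NLM_F_MATCH"},
    "delete": {NLM_F_NONREC: "NLM_F_NONREC", NLM_F_BULK: "NLM_F_BULK"},
}


def decode_flags(kind, flags):
    """Return the modifier names set in `flags` for the given request kind."""
    return [name for bit, name in sorted(FLAG_NAMES[kind].items()) if flags & bit]


# The same two bits read differently depending on the request kind.
same_bits = 0x300
as_get = decode_flags("get", same_bits)
as_delete = decode_flags("delete", same_bits)
```

So a subsystem that tests NLM_F_ROOT on a DELETE (as act_api does to flush) and one that tests NLM_F_NONREC on a DELETE are looking at the same bit, which is why the meaning ends up per-subsystem.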
>>> Sometime back I played with a different idea - expressing the filters with the existing TLV objects so whatever can be specified by user-space can also be used as a filter (also for filtering dump requests) with some introspection. The Lua idea sounds nice though.
>> So what is the content of the TLV in that case?
> My first approach, which wasn't using BPF, used the TLV type to define specific filters on the various types, incl. binary (which at the time was only an exact match, could be improved though). BPF w/ BTF would be the obvious choice these days.
The filter TLVs are good because the rest of the world can use them. The challenge is expressibility. Like you say above, exact match is easy; inet diag has its own DSL to describe things, which could be easily extended. A solution like a Lua script is second best, and of course eBPF is not ruled out, but that requires more skills.
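A rough sketch of the "exact match is easy" case, assuming the filter is sent as ordinary netlink attributes (16-bit length, 16-bit type, payload padded to 4 bytes). The attribute type numbers and the exact_match() helper are hypothetical, not taken from any existing family.

```python
import struct

NLA_HDRLEN = 4  # size of struct nlattr: u16 nla_len + u16 nla_type

# Hypothetical attribute types, for illustration only.
ATTR_IFINDEX = 1
ATTR_VLAN = 2


def nla_align(length):
    """Round up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def pack_attr(attr_type, payload):
    """Encode one attribute: header, payload, zero padding."""
    length = NLA_HDRLEN + len(payload)  # nla_len excludes the padding
    pad = nla_align(length) - length
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * pad


def parse_attrs(buf):
    """Decode a flat attribute stream into {type: payload}."""
    attrs, off = {}, 0
    while off + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, off)
        if length < NLA_HDRLEN or off + length > len(buf):
            break  # malformed attribute, stop parsing
        attrs[attr_type] = buf[off + NLA_HDRLEN:off + length]
        off += nla_align(length)
    return attrs


def exact_match(filter_buf, obj_buf):
    """True if every attribute in the filter equals the object's attribute."""
    obj = parse_attrs(obj_buf)
    return all(obj.get(t) == v for t, v in parse_attrs(filter_buf).items())


entry = pack_attr(ATTR_IFINDEX, struct.pack("=I", 3)) + pack_attr(ATTR_VLAN, struct.pack("=H", 10))
vlan10 = pack_attr(ATTR_VLAN, struct.pack("=H", 10))
vlan11 = pack_attr(ATTR_VLAN, struct.pack("=H", 11))
```

Anything beyond equality (ranges, masks, boolean combinations) is where the expressibility problem starts, since each would need its own attribute semantics or a small DSL.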
>> I think eBPF may work with some acrobatics. We did try classical ebpf and it was messy. Note for scaling, this is not just about Delete and Get but also for generated events, where one can send the kernel a filter so they don't see a broadcast.
> Yeah, I remember CL having scaling issues in some user-space software that was snooping netlink messages and that's the reason I looked into filtering at that time.
They are related problems. When you have 1000s of potential events it just doesn't scale. My idea was to specify a filter to select a subset and then open multiple sockets, each specifying a different filter subset.

cheers,
jamal
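The multiple-sockets idea above can be modelled in a few lines. This is a pure user-space simulation, not an existing kernel interface: FilteredListener stands in for a socket with a filter attached, and deliver() stands in for the kernel's event broadcast. All names and the event fields are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class FilteredListener:
    """Stand-in for one socket plus the filter it registered."""
    name: str
    accepts: Callable[[Event], bool]
    received: List[Event] = field(default_factory=list)


def deliver(event, listeners):
    """Hand the event only to listeners whose filter selects it."""
    for listener in listeners:
        if listener.accepts(event):
            listener.received.append(event)


events = [
    {"family": "fdb", "ifindex": 3},
    {"family": "mdb", "ifindex": 3},
    {"family": "fdb", "ifindex": 7},
]

# Each listener asks for a different subset instead of the full broadcast.
fdb_only = FilteredListener("fdb-only", lambda e: e["family"] == "fdb")
port3_only = FilteredListener("port3-only", lambda e: e["ifindex"] == 3)

for ev in events:
    deliver(ev, [fdb_only, port3_only])
```

The point of the model is that filtering happens before delivery, so a listener never pays for events it did not ask for.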