Re: [PATCH net-next v6 06/15] ethtool: netlink bitset handling
From: Jiri Pirko <jiri@resnulli.us>
Date: 2019-07-03 11:49:39
Also in:
lkml
Tue, Jul 02, 2019 at 01:50:09PM CEST, mkubecek@suse.cz wrote:
quoted hunk ↗ jump to hunk
The ethtool netlink code uses common framework for passing arbitrary length bit sets to allow future extensions. A bitset can be a list (only one bitmap) or can consist of value and mask pair (used e.g. when client want to modify only some bits). A bitset can use one of two formats: verbose (bit by bit) or compact. Verbose format consists of bitset size (number of bits), list flag and an array of bit nests, telling which bits are part of the list or which bits are in the mask and which of them are to be set. In requests, bits can be identified by index (position) or by name. In replies, kernel provides both index and name. Verbose format is suitable for "one shot" applications like standard ethtool command as it avoids the need to either keep bit names (e.g. link modes) in sync with kernel or having to add an extra roundtrip for string set request (e.g. for private flags). Compact format uses one (list) or two (value/mask) arrays of 32-bit words to store the bitmap(s). It is more suitable for long running applications (ethtool in monitor mode or network management daemons) which can retrieve the names once and then pass only compact bitmaps to save space. Userspace requests can use either format and ETHTOOL_RF_COMPACT flag in request header tells kernel which format to use in reply. Notifications always use compact format. Signed-off-by: Michal Kubecek <redacted> --- Documentation/networking/ethtool-netlink.txt | 61 ++ include/uapi/linux/ethtool_netlink.h | 35 ++ net/ethtool/Makefile | 2 +- net/ethtool/bitset.c | 606 +++++++++++++++++++ net/ethtool/bitset.h | 40 ++ net/ethtool/netlink.h | 9 + 6 files changed, 752 insertions(+), 1 deletion(-) create mode 100644 net/ethtool/bitset.c create mode 100644 net/ethtool/bitset.hdiff --git a/Documentation/networking/ethtool-netlink.txt b/Documentation/networking/ethtool-netlink.txt index 97c369aa290b..4636682c551f 100644 --- a/Documentation/networking/ethtool-netlink.txt +++ b/Documentation/networking/ethtool-netlink.txt@@ -73,6 +73,67 @@ set, the behaviour is the same as (or closer to) the behaviour before it wasintroduced. +Bit sets +-------- + +For short bitmaps of (reasonably) fixed length, standard NLA_BITFIELD32 type +is used. For arbitrary length bitmaps, ethtool netlink uses a nested attribute +with contents of one of two forms: compact (two binary bitmaps representing +bit values and mask of affected bits) and bit-by-bit (list of bits identified +by either index or name). + +Compact form: nested (bitset) atrribute contents: + + ETHTOOL_A_BITSET_LIST (flag) no mask, only a list + ETHTOOL_A_BITSET_SIZE (u32) number of significant bits + ETHTOOL_A_BITSET_VALUE (binary) bitmap of bit values + ETHTOOL_A_BITSET_MASK (binary) bitmap of valid bits + +Value and mask must have length at least ETHTOOL_A_BITSET_SIZE bits rounded up +to a multiple of 32 bits. They consist of 32-bit words in host byte order,
Looks like the blocks are similar to NLA_BITFIELD32. Why don't you user nested array of NLA_BITFIELD32 instead?
+words ordered from least significant to most significant (i.e. the same way as +bitmaps are passed with ioctl interface). + +For compact form, ETHTOOL_A_BITSET_SIZE and ETHTOOL_A_BITSET_VALUE are +mandatory. Similar to BITFIELD32, a compact form bit set requests to set bits
Double space^^
+in the mask to 1 (if the bit is set in value) or 0 (if not) and preserve the +rest. If ETHTOOL_A_BITSET_LIST is present, there is no mask and bitset +represents a simple list of bits.
Okay, that is a bit confusing. Why not to rename to something like: ETHTOOL_A_BITSET_NO_MASK (flag) ?
+ +Kernel bit set length may differ from userspace length if older application is +used on newer kernel or vice versa. If userspace bitmap is longer, an error is +issued only if the request actually tries to set values of some bits not +recognized by kernel. + +Bit-by-bit form: nested (bitset) attribute contents: + + ETHTOOL_A_BITSET_LIST (flag) no mask, only a list + ETHTOOL_A_BITSET_SIZE (u32) number of significant bits + ETHTOOL_A_BITSET_BIT (nested) array of bits + ETHTOOL_A_BITSET_BIT+ (nested) one bit + ETHTOOL_A_BIT_INDEX (u32) bit index (0 for LSB) + ETHTOOL_A_BIT_NAME (string) bit name + ETHTOOL_A_BIT_VALUE (flag) present if bit is set + +Bit size is optional for bit-by-bit form. ETHTOOL_A_BITSET_BITS nest can only +contain ETHTOOL_A_BITS_BIT attributes but there can be an arbitrary number of +them. A bit may be identified by its index or by its name. When used in +requests, listed bits are set to 0 or 1 according to ETHTOOL_A_BIT_VALUE, the +rest is preserved. A request fails if index exceeds kernel bit length or if +name is not recognized. + +When ETHTOOL_A_BITSET_LIST flag is present, bitset is interpreted as a simple +bit list. ETHTOOL_A_BIT_VALUE attributes are not used in such case. Bit list +represents a bitmap with listed bits set and the rest zero. + +In requests, application can use either form. Form used by kernel in reply is +determined by a flag in flags field of request header. Semantics of value and +mask depends on the attribute. General idea is that flags control request +processing, info_mask control which parts of the information are returned in +"get" request and index identifies a particular subcommand or an object to +which the request applies.
This is quite complex and confusing. Having the same API for 2 APIs is
odd. The API should be crystal clear, easy to use.
Why can't you have 2 commands, one working with bit arrays only, one
working with strings? Something like:
X_GET
ETHTOOL_A_BITS (nested)
ETHTOOL_A_BIT_ARRAY (BITFIELD32)
X_NAMES_GET
ETHTOOL_A_BIT_NAMES (nested)
ETHTOOL_A_BIT_INDEX
ETHTOOL_A_BIT_NAME
For set, you can also have multiple cmds:
X_SET - to set many at once, by bit index
ETHTOOL_A_BITS (nested)
ETHTOOL_A_BIT_ARRAY (BITFIELD32)
X_ONE_SET - to set one, by bit index
ETHTOOL_A_BIT_INDEX
ETHTOOL_A_BIT_VALUE
X_ONE_SET - to set one, by name
ETHTOOL_A_BIT_NAME
ETHTOOL_A_BIT_VALUE
[...]