Thread (22 messages) 22 messages, 4 authors, 2024-10-24

RE: [PATCH net-next 3/8] lib: packing: add pack_fields() and unpack_fields()

From: "Keller, Jacob E" <jacob.e.keller@intel.com>
Date: 2024-10-16 22:31:33
Also in: lkml

-----Original Message-----
From: Vladimir Oltean <olteanv@gmail.com>
Sent: Wednesday, October 16, 2024 6:41 AM
To: Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>
Cc: Keller, Jacob E <jacob.e.keller@intel.com>; linux-kernel@vger.kernel.org;
Andrew Morton [off-list ref]; Eric Dumazet
[off-list ref]; Jakub Kicinski [off-list ref]; Paolo Abeni
[off-list ref]; Nguyen, Anthony L [off-list ref];
netdev@vger.kernel.org; Vladimir Oltean [off-list ref]
Subject: Re: [PATCH net-next 3/8] lib: packing: add pack_fields() and
unpack_fields()

On Wed, Oct 16, 2024 at 03:02:38PM +0200, Przemek Kitszel wrote:
quoted
On 10/11/24 20:48, Jacob Keller wrote:
quoted
From: Vladimir Oltean <vladimir.oltean@nxp.com>

This is new API which caters to the following requirements:

- Pack or unpack a large number of fields to/from a buffer with a small
   code footprint. The current alternative is to open-code a large number
   of calls to pack() and unpack(), or to use packing() to reduce that
   number to half. But packing() is not const-correct.

- Use unpacked numbers stored in variables smaller than u64. This
   reduces the rodata footprint of the stored field arrays.

- Perform error checking at compile time, rather than at runtime, and
   return void from the API functions. To that end, we introduce
   CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
   fields. Note: the C preprocessor can't generate variable-length code
   (loops),  as would be required for array-style definitions of struct
   packed_field arrays. So the sanity checks use code generation at
   compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
   There are explicit macros for sanity-checking arrays of 1 packed
   field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
   fields. In practice, the sja1105 driver will actually need the variant
   with 40 fields. This isn't as bad as it seems: feeding a 39 entry
   sized array into the CHECK_PACKED_FIELDS_40() macro will actually
   generate a compilation error, so mistakes are very likely to be caught
   by the developer and thus are not a problem.

- Reduced rodata footprint for the storage of the packed field arrays.
   To that end, we have struct packed_field_s (small) and packed_field_m
   (medium). More can be added as needed (unlikely for now). On these
   types, the same generic pack_fields() and unpack_fields() API can be
   used, thanks to the new C11 _Generic() selection feature, which can
   call pack_fields_s() or pack_fields_m(), depending on the type of the
   "fields" array - a simplistic form of polymorphism. It is evaluated at
   compile time which function will actually be called.

Over time, packing() is expected to be completely replaced either with
pack() or with pack_fields().

Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
  include/linux/packing.h  |  69 ++++++++++++++++++++++
  lib/gen_packing_checks.c |  31 ++++++++++
  lib/packing.c            | 149
++++++++++++++++++++++++++++++++++++++++++++++-
quoted
quoted
  Kbuild                   |  13 ++++-
  4 files changed, 259 insertions(+), 3 deletions(-)
quoted
diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
new file mode 100644
index 000000000000..3213c858c2fe
--- /dev/null
+++ b/lib/gen_packing_checks.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+
+int main(int argc, char **argv)
+{
+	printf("/* Automatically generated - do not edit */\n\n");
+	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
+	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
+
+	for (int i = 1; i <= 50; i++) {
either you missed my question, or I have missed your reply during
internal round of review, but:

do we need 50? that means 1MB file, while it is 10x smaller for n=27
That is partly why we generate the file instead of committing it. We could reduce this to 40, (or make it 40 once we add the sja1105 driver).

This would somewhat limit the size, at least until/unless another place in the code adds more fields to an array.
The sja1105 driver will need checks for arrays of 40 fields, it's in the
commit message. Though if the file size is going to generate comments
even at this initial dimension, then maybe it isn't the best way forward...

Suggestions for how to scale up the compile-time checks?

Originally the CHECK_PACKED_FIELD_OVERLAP() weren't the cartesian
product of all array elements. I just assumed that the field array would
be ordered from MSB to LSB. But then, Jacob wondered why the order isn't
from LSB to MSB. The presence/absence of the QUIRK_LSW32_IS_FIRST quirk
seems to influence the perception of which field layout is natural.
So the full-blown current overlap check is the compromise to use the
same sanity check macros everywhere. Otherwise, we'd have to introduce
CHECK_PACKED_FIELDS_5_UP() and CHECK_PACKED_FIELDS_5_DOWN(), and
although even that would be smaller in size than the full cartesian
product, it's harder to use IMO.
Another option would be to implement something external to C to validate the fields, perhaps something in sparse? Downside being that it is less likely to be checked, so more risk that bugs creep in.
quoted
quoted
+		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n",
i);
quoted
quoted
+		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len =
(pbuflen); \\\n");
quoted
quoted
+		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
+		for (int j = 0; j < i; j++) {
+			int final = (i == 1);
you could replace both @final variables and ternary operators from
the prints by simply moving the final "})\n" outside the loops
I don't see how, could you illustrate with some code? (assuming you're
not proposing to change the generated output?) Even if you move the
final "})\n" outside the loop, I'm not seeing how you would avoid
printing the last " \\" without keeping track of the "final" variable
anyway, point at which you're better off with the ternary than yet
another printf() call?
quoted
quoted
+
+			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
+			       j, final ? " })\n" : " \\");
+		}
+		for (int j = 1; j < i; j++) {
+			for (int k = 0; k < j; k++) {
+				int final = (j == i - 1) && (k == j - 1);
+
+				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d],
_f[%d]);%s\n",
quoted
quoted
+				       k, j, final ? " })\n" : " \\");
+			}
+		}
+	}
+
+	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
+}
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help