Thread: 18 messages, 3 authors, 2024-06-11

Re: [PATCH v4 6/8] fsverity: improve performance by using multibuffer hashing

From: Ard Biesheuvel <ardb@kernel.org>
Date: 2024-06-11 15:46:14
Also in: dm-devel, linux-arm-kernel, linux-crypto

On Tue, 11 Jun 2024 at 17:21, Herbert Xu [off-list ref] wrote:
> On Mon, Jun 10, 2024 at 09:42:58AM -0700, Eric Biggers wrote:
> > I understand that you think the ahash based API would make it easier to add
> > multibuffer support to "authenc(hmac(sha256),cbc(aes))" for IPsec, which seems
> > to be a very important use case for you (though it isn't relevant to nearly as
> > many systems as dm-verity and fsverity are).  Regardless, the reality is that it
> > would be much more difficult to take advantage of multibuffer crypto in the
> > IPsec authenc use case than in dm-verity and fsverity.  authenc uses multiple
> > underlying algorithms, AES-CBC and HMAC-SHA256, that would both have to use
> > multibuffer crypto in order to see a significant benefit, seeing as even if the
> > SHA-256 support could be wired up through HMAC-SHA256, encryption would be
> > bottlenecked on AES-CBC, especially on Intel CPUs.  It also looks like the IPsec
> > code would need a lot of updates to support multibuffer crypto.
>
> The linked-request thing feeds nicely into networking.  In fact
> that's where I got the idea of linking them from.  In networking
> a large GSO (currently limited to 64K but theoretically we could
> make it unlimited) packet is automatically split up into a linked
> list of MTU-sized skb's.
>
> Therefore if we switched to a linked-list API networking could
> give us the buffers with minimal changes.
>
> BTW, I found an old Intel paper that claims through their multi-
> buffer strategy they were able to make AES-CBC-XCBC beat AES-GCM.
> I wonder if we could still replicate this today:
>
> https://github.com/intel/intel-ipsec-mb/wiki/doc/fast-multi-buffer-ipsec-implementations-ia-processors-paper.pdf

This looks like the whitepaper that describes the buggy multibuffer
code that we ripped out.

> > Ultimately, I need to have dm-verity and fsverity be properly optimized in the
> > downstreams that are most relevant to me.  If you're not going to allow the
> > upstream crypto API to provide the needed functionality in a reasonable way,
> > then I'll need to shift my focus to getting this patchset into downstream
> > kernels such as Android and Chrome OS instead.
>
> I totally understand that this is your priority.  But please give
> me some time to see if we can devise something that works for both
> scenarios.

The issue here is that the CPU-based multibuffer approach has rather
tight constraints in terms of input length and the shared prefix, and
so designing a more generic API based on ahash doesn't help at all.
The Intel multibuffer code went off into the weeds entirely, attempting
to apply this parallel scheme to arbitrary combinations of inputs, so
this is something we know we should avoid.
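
To make the two constraints concrete, here is a rough userspace sketch in
Python. It is not the kernel API and not the code from this patchset:
hash_pair() is a hypothetical helper, and hashlib cannot interleave two
hashes at the instruction level, so the loop only mimics the lockstep
structure. It assumes the shared prefix is a salt absorbed once, and that
the parallel path is taken only when both inputs have the same length.

```python
import hashlib

BLOCK = 64  # SHA-256 block size in bytes


def hash_pair(salt: bytes, a: bytes, b: bytes) -> tuple[bytes, bytes]:
    """Hash salt||a and salt||b (hypothetical helper, illustration only)."""
    # Shared prefix: absorb the salt once, then clone the state per lane.
    base = hashlib.sha256(salt)
    lane_a, lane_b = base.copy(), base.copy()

    if len(a) != len(b):
        # Length constraint violated: fall back to two independent hashes.
        lane_a.update(a)
        lane_b.update(b)
        return lane_a.digest(), lane_b.digest()

    # Equal lengths: both lanes consume the same number of blocks, so they
    # can advance in lockstep. Real multibuffer code would interleave the
    # instructions of the two compression functions at this point.
    for off in range(0, len(a), BLOCK):
        lane_a.update(a[off:off + BLOCK])
        lane_b.update(b[off:off + BLOCK])
    return lane_a.digest(), lane_b.digest()
```

Two equal-sized data blocks with a common salt fit this shape directly.
Arbitrary combinations of inputs hit the fallback branch, which is where
a more generic scheme would have to start scheduling lanes itself.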