Thread (27 messages) 27 messages, 4 authors, 2017-10-18

[PATCH v9 00/20] simplify crypto wait for async op

From: gilad@benyossef.com (Gilad Ben-Yossef)
Date: 2017-10-18 05:44:36
Also in: dm-devel, keyrings, linux-cifs, linux-crypto, linux-fscrypt, linux-mediatek, linux-security-module, lkml, netdev

On Tue, Oct 17, 2017 at 5:06 PM, Russell King - ARM Linux
[off-list ref] wrote:
On Sun, Oct 15, 2017 at 10:19:45AM +0100, Gilad Ben-Yossef wrote:
quoted
Many users of kernel async. crypto services have a pattern of
starting an async. crypto op and than using a completion
to wait for it to end.

This patch set simplifies this common use case in two ways:

First, by separating the return codes of the case where a
request is queued to a backlog due to the provider being
busy (-EBUSY) from the case the request has failed due
to the provider being busy and backlogging is not enabled
(-EAGAIN).

Next, this change is than built on to create a generic API
to wait for a async. crypto operation to complete.

The end result is a smaller code base and an API that is
easier to use and more difficult to get wrong.

The patch set was boot tested on x86_64 and arm64 which
at the very least tests the crypto users via testmgr and
tcrypt but I do note that I do not have access to some
of the HW whose drivers are modified nor do I claim I was
able to test all of the corner cases.

The patch set is based upon linux-next release tagged
next-20171013.
Has there been any performance impact analysis of these changes?  I
ended up with patches for one of the crypto drivers which converted
its interrupt handling to threaded interrupts being reverted because
it caused a performance degredation.

Moving code to latest APIs to simplify it is not always beneficial.
I agree with the sentiment but I believe this one is justified.

This patch set basically does 3 things:

1.  Replace one immediate value (-EBUSY) by another (-EAGAIN). Mostly it's just
s/EBUSY/EAGAIN/g. In very few places this resulted very trivial code
changes. I don't
foresee this having any effect on performance.

2. Removal of some conditions and/or conditional jumps that were used to discern
between two different cases which are now now easily tested for by the
different return
value. If at all, this will be an increase in performance, although I
don't expect it to be
noticeable.

3. Replacing a whole bunch of open coded code and data structures
which were pretty much
cut and pasted from the Documentation and therefore identical, with a
single copy thereof.

Every place that I found that deviated slightly from the identical
pattern, it turned out to be
a bug of some sorts and patches for those were sent and accepted already.

So, we might be losing a few inline optimization opportunities but
we're gaining better
cache utilization. Again, I don't expect any of this to have a
noticeable effect to either
direction.

I did run the changed code as best I could and did not notice any
performance changes and
none of the testers and maintainers that ACKed mentioned any.

Having said that, it's a big change that touches many places,
sub-systems and drivers. I do
not claim to have thoroughly tested for performance all the changes in
person. In some cases,
I don't even have access to the specialized hardware. I did get a
reasonable amount of review
and testers I believe - would always love to see more :-)

Many thanks,
Gilad




-- 
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help