Thread (21 messages), 4 authors, 2014-08-22

Re: Looking for non-NIC hardware-offload for wpa2 decrypt.

From: Christian Lamparter <chunkeey@googlemail.com>
Date: 2014-08-14 12:39:09

On Tuesday, August 12, 2014 11:34:59 AM Ben Greear wrote:
On 08/10/2014 06:44 AM, Christian Lamparter wrote:
On Thursday, August 07, 2014 10:45:01 AM Ben Greear wrote:
On 08/07/2014 07:05 AM, Christian Lamparter wrote:
Or: for every 16 Bytes of payload there is one fpu context save and
restore... ouch!
Any idea if it would work to put the fpu_begin/end a bit higher
and do all those 16 byte chunks in a batch without messing with
the FPU for each chunk?
It sort of works - see sample feature patch for aesni-intel-glue 
(taken from 3.16-wl). Older kernels (like 3.15, 3.14) need:
"crypto: allow blkcipher walks over AEAD data" [0] (and maybe more).

The FPU save/restore overhead should be gone. Also, if the aesni
instructions can't be used, the implementation will fall back
to the original ccm(aes) code. Calculating the MAC is still much
more expensive than the payload encryption or decryption. However,
I can't see a way of making this more efficient without rewriting
and combining the parts I took from crypto/ccm.c into several
dedicated assembler functions.
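
To put a number on the per-chunk cost discussed above, here is a minimal
counting sketch. It is not kernel code and not part of the patch: it only
models how many FPU save/restore pairs a payload would trigger if the FPU
section wraps each 16-byte block, versus one section around the whole
payload. The function names and the example payload sizes are made up for
illustration.

```python
import math

AES_BLOCK_SIZE = 16  # bytes per AES block


def fpu_sections_per_block(payload_len: int) -> int:
    """FPU save/restore pairs when every 16-byte block gets its own section."""
    return math.ceil(payload_len / AES_BLOCK_SIZE)


def fpu_sections_batched(payload_len: int) -> int:
    """FPU save/restore pairs when one section wraps the whole payload."""
    return 1 if payload_len > 0 else 0


if __name__ == "__main__":
    # Hypothetical payload sizes, chosen only for illustration.
    for payload_len in (64, 256, 1500):
        print(
            payload_len,
            fpu_sections_per_block(payload_len),
            fpu_sections_batched(payload_len),
        )
```

Under this model the per-block count grows linearly with the payload
length, while the batched count stays at one per payload. Real savings
depend on how expensive each save/restore is on the CPU in question.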
Without encryption, I see a download rate of around 400-420 Mbps.

So, your patch looks like a good improvement to me, and I'll be
happy to test further patches if you happen to do those assembler
optimizations you talk about above.
Maybe. That will depend on what the results for "wpa2, *HW*-crypt,
download, udp" are.
Let me know if you would like more/different performance
stats. 
There's a test bench tool (tcrypt) to measure the performance
of any cipher. It would be interesting to know what
performance/throughput it can produce without the overhead
of any application. [Yep, I'm making a small patch to test that,
but not before Saturday next week].
  
Here is perf top of open authentication, download, UDP:

Using WPA2, sw-crypt, download, UDP:

Samples: 52K of event 'cycles', Event count (approx.): 13162827574
 24.78%  btserver              [.] 0x00000000000c598c
Is btserver your "udp download" test application? What does it do, as
it is accounting for nearly 25%?

Regards
Christian