Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver
From: Ard Biesheuvel <ardb@kernel.org>
Date: 2021-01-18 12:12:14
Also in:
linux-devicetree
On Mon, 18 Jan 2021 at 12:55, Reshetova, Elena [off-list ref] wrote:
> On Thu, 14 Jan 2021 at 11:25, Reshetova, Elena [off-list ref] wrote:
>>>> On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
>>>>>> 2. The OCS ECC HW does not support the NIST P-192 curve. We were
>>>>>> planning to add SW fallback for P-192 in the driver, but the
>>>>>> Intel Crypto team (which, internally, has to approve any code
>>>>>> involving cryptography) advised against it, because they
>>>>>> consider P-192 weak. As a result, the driver is not passing
>>>>>> crypto self-tests. Is there any possible solution to this? Is
>>>>>> it reasonable to change the self-tests to only test the curves
>>>>>> actually supported by the tested driver? (not fully sure how to
>>>>>> do that).
>>>>>
>>>>> An additional reason against the P-192 SW fallback is the fact
>>>>> that it can potentially trigger unsafe behavior which is not even
>>>>> "visible" to the end user of the ECC functionality. If I request
>>>>> (by my developer mistake) a P-192 weaker curve from ECC Keem Bay
>>>>> HW driver, it is much safer to return a "not supported" error
>>>>> than to proceed behind my back with a SW code implementation
>>>>> making me believe that I am actually getting a HW-backed up
>>>>> functionality (since I don't think there is a way for me to check
>>>>> that I am using SW fallback).
>>>>
>>>> Sorry, but if you break the Crypto API requirement then your
>>>> driver isn't getting merged.
>>>
>>> But should not we think what behavior would make sense for good
>>> crypto drivers in future? As cryptography moves forward (especially
>>> for the post quantum era), we will have lengths for all existing
>>> algorithms increased (in addition to having a bunch of new ones),
>>> and we surely should not expect the new generation of HW drivers to
>>> implement the old/weaker lengths, so why is there the requirement
>>> to support them? It is not a part of crypto API definition on what
>>> bit lengths should be supported, because it cannot be part of API
>>> to begin with since it is always changing parameter (algorithms and
>>> attacks develop all the time).
>>>
>>> I would really appreciate, if someone helps us to understand here.
>>> Maybe there is a correct way to address this, but we just don't see
>>> it. The question is not even about this particular crypto driver
>>> and the fact whether it gets merged or not, but the logic of the
>>> crypto API subsystem.
>>>
>>> As far as I understand the implementations that are provided by the
>>> specialized drivers (like our Keem Bay OCS ECC driver example here)
>>> have a higher priority vs. generic implementations that exist in
>>> kernel, which makes sense because we expect these drivers (and the
>>> security HW they talk to) to provide both more efficient and more
>>> secure implementations than a pure SW implementation in kernel can
>>> do (even if it utilizes special instructions, like SIMD, AESNI,
>>> etc.). However, naturally these drivers are bound by what security
>>> HW can do, and if it does not support a certain size/param of the
>>> algorithm (P-192 curve in our case), it is pointless and wrong for
>>> them to reimplement what SW is already doing in kernel, so they
>>> should not do so and currently they re-direct to core kernel
>>> implementation. So far good.
>>>
>>> But now comes my biggest worry is that this redirection as far as I
>>> can see is *internal to driver itself*, i.e. it does a callback to
>>> these core functions from the driver code, which again, unless I
>>> misunderstand something, leads to the fact that the end user gets
>>> P-192 curve ECC implementation from the core kernel that has been
>>> "promoted" to a highest priority (given that ECC KeemBay driver for
>>> example got priority 300 to begin with). So, if we say we have
>>> another HW Driver 'Foo', which happens to implement P-192 curves
>>> more securely, but happens to have a lower priority than ECC
>>> KeemBay driver, its implementation would never be chosen, but core
>>> kernel implementation will be used (via SW fallback internal to ECC
>>> Keem Bay driver).
>>
>> No, this is incorrect. If you allocate a fallback algorithm in the
>> correct way, the crypto API will resolve the allocation in the usual
>> manner, and select whichever of the remaining implementations has
>> the highest priority (provided that it does not require a fallback
>> itself).
>
> Thank you very much Ard for the important correction here! See below
> if I got it now correctly to the end for the use case in question.
>
>>> Another problem is that for a user of crypto API I don't see a way
>>> (and perhaps I am wrong here) to guarantee that all my calls to
>>> perform crypto operations will end up being performed on a security
>>> HW I want (maybe because this is the only thing I trust). It seems
>>> to be possible in theory, but in practice would require careful
>>> evaluation of a kernel setup and a sync between what end user
>>> requests and what driver can provide. Let me try to explain a
>>> potential scenario.
>>>
>>> Let's say we had an end user that used to ask for both P-192 and
>>> P-384 curve-based ECC operations and let's say we had a driver and
>>> security HW that implemented it. The end user made sure that this
>>> driver implementation is always preferred vs. other existing
>>> implementations. Now, time moves, a new security HW comes instead
>>> that only supports P-384, and the driver now has been updated to
>>> support P-192 via the SW fallback (like we are asked now).
>>>
>>> Now, how does an end user notice that when it asks for a P-192
>>> based operations, his operations are not done in security HW
>>> anymore? The only way seems to be is to know that driver and
>>> security HW has been updated, algorithms and sizes changed, etc.
>>> It might take a while before the end user realizes this and for
>>> example stops using P-192 altogether, but what if this silent
>>> redirect by the driver actually breaks some security assumptions
>>> (side-channel resistance being one potential example) made by this
>>> end user? The consequences can be very bad. You might say: "this is
>>> the end user problem to verify this", but shouldn't we do something
>>> to prevent or at least indicate such potential issues to them?
>>
>> I don't think it is possible at the API level to define rules that
>> will always produce the most secure combination of drivers. The
>> priority fields are only used to convey relative performance (which
>> is already semantically murky, given the lack of distinction between
>> hardware with a single queue vs software algorithms that can be
>> executed by all CPUs in parallel).
>>
>> When it comes to comparative security, trustworthiness or robustness
>> of implementations, it is simply left up to the user to blacklist
>> modules that they prefer not to use. When fallback allocations are
>> made in the correct way, the remaining available implementations
>> will be used in priority order.
>
> So, let me see if I understand the full picture correctly now and how
> to utilize the blacklisting of modules as a user. Suppose I want to
> blacklist everything but my OCS driver module. So, if I as a user
> refer to a specific driver implementation using a unique driver name
> (ecdh-keembay-ocs in our case), then regardless of the fact that a
> driver implements this SW fallback for P-192 curve, if I as a user
> ask for P-192 curve (or any other param that results in SW fallback),
> I will be notified that this requested implementation does not
> provide it?

This is rather unusual compared with how the crypto API is typically
used, but if this is really what you want to implement, you can do so
by:

- having an "ecdh" implementation that implements the entire range,
  and uses a fallback for curves that it does not implement
- exporting the same implementation again as "ecdh" and with a known
  driver name "ecdh-keembay-ocs", but with a slightly lower priority,
  and in this case, returning an error when the unimplemented curve is
  requested.

That way, you fully adhere to the API, by providing implementations of
all curves by default. And if a user requests "ecdh-keembay-ocs"
explicitly, it will not be able to use the P192 curve inadvertently.

But policing which curves are secure and which are not is really not
the job of the API. We have implementations of MD5 and RC4 in the
kernel that we would *love* to remove but we simply cannot do so as
long as they are still being used. The same applies to P192: we simply
cannot fail requests for that curve for use cases that were previously
deemed valid. It is perfectly reasonable to omit the implementation
from your hardware, but banning its use outright on the grounds that
it is no longer secure conflicts with our requirement not to break
existing use cases.
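
To make the two-registration scheme above concrete, here is a toy
userspace model in C. It is only an illustrative sketch and not kernel
code: it does not call the crypto API, and struct toy_alg, toy_lookup(),
toy_fallback() and toy_resolve() are hypothetical names invented for
this example. The driver name "ecdh-keembay-ocs-fb", the priority 299,
the generic entry at priority 100 and the per-entry curve sets are
likewise assumptions. Only "ecdh", "ecdh-keembay-ocs", the priority 300
and the missing P-192 support come from the thread. One deliberate
simplification: the fallback search skips entries that cannot handle
the requested curve, which the thread does not specify.

```c
#include <assert.h>
#include <string.h>

/* Toy userspace model of the scheme; none of this is kernel API. */
enum { P192 = 1u << 0, P256 = 1u << 1, P384 = 1u << 2 };

struct toy_alg {
	const char *name;        /* generic name users normally ask for */
	const char *driver_name; /* unique implementation name */
	int priority;            /* higher wins when looking up by name */
	int needs_fallback;      /* delegates curves it cannot handle */
	unsigned int curves;     /* curves handled natively (assumed sets) */
};

static const struct toy_alg algs[] = {
	/* full-range registration; driver name is hypothetical */
	{ "ecdh", "ecdh-keembay-ocs-fb", 300, 1, P256 | P384 },
	/* strict registration: slightly lower priority, no fallback */
	{ "ecdh", "ecdh-keembay-ocs",    299, 0, P256 | P384 },
	/* stand-in for a generic software implementation (assumed) */
	{ "ecdh", "ecdh-generic",        100, 0, P192 | P256 | P384 },
};
#define N_ALGS (sizeof(algs) / sizeof(algs[0]))

/* An exact driver-name match wins; otherwise highest priority by name. */
static const struct toy_alg *toy_lookup(const char *requested)
{
	const struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < N_ALGS; i++) {
		if (strcmp(algs[i].driver_name, requested) == 0)
			return &algs[i];
		if (strcmp(algs[i].name, requested) == 0 &&
		    (!best || algs[i].priority > best->priority))
			best = &algs[i];
	}
	return best;
}

/* Highest-priority entry that needs no fallback and takes the curve. */
static const struct toy_alg *toy_fallback(const char *name,
					  unsigned int curve)
{
	const struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < N_ALGS; i++) {
		if (strcmp(algs[i].name, name) != 0 || algs[i].needs_fallback)
			continue;
		if (!(algs[i].curves & curve))
			continue; /* simplification made by this sketch */
		if (!best || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}

/* Returns the driver that would do the work, or NULL for an error. */
static const char *toy_resolve(const char *requested, unsigned int curve)
{
	const struct toy_alg *alg, *fb;

	assert(requested != NULL);
	alg = toy_lookup(requested);
	if (!alg)
		return NULL;
	if (alg->curves & curve)
		return alg->driver_name;
	if (!alg->needs_fallback)
		return NULL; /* strict variant: unsupported curve is an error */
	fb = toy_fallback(alg->name, curve);
	return fb ? fb->driver_name : NULL;
}
```

In this model, toy_resolve("ecdh", P192) ends up at "ecdh-generic" via
the fallback, toy_resolve("ecdh", P384) stays on the full-range
hardware entry, and toy_resolve("ecdh-keembay-ocs", P192) returns NULL,
which corresponds to the error a user would see when asking for the
explicitly named driver with an unimplemented curve.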