Re: [RFC net-next 01/15] psp: add documentation
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2024-06-06 02:40:35
Jakub Kicinski wrote:
On Wed, 05 Jun 2024 16:11:31 -0400 Willem de Bruijn wrote:
The retransmissions of K-A are unencrypted, to avoid sending the same data in encrypted and unencrypted form. This poses a risk if an ACK gets lost but both hosts end up in the PSP Tx state. Assume that Host A did not send the RPC (line 12), and the retransmission (line 14) happens as an RTO or TLP. Host B may already reach PSP Tx state (line "20") and expect encrypted data. Plain text retransmissions (with sequence number before rcv_nxt) must be accepted until Host B sees encrypted data from Host A.

Is that sufficient if an initial encrypted packet could get reordered by the network to arrive before a plaintext retransmit of a lower seqno?

Yes, I believe that's fine. I will document this more clearly, but both sides must be pretty precise in their understanding of when the switchover happens. They must read what they expect to be clear text, and then install the Tx key, thus locking down the socket.

1. If they under-read and clear text data is already queued - the kernel will error out.
2. If they under-read and clear text arrives later - the connection will hang.
3. If they over-read they will presumably get PSP-protected data which they have no way of validating, since it won't be secured by user crypto.

We could protect from over-read (case 3) by refusing to give out PSP-protected data until keys are installed. But it adds to the fast path and I don't think it's all that beneficial, since there's no way to protect a sloppy application from under-read (case 2).

Back to your question about reordering plain text with cipher text: the application should not lock down the socket until it gets all its clear text. So clear text retransmissions _after_ lock down must be spurious.
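As a rough illustration of the three under-read / over-read cases listed above, here is a toy Python model. The function name, its parameters and the outcome labels are invented for this sketch and are not part of any kernel or PSP API; it only restates the outcomes as described in the text, counting bytes of clear text.

```python
from enum import Enum


class Outcome(Enum):
    OK = "read exactly the clear text, then locked down"
    ERROR = "case 1: clear text still queued, kernel errors out"
    HANG = "case 2: clear text arrives after lock down, connection hangs"
    UNVALIDATED = "case 3: app read PSP-protected data it cannot validate"


def classify_switchover(clear_total, app_read, clear_queued_at_lockdown):
    """Classify what happens when the app installs the Tx key.

    clear_total: bytes of clear text the peer sends before switching (assumed known)
    app_read: bytes the application read before installing the Tx key
    clear_queued_at_lockdown: unread clear-text bytes already sitting in the
        receive queue at the moment the Tx key is installed
    """
    if app_read > clear_total:
        # Over-read: the extra bytes were PSP-protected, not user-crypto protected.
        return Outcome.UNVALIDATED
    if app_read < clear_total:
        # Under-read: the result depends on whether the rest has arrived yet.
        if clear_queued_at_lockdown > 0:
            return Outcome.ERROR
        return Outcome.HANG
    return Outcome.OK
```

For example, an application that expects 100 bytes of clear text but installs the Tx key after reading only 60, with nothing further queued yet, ends up in the hang case.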
Ah yes, good point.
The only worry is that we lose an ACK and never tell the other side that we got all the clear text. But we're guaranteed to successfully ACK any PSP-protected data, so if we receive some there is no way to get stuck. Let me copy / paste the diagram:

01 p    Host A                            Host B
02 l t      ~~~~~~~~~~~[TCP 3 WHS]~~~~~~~~~~
03 a e      ~~~~~~[crypto negotiation]~~~~~~
04 i x                                  [Rx key alloc = K-B]
05 n t                             <--- [app] K-B key send
06      ------[Rx key alloc = K-A]-
07      [app] K-A key send     -->|
08      [TCP] K-B input     <-----
08 P    [TCP] K-B ACK        ---->|
09 S R  [app] recv(K-B)           |
10 P x  [app] [Tx key set]        |
11      --------------------------
12 P T  [app] send(RPC)     #####>|
13 S x                            |<---- [TCP] Seq OoO! queue RPC, SACK
14 P    [TCP] retr K-A        --->|
15                                |  `-> [TCP] K-A input
16                                | <--- [TCP] K-A ACK (or FIN)
17                                |      [app] recv(K-A)
18                                |      [app] [Tx key set]
19      -----------------------------------
20

Looking as Host A, if we receive encrypted data, we must have allocated and sent key (line 7) so we will start accepting encrypted data. But at this point we are also accepting plain text (until we reach line 9). We will send a plain text (S)ACK to encrypted data, but that's fine too since Host B hasn't seen any encrypted data from us and will accept such ACKs.
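One possible reading of the receive-side rules in this exchange can be written down as a small Python model. This is only a sketch under stated assumptions, not the kernel implementation: `RxState`, `accepts` and all field names are hypothetical, sequence numbers are plain integers with no wraparound handling, and `rcv_nxt` is never advanced.

```python
from dataclasses import dataclass


@dataclass
class RxState:
    rx_key_allocated: bool = False  # we allocated an Rx key and sent it to the peer
    tx_key_set: bool = False        # the app installed the Tx key ("lock down")
    seen_encrypted: bool = False    # we already received PSP data from the peer
    rcv_nxt: int = 0                # next expected sequence number


def accepts(state, encrypted, seq_end):
    """Return True if a segment ending at seq_end would be accepted."""
    if encrypted:
        # Encrypted data only makes sense once the peer could know our Rx key.
        if not state.rx_key_allocated:
            return False
        state.seen_encrypted = True
        return True
    if not state.tx_key_set:
        # Before lock down the socket still accepts plain text.
        return True
    # After lock down: only plain-text retransmissions of already received
    # data, and only until the first encrypted data from the peer shows up.
    return (not state.seen_encrypted) and seq_end <= state.rcv_nxt
```

In this model a locked-down Host B with `rcv_nxt=100` still accepts a plain-text retransmit ending at 100, rejects new plain text ending at 150, and rejects any further plain text once it has accepted encrypted data.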
Both scenarios make sense. It is unfortunately harder to be sure that we have captured all edge cases.

Are you trying to say packetdrill without saying packetdrill? :)
Ha, no, no such hint implied. I did extend packetdrill to PSP to exercise the cases that I could come up with, at a minimum to ensure coverage of all branches.

But does that cover all possible edge cases? Including drops, reorders, geometry changes from MTU changes, SO_LINGER 0, races with slow OS operations (like that slow SADB insertion I mentioned)? The unknown unknowns.

Stuck connections are low risk: bugs that can be fixed later, as long as it is easy to reason that actual crypto issues like plaintext leaks are not reachable.

Extending packetdrill to netlink would be quite some work, I suspect. A quick scan shows that it knows NLA, but only for OPT_STATS decoding.
An issue related to the rcv_nxt cut-point, not sure how important: the plaintext packet contents are protected by user crypto before upgrade. But the TCP headers are not. PSP relies on TCP PAWS for replay protection. It is possible for a MITM to offset all seqno from the start of connection establishment. I don't see an immediate issue. But at a minimum it could be possible to insert or delete before PSP is upgraded.

Yes, the "cut off" point must be quite clearly defined, because both sides must precisely read out all the clear text. Then they install the Tx key and anything they read must have been PSP-protected. Hope I understood the point.
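From the application's side, "precisely read out all the clear text, then install the Tx key" could look roughly like the Python sketch below. It assumes the clear-text length is known from the handshake protocol. `recv_exact`, `switch_to_psp` and the `install_tx_key` callback are hypothetical names for illustration; the actual key installation interface is not shown here.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before all clear text arrived")
        buf += chunk
    return bytes(buf)


def switch_to_psp(sock, clear_len, install_tx_key):
    """Consume the agreed amount of clear text, then lock down the socket.

    install_tx_key is a caller-supplied placeholder for whatever mechanism
    installs the Tx key; it is invoked only after every clear-text byte
    has been read, so nothing is under-read or over-read.
    """
    clear = recv_exact(sock, clear_len)
    install_tx_key(sock)
    return clear
```

Never asking `recv()` for more than the remaining clear-text bytes is what keeps the application from over-reading into PSP-protected data.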
I think the issue, if any, is that there may be a gap between the two methods of integrity protection.

What we call "cleartext" here is integrity protected such that no insertion or deletion attacks are possible. And PSP ensures the same. But is a deletion of the last plaintext or first ciphertext possible?

An insertion is not an issue, as it will be protected by neither while PSP is expected, so it is dropped.

As long as the application (or is it presentation?) layer has a clear definition of the point in the stream at which it must insert the Tx key, plaintext deletion is not possible, as the key is not inserted until all plaintext has been received.

Which leaves: is it possible for a MITM to offset the seqno, such that the first PSP encrypted packet can be removed from the stream and this goes undetected?