Re: [RFC net-next 01/15] psp: add documentation
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2024-06-05 20:11:33
Jakub Kicinski wrote:
On Fri, 31 May 2024 09:56:42 -0400 Willem de Bruijn wrote:
If one peer can enter the state where it drops all plaintext, while the other decides to close the connection before completing the upgrade, and thus sends a plaintext FIN. If (big if) that can happen, then the connection cannot be cleanly closed.

Hm. And we can avoid this by only enforcing encryption of data-less segments once we've seen some encrypted data?

That would help. It may also be needed to accept a pure ACK right at the upgrade seqno. Depends on the upgrade process. Which may be worth documenting explicitly: the system call and network packet exchange from when one peer initiates (by generating its local key) until the connection is fully encrypted. That also allows poking at the various edge cases that may happen if packets are lost, or when actions can race.

Dunno if the format below is good, but you're very right. At least to me writing the diagram was an hour well spent :)
Great :)
One unexpected example of the latter that I came across was Tx SADB key insertion in tail edge cases taking longer than network RTT, for instance. The kernel API can be exercised in a variety of ways, not all of them will uphold the correctness. Documenting how it should be used should help. Even better when it reduces the option space. As it already does by failing a Tx key install until Rx is configured.

Something along these lines?

"Sequence" diagram of the worst case scenario:

01 p   Host A                         Host B
02 l t ~~~~~~~~~~~[TCP 3 WHS]~~~~~~~~~~
03 a e ~~~~~~[crypto negotiation]~~~~~~
04 i x                            [Rx key alloc = K-B]
05 n t                       <--- [app] K-B key send
06     ------[Rx key alloc = K-A]-
07     [app] K-A key send -->|
08     [TCP] K-B input <-----
08 P   [TCP] K-B ACK ---->|
09 S R [app] recv(K-B)       |
10 P x [app] [Tx key set]    |
11     --------------------------
12 P T [app] send(RPC) #####>|
13 S x                       |<---- [TCP] Seq OoO! queue RPC, SACK
14 P   [TCP] retr K-A --->|
15                        |    `-> [TCP] K-A input
16                        |   <--- [TCP] K-A ACK (or FIN)
17                        |        [app] recv(K-A)
18                        |        [app] [Tx key set]
19     -----------------------------------
20

There is a causal dependency between Host B allocating the key (line 4), sending it (line 5) and Host A receiving it (line 8). Since Host B will accept PSP packets as soon as it allocated the key, Host A does not need to wait to start using the key (line 12). Host B will queue the RPC to the socket (line 13).

[Problem #1] However, because Host B does not have a Tx key, the ACK / SACK packet (line 13) will not be encrypted. (Similarly if Host B decided to close the connection at this point, the resulting FIN packet would not be encrypted.)
Or if it plays SO_LINGER games the resulting RST.
Host A needs to accept unencrypted non-data segments (pure ACKs, pure FIN) until it sees an encrypted packet from Host B.

[Problem #2] The retransmissions of K-A are unencrypted, to avoid sending the same data in encrypted and unencrypted form. This poses a risk if an ACK gets lost but both hosts end up in the PSP Tx state. Assume that Host A did not send the RPC (line 12), and the retransmission (line 14) happens as an RTO or TLP. Host B may already reach PSP Tx state (line "20") and expect encrypted data. Plain text retransmissions (with sequence number before rcv_nxt) must be accepted until Host B sees encrypted data from Host A.
Is that sufficient if an initial encrypted packet could get reordered by the network to arrive before a plaintext retransmit of a lower seqno?

Both scenarios make sense. It is unfortunately harder to be sure that we have captured all edge cases.

An issue related to the rcv_nxt cut-point, not sure how important: the plaintext packet contents are protected by user crypto before upgrade. But the TCP headers are not. PSP relies on TCP PAWS for replay protection. It is possible for a MITM to offset all seqno from the start of connection establishment. I don't see an immediate issue. But at a minimum it could be possible to insert or delete before PSP is upgraded.
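To make the reordering question concrete, here is a toy Python model of the rule from Problem #2 (accept plaintext retransmissions below rcv_nxt until the first encrypted segment is seen). It is only a sketch: the `Receiver` class, its fields and the `run` helper are hypothetical names, sequence number wraparound is ignored, and none of this reflects actual kernel code.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    """Toy receiver that already has its Tx key installed (hypothetical)."""
    rcv_nxt: int
    seen_encrypted: bool = False

    def rx(self, seq: int, length: int, encrypted: bool) -> str:
        if encrypted:
            # Any valid encrypted segment ends the plaintext grace period.
            self.seen_encrypted = True
            if seq == self.rcv_nxt:
                self.rcv_nxt += length
            return "accept"
        # Plaintext is tolerated only for old data (a retransmission),
        # and only until the first encrypted segment has been seen.
        is_rtx = seq + length <= self.rcv_nxt
        if is_rtx and not self.seen_encrypted:
            return "accept"
        return "drop"


def run(segments, rcv_nxt=1000):
    """Feed (seq, length, encrypted) tuples to a fresh receiver."""
    receiver = Receiver(rcv_nxt)
    return [receiver.rx(*segment) for segment in segments]


rtx = (900, 100, False)  # plaintext retransmit of already received data
rpc = (1000, 50, True)   # first encrypted segment from the peer

in_order = run([rtx, rpc])   # retransmit arrives first
reordered = run([rpc, rtx])  # network delivers the encrypted segment first
```

In this model the in-order case accepts both segments, while the reordered case drops the plaintext retransmit. Whether that drop is harmless (the data is already below rcv_nxt) or leaves the sender retrying is exactly the open question.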
With that I think the state machine needs to be amended:

Event          | Normal TCP | Rx PSP     | Tx PSP      | PSP full    |
----------------------------------------------------------------------
Rx plain (new) | accept     | accept     | drop        | drop        |
Rx plain       | accept     | accept     | accept      | drop        |
(ACK|FIN|rtx)  |            |            |             |             |
Rx PSP (good)  | drop       | accept     | accept      | accept      |
Rx PSP (bad    | drop       | drop       | drop        | drop        |
 crypt, !=SPI) |            |            |             |             |
Tx             | plain text | plain text | encrypted   | encrypted   |
               |            |            | (excl. rtx) | (excl. rtx) |
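The table above can be written down as a small lookup, which makes it easier to poke at individual cells. This is a sketch in Python, not kernel code: the `State` and `Rx` enums and the `rx_verdict` / `tx_form` helpers are hypothetical names, and "(excl. rtx)" is read here as "retransmissions of data first sent in plain text stay plain text", which is an assumption based on Problem #2.

```python
from enum import Enum


class State(Enum):
    # Values double as column indexes into RX_TABLE.
    NORMAL_TCP = 0
    RX_PSP = 1
    TX_PSP = 2
    PSP_FULL = 3


class Rx(Enum):
    PLAIN_NEW = "Rx plain (new)"
    PLAIN_CTRL = "Rx plain (ACK|FIN|rtx)"
    PSP_GOOD = "Rx PSP (good)"
    PSP_BAD = "Rx PSP (bad crypt, !=SPI)"


# One row per Rx event, one column per State, copied from the table.
RX_TABLE = {
    Rx.PLAIN_NEW:  ("accept", "accept", "drop",   "drop"),
    Rx.PLAIN_CTRL: ("accept", "accept", "accept", "drop"),
    Rx.PSP_GOOD:   ("drop",   "accept", "accept", "accept"),
    Rx.PSP_BAD:    ("drop",   "drop",   "drop",   "drop"),
}


def rx_verdict(state: State, event: Rx) -> str:
    """Look up the receive verdict for an event in a given state."""
    return RX_TABLE[event][state.value]


def tx_form(state: State, is_rtx_of_plaintext: bool = False) -> str:
    """Return how a segment would be sent in a given state.

    Assumption: data first sent in plain text is retransmitted in
    plain text even after the Tx key is installed.
    """
    if state in (State.TX_PSP, State.PSP_FULL) and not is_rtx_of_plaintext:
        return "encrypted"
    return "plain text"
```

For instance, `rx_verdict(State.TX_PSP, Rx.PLAIN_CTRL)` gives "accept" while the same event in `State.PSP_FULL` gives "drop", which is the one cell where the Tx PSP and PSP full columns differ on receive.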
Another example where a peer stays open and stays retrying if it has upgraded and drops all plaintext.

May want to always allow plaintext RSTs.

This is a potential DoS vector.

Because of key exhaustion? Or we can be tricked into spamming someone with retransmissions and ignoring their RST?
Simpler: this falls back onto unencrypted TCP where someone capable of spoofing valid data is capable of terminating a connection. If denying all plaintext after upgrade, PSP protects against this. It is arguably low on the list of concerns, especially in a closed world hyperscaler setting. As it is hardly the only DoS vector.
In all these cases, I suppose this has already been figured out for TLS.

Assuming the answer above is "key exhaustion" - I wouldn't be surprised if it wasn't :(