Re: [regression] TCP_MD5SIG on established sockets
From: Eric Dumazet <edumazet@google.com>
Date: 2020-07-01 02:18:01
Also in:
lkml
On Tue, Jun 30, 2020 at 7:02 PM Herbert Xu [off-list ref] wrote:
Eric Dumazet [off-list ref] wrote:quoted
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 810cc164f795f8e1e8ca747ed5df51bb20fec8a2..ecc0e3fabce8b03bef823cbfc5c1b0a9e24df124100644--- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c@@ -4034,9 +4034,12 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key) { struct scatterlist sg; + u8 keylen = key->keylen; - sg_init_one(&sg, key->key, key->keylen); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, key->keylen); + smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */ + + sg_init_one(&sg, key->key, keylen); + ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen); return crypto_ahash_update(hp->md5_req); } EXPORT_SYMBOL(tcp_md5_hash_key);diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index ad6435ba6d72ffd8caf783bb25cad7ec151d6909..99916fcc15ca0be12c2c133ff40516f79e6fdf7f100644--- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c@@ -1113,6 +1113,9 @@ int tcp_md5_do_add(struct sock *sk, const uniontcp_md5_addr *addr, if (key) { /* Pre-existing entry - just update that one. */ memcpy(key->key, newkey, newkeylen); + + smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */ + key->keylen = newkeylen; return 0; }This doesn't make sense. Your smp_rmb only guarantees that you see a version of key->key that's newer than keylen. What if the key got changed twice? You could still read more than what's in the key (if that's what you're trying to protect against).
The worst that can happen is a dropped packet. Which is anyway going to happen anyway eventually if an application changes a TCP MD5 key on a live flow. The main issue of the prior code was the double read of key->keylen in tcp_md5_hash_key(), not that few bytes could change under us. I used smp_rmb() to ease backports, since old kernels had no READ_ONCE()/WRITE_ONCE(), but ACCESS_ONCE() instead.