Re: [PATCH] net: tls: fix possible race condition between do_tls_getsockopt_conf() and do_tls_setsockopt_conf()
From: Hangyu Hua <hidden>
Date: 2023-02-27 03:26:30
Also in:
lkml
On 25/2/2023 06:17, Jakub Kicinski wrote:
> On Fri, 24 Feb 2023 22:48:57 +0100 Sabrina Dubroca wrote:
>> 2023-02-24, 13:06:25 -0800, Jakub Kicinski wrote:
>>> On Fri, 24 Feb 2023 21:22:43 +0100 Sabrina Dubroca wrote:
>>>> [...]
>>>> I suggested a change of locking in do_tls_getsockopt_conf this
>>>> morning [1]. The issue reported last seemed valid, but this patch is
>>>> not at all what I had in mind.
>>>>
>>>> [1] https://lore.kernel.org/all/Y/ht6gQL+u6fj3dG@hog/
>>>
>>> Ack, I read the messages out of order, sorry.
>>>
>>>> do_tls_setsockopt_conf fills crypto_info immediately from what
>>>> userspace gives us (and clears it on exit in case of failure), which
>>>> getsockopt could see since it's not locking the socket when it checks
>>>> TLS_CRYPTO_INFO_READY. So getsockopt would progress up to the point it
>>>> finally locks the socket, but if setsockopt failed, we could have
>>>> cleared TLS_CRYPTO_INFO_READY and freed iv/rec_seq.
>>>
>>> Makes sense. We should just take the socket lock around all of
>>> do_tls_getsockopt(), then?
>>
>> That would make things simple and consistent. My idea was just taking
>> the existing lock_sock in do_tls_getsockopt_conf out of the switch and
>> put it just above TLS_CRYPTO_INFO_READY.
I know what you mean. I just think that locking the crypto_info access
can fix this simply. The original situation is:

thread1                                 thread2(do_tls_getsockopt_conf)
lock_sock(sk)
do_tls_setsockopt_conf(crypto_info->cipher_type set)
                                        crypto_info = xxx
                                        cctx = &ctx->tx
                                        if(!TLS_CRYPTO_INFO_READY(crypto_info))
tls_set_device_offload(kmalloc cctx->iv)
tls_set_sw_offload(fail and cctx->iv may not set to NULL)
do_tls_setsockopt_conf(set crypto_info->cipher_type to NULL)
release_sock(sk)
                                        lock_sock(sk)
                                        memcpy(xxx, cctx->iv, xxx)
                                        release_sock(sk)

If we lock crypto_info:

thread1                                 thread2(do_tls_getsockopt_conf)
lock_sock(sk)
do_tls_setsockopt_conf(crypto_info->cipher_type set)
tls_set_device_offload(kmalloc cctx->iv)
tls_set_sw_offload(fail and cctx->iv may not set to NULL)
do_tls_setsockopt_conf(set crypto_info->cipher_type to NULL)
release_sock(sk)
                                        lock_sock(sk)
                                        crypto_info = xxx
                                        cctx = &ctx->tx
                                        release_sock(sk)
                                        if(!TLS_CRYPTO_INFO_READY(crypto_info))
                                        lock_sock(sk)
                                        memcpy(xxx, cctx->iv, xxx)
                                        release_sock(sk)
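To make the first interleaving concrete, here is a small userspace replay of it. This is only a sketch, not kernel code: struct model_ctx and all model_* functions are hypothetical names invented for illustration, the two "threads" are replayed sequentially in the racy order, and the cleanup path is assumed to reset iv to NULL so that the replay itself stays well defined (the real failure path may instead leave a dangling pointer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define IV_LEN 8

/* Hypothetical stand-in for the TLS context state discussed above. */
struct model_ctx {
	bool ready;         /* models TLS_CRYPTO_INFO_READY() */
	unsigned char *iv;  /* models cctx->iv */
};

/* Setter, first half: allocate the IV and mark the config as ready. */
static int model_set_begin(struct model_ctx *c)
{
	c->iv = malloc(IV_LEN);
	if (!c->iv)
		return -1;
	memset(c->iv, 0xab, IV_LEN);
	c->ready = true;
	return 0;
}

/* Setter, failure path: free the IV and clear the ready flag. */
static void model_set_fail(struct model_ctx *c)
{
	free(c->iv);
	c->iv = NULL;
	c->ready = false;
}

/* Getter, step 1: the readiness check (done without the lock). */
static bool model_get_check(const struct model_ctx *c)
{
	return c->ready;
}

/* Getter, step 2: the copy. Returns -1 if the IV is already gone. */
static int model_get_copy(const struct model_ctx *c, unsigned char *out)
{
	assert(out != NULL);
	if (!c->iv)
		return -1;
	memcpy(out, c->iv, IV_LEN);
	return 0;
}

/*
 * Replays the racy order: check, setter failure, copy.
 * Returns true if the check passed although the data was gone
 * by the time the copy ran.
 */
static bool model_replay_race(void)
{
	struct model_ctx c = { 0 };
	unsigned char out[IV_LEN];
	bool checked_ready;

	if (model_set_begin(&c))
		return false;
	checked_ready = model_get_check(&c);  /* thread2, no lock held */
	model_set_fail(&c);                   /* thread1 fails, cleans up */
	return checked_ready && model_get_copy(&c, out) != 0;
}
```

The point of the replay is that the result of the check is stale by the time the copy runs, so the check and the copy need to sit under the same lock.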
>> While we're at it, should we move the
>> ctx->prot_info.version != TLS_1_3_VERSION check in
>> do_tls_setsockopt_no_pad under lock_sock?
>
> Yes, or READ_ONCE(), same for do_tls_getsockopt_tx_zc() and its access
> on ctx->zerocopy_sendfile.
>
>> I don't think that can do anything wrong (we'd have to get past this
>> check just before a failing setsockopt clears crypto_info, and even
>> then we're just reading a bit from the context), it just looks a bit
>> strange. Or just lock the socket around all of
>> do_tls_setsockopt_no_pad, like the other options we have.
>
> The delayed locking feels like a premature optimization, we'll keep
> having such issues with new options. Hence my vote to lock all of
> do_tls_getsockopt().
To reduce ambiguity, I think it may be a good idea to lock only
do_tls_getsockopt_conf(), like we do in do_tls_setsockopt().
It would look like:
 static int do_tls_getsockopt(struct sock *sk, int optname,
                              char __user *optval, int __user *optlen)
 {
         int rc = 0;

         switch (optname) {
         case TLS_TX:
         case TLS_RX:
+                lock_sock(sk);
                 rc = do_tls_getsockopt_conf(sk, optval, optlen,
                                             optname == TLS_TX);
+                release_sock(sk);
                 break;
         case TLS_TX_ZEROCOPY_RO:
                 rc = do_tls_getsockopt_tx_zc(sk, optval, optlen);
                 break;
         case TLS_RX_EXPECT_NO_PAD:
                 rc = do_tls_getsockopt_no_pad(sk, optval, optlen);
                 break;
         default:
                 rc = -ENOPROTOOPT;
                 break;
         }
         return rc;
 }
Of course, I will also remove the locking inside do_tls_getsockopt_conf().
What do you guys think?
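
For reference, the shape of that change can be modelled in userspace as follows. This is only a sketch under stated assumptions: a pthread mutex stands in for lock_sock()/release_sock(), and struct model_sock, MODEL_TX/MODEL_RX and the model_* functions are hypothetical names, not kernel APIs. The helper no longer locks anything itself; the dispatcher takes the lock around the whole call, so the readiness check and the copy happen in one critical section.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

enum { MODEL_TX = 1, MODEL_RX = 2 };  /* hypothetical option names */

struct model_sock {
	pthread_mutex_t lock;  /* stands in for lock_sock()/release_sock() */
	int ready;             /* models TLS_CRYPTO_INFO_READY() */
	unsigned char iv[8];   /* models the data copied to userspace */
};

/* Caller must hold s->lock; the helper does no locking of its own. */
static int model_getsockopt_conf(struct model_sock *s,
				 unsigned char *out, size_t len)
{
	assert(out != NULL);
	if (!s->ready)
		return -EBUSY;
	if (len < sizeof(s->iv))
		return -EINVAL;
	memcpy(out, s->iv, sizeof(s->iv));
	return 0;
}

/* Dispatcher: the check and the copy run under a single lock. */
static int model_getsockopt(struct model_sock *s, int optname,
			    unsigned char *out, size_t len)
{
	int rc;

	switch (optname) {
	case MODEL_TX:
	case MODEL_RX:
		pthread_mutex_lock(&s->lock);
		rc = model_getsockopt_conf(s, out, len);
		pthread_mutex_unlock(&s->lock);
		break;
	default:
		rc = -ENOPROTOOPT;
		break;
	}
	return rc;
}
```

With this layout a concurrent setter that holds the same lock can never clear the ready flag between the check and the copy.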