Re: [PATCH net 2/2] net/tls: Fix use-after-free after the TLS device goes down and up
From: Maxim Mikityanskiy <hidden>
Date: 2021-05-25 08:52:41
On 2021-05-24 19:15, Jakub Kicinski wrote:
On Mon, 24 May 2021 15:12:20 +0300 Maxim Mikityanskiy wrote:quoted
@@ -1290,6 +1304,26 @@ static int tls_device_down(struct net_device *netdev) spin_unlock_irqrestore(&tls_device_lock, flags); list_for_each_entry_safe(ctx, tmp, &list, list) { + /* Stop offloaded TX and switch to the fallback. + * tls_is_sk_tx_device_offloaded will return false. + */ + WRITE_ONCE(ctx->sk->sk_validate_xmit_skb, tls_validate_xmit_skb_sw); + + /* Stop the RX and TX resync. + * tls_dev_resync must not be called after tls_dev_del. + */ + WRITE_ONCE(ctx->netdev, NULL); + + /* Start skipping the RX resync logic completely. */ + set_bit(TLS_RX_DEV_DEGRADED, &ctx->flags); + + /* Sync with inflight packets. After this point: + * TX: no non-encrypted packets will be passed to the driver. + * RX: resync requests from the driver will be ignored. + */ + synchronize_net(); + + /* Release the offload context on the driver side. */ if (ctx->tx_conf == TLS_HW) netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_TX);Can we have the Rx resync take the device_offload_lock for read instead? Like Tx already does?
I believe you previously made this attempt in commit 38030d7cb779
("net/tls: avoid NULL-deref on resync during device removal"), and this
approach turned out to be problematic, as explained in commit
e52972c11d6b ("net/tls: replace the sleeping lock around RX resync with
a bit lock"): "RX resync may get called from soft IRQ, so we can't use
the rwsem".
quoted
+EXPORT_SYMBOL_GPL(tls_validate_xmit_skb_sw);Why the export?
Because tls_validate_xmit_skb was also exported. Now I see it's needed for tls_validate_xmit_skb, because tls_is_sk_tx_device_offloaded needs its address and can be called from the drivers. There is no similar public function for tls_validate_xmit_skb_sw, so you are probably right that we don't need to export it.