Re: [PATCH RFC net v2 1/3] vsock: Fix transport_{g2h,h2g} TOCTOU
From: Stefano Garzarella <sgarzare@redhat.com>
Date: 2025-07-01 10:34:38
Also in:
lkml, virtualization
On Mon, Jun 30, 2025 at 01:02:26PM +0200, Michal Luczaj wrote:
On 6/30/25 11:05, Stefano Garzarella wrote:quoted
On Sun, Jun 29, 2025 at 11:26:12PM +0200, Michal Luczaj wrote:quoted
On 6/27/25 10:02, Stefano Garzarella wrote:quoted
On Wed, Jun 25, 2025 at 11:23:30PM +0200, Michal Luczaj wrote:quoted
On 6/25/25 10:43, Stefano Garzarella wrote:quoted
On Fri, Jun 20, 2025 at 09:52:43PM +0200, Michal Luczaj wrote:quoted
vsock_find_cid() and vsock_dev_do_ioctl() may race with module unload. transport_{g2h,h2g} may become NULL after the NULL check. Introduce vsock_transport_local_cid() to protect from a potential null-ptr-deref. KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f] RIP: 0010:vsock_find_cid+0x47/0x90 Call Trace: __vsock_bind+0x4b2/0x720 vsock_bind+0x90/0xe0 __sys_bind+0x14d/0x1e0 __x64_sys_bind+0x6e/0xc0 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f] RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0 Call Trace: __x64_sys_ioctl+0x12d/0x190 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <redacted>...quoted
quoted
Oh, and come to think of it, we don't really need that (easily contended?) mutex here. Same can be done with RCU. Which should speed up vsock_bind() -> __vsock_bind() -> vsock_find_cid(), right? This is what I mean, roughly: +static u32 vsock_registered_transport_cid(const struct vsock_transport __rcu **trans_ptr) +{ + const struct vsock_transport *transport; + u32 cid = VMADDR_CID_ANY; + + rcu_read_lock(); + transport = rcu_dereference(*trans_ptr); + if (transport) + cid = transport->get_local_cid(); + rcu_read_unlock(); + + return cid; +} ...@@ -2713,6 +2726,7 @@ void vsock_core_unregister(const structvsock_transport *t) transport_local = NULL; mutex_unlock(&vsock_register_mutex); + synchronize_rcu(); } I've realized I'm throwing multiple unrelated ideas/questions, so let me summarise: 1. Hackish macro can be used to guard against calling vsock_registered_transport_cid() on a non-static variable. 2. We can comment the function to add some context and avoid confusion.I'd go with 2.All right, will do.quoted
quoted
3. Instead of taking mutex in vsock_registered_transport_cid() we can use RCU.Since the vsock_bind() is not in the hot path, maybe a mutex is fine. WDYT?I wrote a benchmark that attempts (and fails due to a non-existing CID) to bind() 100s of vsocks in multiple threads. `perf lock con` shows that this mutex is contended, and things are slowed down by 100+% compared with RCU approach. Which makes sense: every explicit vsock bind() across the whole system would need to acquire the mutex. And now we're also taking the same mutex in vsock_assign_transport(), i.e. during connect(). But maybe such stress testing is just unrealistic, I really don't know.
I still don't think it's a hot path to optimize, but I'm not totally against it. If you want to do it though, I would say do it in a separate patch. Thanks, Stefano