Re: [PATCH] tun: Fix use-after-free in tun_net_xmit
From: Jason Wang <jasowang@redhat.com>
Date: 2019-05-05 09:09:56
Also in: lkml
On 2019/4/30 12:38 AM, Cong Wang wrote:
On Sun, Apr 28, 2019 at 7:23 PM Jason Wang wrote:
On 2019/4/29 1:59 AM, Cong Wang wrote:
On Sun, Apr 28, 2019 at 12:51 AM Jason Wang wrote:
tun_net_xmit() doesn't have the chance to access the change because it is holding the rcu_read_lock().
The problem is the following code:
    --tun->numqueues;
    ...
    synchronize_net();
We need to make sure the decrement of tun->numqueues is visible to readers after synchronize_net(). And in tun_net_xmit():
It doesn't matter at all. Readers are okay to read it even if they still use the stale tun->numqueues; as long as the tfile is not freed, readers can read whatever they want...
This is only true if we set SOCK_RCU_FREE, isn't it?
Sure, this is how RCU is supposed to work.
The decrement of tun->numqueues is just how we unpublish the old tfile. It is still valid for readers to read it _after_ unpublish; we only need to worry about free, not about unpublish. This is the whole spirit of RCU.
The point is we don't convert tun->numqueues to RCU but use synchronize_net().
Why does tun->numqueues need RCU? It is an integer, and reading a stale value is _perfectly_ fine.
I meant that we don't want, e.g., tun_net_xmit() to see the stale value after synchronize_net() in __tun_detach(), since __tun_detach() then performs various other steps on the assumption that the data path no longer dereferences the tfile. One example is the unregistering of XDP rxq information, which looks racy in the case of XDP_TX.
If you actually meant to say tun->tfiles[] itself, no: it is a fixed-size array, it doesn't shrink or grow, so we don't need RCU for it. This is also why a stale tun->numqueues is fine, as long as it never goes out of bounds.
We do a kind of shrinking and growing through tun->numqueues. That's why we check against it in various places. But, of course, this is buggy.
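To make the "shrinking through tun->numqueues" point concrete, here is a minimal single-threaded userspace sketch. It is not the tun driver code: `struct dev`, `struct queue`, `MAX_QUEUES`, `attach()`, `detach_count_only()` and `lookup_by_count()` are all hypothetical names, and no RCU, locking or memory ordering is modelled. It only shows that when detach does nothing but move the last entry and decrement the count, a reader working from a stale count can still reach the detached entry.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 4 /* hypothetical capacity; stands in for the fixed size of tfiles[] */

struct queue { int id; };              /* stands in for a tfile */

struct dev {
    struct queue *slots[MAX_QUEUES];   /* fixed-size array, like tun->tfiles[] */
    unsigned int nqueues;              /* logical length, like tun->numqueues */
};

/* Append a queue at the end of the logical range. */
static void attach(struct dev *d, struct queue *q)
{
    assert(d->nqueues < MAX_QUEUES);
    d->slots[d->nqueues++] = q;
}

/*
 * Detach by moving the last entry into the hole and decrementing the
 * count. The old last slot is NOT cleared, so it keeps a pointer that
 * is now outside the logical range.
 */
static void detach_count_only(struct dev *d, unsigned int index)
{
    assert(index < d->nqueues);
    d->slots[index] = d->slots[d->nqueues - 1];
    d->nqueues--;
}

/*
 * Reader that bounds-checks against a count it sampled earlier
 * (seen_count must not exceed MAX_QUEUES). If the sample is stale,
 * the leftover pointer is still returned.
 */
static struct queue *lookup_by_count(const struct dev *d, unsigned int idx,
                                     unsigned int seen_count)
{
    return idx < seen_count ? d->slots[idx] : NULL;
}
```

With two queues attached, sampling the count, detaching index 1, and then calling lookup_by_count() with the old sample still yields the detached queue, whereas the fresh count yields NULL.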
You need to rethink my SOCK_RCU_FREE patch.
The code was written before SOCK_RCU_FREE was introduced and assumes no dereference from the device after synchronize_net(). It doesn't harm to figure out the root cause, which may give us more confidence in the fix (e.g. SOCK_RCU_FREE).
I believe SOCK_RCU_FREE is the fix for the root cause, not just a cover-up.
I don't object to fixing this with SOCK_RCU_FREE, but then we should remove the redundant synchronize_net(). But I still prefer to synchronize everything explicitly, like (completely untested):
I agree that synchronize_net() can be removed. However, I don't understand your untested patch at all; it looks like it fixes a completely different problem rather than this use-after-free.
As has been mentioned, the problem with the current code is that we still leave pointers to the freed tfile in the tfiles[] array in __tun_detach(), and the check against tun->numqueues seems racy. So the patch just NULLs out the detached tfile pointers and makes sure they cannot be dereferenced after synchronize_net(), by having readers dereference the tfile pointer instead of checking tun->numqueues.
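The idea in the paragraph above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the untested patch itself: `struct dev`, `struct queue`, `MAX_QUEUES`, `attach()`, `detach_and_clear()` and `lookup_by_pointer()` are hypothetical names, and the model is single-threaded. In the kernel the store and load would presumably go through rcu_assign_pointer() and rcu_dereference(), with the free deferred past a grace period; none of that is modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 4 /* hypothetical capacity; stands in for the fixed size of tfiles[] */

struct queue { int id; };              /* stands in for a tfile */

struct dev {
    struct queue *slots[MAX_QUEUES];   /* fixed-size array, like tun->tfiles[] */
    unsigned int nqueues;              /* logical length, like tun->numqueues */
};

/* Append a queue at the end of the logical range. */
static void attach(struct dev *d, struct queue *q)
{
    assert(d->nqueues < MAX_QUEUES);
    d->slots[d->nqueues++] = q;
}

/*
 * Detach by moving the last entry into the hole, then clearing the old
 * last slot so that no slot keeps a pointer to the detached queue.
 */
static void detach_and_clear(struct dev *d, unsigned int index)
{
    unsigned int last;

    assert(index < d->nqueues);
    last = d->nqueues - 1;
    d->slots[index] = d->slots[last];
    d->slots[last] = NULL;
    d->nqueues--;
}

/*
 * Reader that relies on the slot contents rather than on the count:
 * the only bound is the array capacity, and a NULL slot means the
 * queue at that index is gone.
 */
static struct queue *lookup_by_pointer(const struct dev *d, unsigned int idx)
{
    return idx < MAX_QUEUES ? d->slots[idx] : NULL;
}
```

Because the reader no longer consults the count, a stale count cannot lead it to a detached entry; it sees either a live pointer or NULL.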
Thanks.