Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4
From: Elad Nachman <hidden>
Date: 2021-02-26 17:43:54
The way the kernel handles its locks and lists for the dev close many path, there is no way you can go around this with rtnl unlocked : " There is a race condition in __dev_close_many() processing the close_list while the application terminates. It looks like if two vEth devices are terminating, and one releases the rtnl lock, the other takes it, updating the close_list in an unstable state, causing the close_list to become a circular linked list, hence list_for_each_entry() will endlessly loop inside __dev_close_many() . " And I don't expect David Miller will bend the kernel networking for DPDK or KNI. But - Stephen - if you can personally convince David to accept a kernel patch which will separate the close_list locking mechanism to a separate (RCU?) lock, then I can introduce first a patch to the kernel which will add a lock for the close_list, this way rtnl can be unlocked for the if down case. After that kernel patch, your original patch + relocation of the sync mutex locking will do the job . Otherwise, rtnl has to be kept locked all of the way for the if down case in order to prevent corruption causing a circular linked list out of the close_list, causing a hang in the kernel. Currently, the rtnl lock is the only thing keeping the close_list from corruption. If you doubt rtnl cannot be unlocked for dev close path, you can consult David for his opinion, as I think it is critical to understand what the kernel can or cannot do, or expects to be done before we can unlock its locks as we wish inside rte_kni.ko . Otherwise, if we are still in disagreement on how to patch this set of problems, I think the responsible way around it is to completely remove kni from the main dpdk tree and move it to dpdk-kmods repository. I know BSD style open-source does not carry legal responsibility from the developers, but I think when a bunch of developers know a piece of code is highly buggy, they should not leave it for countless new users to bounce their head desperately against, if they cannot agree on a correct way to solve the bunch of problems, of which I think we all agree exist (we just do not agree on the proper solution or patch)... That's my two cents, Elad. On Fri, Feb 26, 2021 at 5:49 PM Stephen Hemminger [off-list ref] wrote:
On Fri, 26 Feb 2021 00:01:01 +0300 Igor Ryzhov [off-list ref] wrote:quoted
Hi Elad, Thanks for the patch, but this is still NACK from me. The only real advantage of KNI over other exceptional-path techniques like virtio-user is the ability to configure DPDK-managed interfaces directly from the kernel using well-known utils like iproute2. A very important part of this is getting responses from the DPDK app and knowing the actual result of command execution. If you're making async requests to the application and you don't know the result, then what's the point of using KNI at all? IgorDo you have a better proposal that keeps the request result but does not call userspace with lock held. PS: I also have strong dislike of KNI, as designed it would have been rejected by Linux kernel developers. A better solution would be userspace version of something like devlink devices. But doing control operations by proxy is a locking nightmare.