Re: [PATCH v8 04/19] ethdev: introduce device lock
From: Zhang, Qi Z <hidden>
Date: 2018-07-05 09:54:32
-----Original Message----- From: Thomas Monjalon [mailto:thomas@monjalon.net] Sent: Thursday, July 5, 2018 3:23 PM To: Zhang, Qi Z <redacted> Cc: dev@dpdk.org; Burakov, Anatoly <redacted>; Ananyev, Konstantin [off-list ref]; Richardson, Bruce [off-list ref]; Yigit, Ferruh [off-list ref]; Shelton, Benjamin H [off-list ref]; Vangati, Narender [off-list ref]; arybchenko@solarflare.com Subject: Re: [dpdk-dev] [PATCH v8 04/19] ethdev: introduce device lock 05/07/2018 05:37, Zhang, Qi Z:quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]quoted
05/07/2018 03:38, Zhang, Qi Z:quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]quoted
04/07/2018 12:49, Zhang, Qi Z:quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]quoted
04/07/2018 03:47, Zhang, Qi Z:quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]quoted
03/07/2018 17:08, Zhang, Qi Z:quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]quoted
02/07/2018 07:44, Qi Zhang:quoted
Introduce API rte_eth_dev_lock and rte_eth_dev_unlock to let application lock or unlock on specific ethdev, a locked device can't be detached, this help applicaiton to prevent unexpected device detaching, especially in multi-processenvrionment.quoted
quoted
quoted
quoted
quoted
quoted
Trying to understand: a process of an application could try to detach a port while another process is against thisdecision.quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
Why an application needs to be protected against itself?I think we can regard this as a help function, it help application to simplifiedthe situation when one process want to detach a device while another one is still using it.quoted
Application can register a callback which can do to necessary clean up (likestop traffic, release memory ...) before device be detached. Yes I agree such hook can be a good idea.[...]quoted
quoted
quoted
quoted
After all, it is just a pre-detach hook.quoted
Wait, how is it different of RTE_ETH_EVENT_DESTROY callback? Perhaps we just need to improve the handling of the DESTROYevent?quoted
quoted
quoted
quoted
quoted
I have thought about this before. Not like RTE_ETH_EVENT_DESTROY and other event hook, the hook hereneed to give feedback, pass or fail will impact the following behavior, this make it special, so I separate it from all exist rte_eth_event_type handle mechanism. Look at _rte_eth_dev_callback_process, there is a "ret_param".OK, that should work.quoted
quoted
The alternative solution is we just introduce a new event type like RTE_ETH_EVENT_PRE_DETACH and reuse all exist APIrte_eth_dev_callback_register/rte_eth_dev_callback_unregister. I don't think we need a new event. Let's try to use RTE_ETH_EVENT_DESTROY.The problem is RTE_ETH_EVENT_DESTROY is used inrte_eth_dev_release_port already.quoted
And in PMD, rte_eth_dev_release_port is called after dev_uninit, that mean its too late to reject a detachYou're right. It's a real mess currently. The right order should be to remove ethdev ports before removing the underlying EAL device. But it's strangely not the case. We need to separate things. The function rte_eth_dev_close can be used to remove an ethdev port if we add a call to rte_eth_dev_release_port. So we could call rte_eth_dev_close in PMD remove functions. Is "close" a good time to ask confirmation to the application? Or should we ask confirmation a step before, on "stop"?I think the confirmation should before any cleanup stage, it should at thebeginning of driver->remove. So you stop a port, even if the app policy is against detaching it?My understanding is, stop and detach is different, we may stop a device andreconfigure it then restart it.quoted
but for detach, properly we will not use it, unless it be probed again. For dev_close , it should be called after dev_stop. so we have to like below. If (dev->started) { dev_stop /* but still problem here, if traffic is ongoing */ if (dev_close()) { dev_start() return -EBUSY. } } else { If (dev_close()) Return _EBUSY } So for me, neither rte_eth_dev_stop and rte_eth_dev_close is the right placeto check this.quoted
But rte_eth_dev_destroy looks like a good one. We can put all the ethdev general logic into it, and PMD specific dev_unit will be called at lastIf you want to detach a port, you need to stop it. If one process try to detach a port, but another process decides (via callback) that the port should not be detached, you will have stopped a port for no good reason. To me it is a real design issue.
Yes, so I think we still need two iterates for detach. First iterate to get agreement on all processes. Secondary iterate to do the detach. But how?
quoted
quoted
quoted
Also we should not put it into rte_eth_dev_stop, because, rte_eth_dev_stopcan invoked by application directly, in that case, we don't what any callback be invoked. It it the same to detach a port: it is invoked directly by application. I thought you wanted a callback as helper for inter-process management?quoted
quoted
quoted
So , do you mean we can remove _rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROPY) in rte_eth_dev_release_portI would say we need RTE_ETH_EVENT_DESTROY to notify that the port is really destroyed. Maybe the right thing to do is to add a new event RTE_ETH_EVENT_CLOSE_REQUEST or something else. Note that we already have 2 removal events in ethdev: - RTE_ETH_EVENT_INTR_RMV when the port cannot be usedanymorequoted
quoted
quoted
quoted
- RTE_ETH_EVENT_DESTROY when the port is going to be deletedquoted
And where is right place to call_rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROY)?quoted
If can't be called in rte_eth_dev_detach, because if device is removed byrte_eal_hotplug_remove, it will be skipped. No, rte_eth_dev_detach and rte_eal_hotplug_remove are 2 differentthings.quoted
quoted
One is a mix of ethdev and EAL (and should be deprecated), the other one is for the underlying device at EAL level.quoted
probably we need to call this at the beginning of each PMD driver->remove?,that means, we need to change all PMD drivers? Yes, we can call rte_eth_dev_stop and rte_eth_dev_close at the beginning of PMD remove. Note that there is already a helper rte_eth_dev_destroy called in some PMD to achieve the removal, but curiously, it doesn't call stop andclose functions.quoted
Currently PMD implement driver->remove with different way,rte_eth_dev_stop / rte_eth_dev_close / rte_eth_dev_destroy is not always be invoked.quoted
So Before we standardize what ethdev API and what sequence should be called in driver->remove (I think this is a separate task) I will suggest 1. Create another help function like _rte_eth_dev_allow_to_remove, 2. the help function will call_rte_eth_dev_callback_process(RTE_ETH_EVENT_PRE_REMOVE) andupdatequoted
quoted
ret_param which contain a reject count.quoted
3. the help function should to invoked at beginning at driver->remove anddriver->remove will abort if the help function failed.quoted
But once we standardized that , we can do cleanup to merge it into anotherrte_eth_xxx API in next step.quoted
What do you think?No All the problems we have today are because we preferred add more and more functions instead of fixing the basic stuff. And it is especially the case for all the detach crap. So no. Let's fix stuff first.