Thread (488 messages) 488 messages, 14 authors, 2018-10-16

Re: [PATCH v8 04/19] ethdev: introduce device lock

From: Zhang, Qi Z <hidden>
Date: 2018-07-05 09:54:32

-----Original Message-----
From: Thomas Monjalon [mailto:thomas@monjalon.net]
Sent: Thursday, July 5, 2018 3:23 PM
To: Zhang, Qi Z <redacted>
Cc: dev@dpdk.org; Burakov, Anatoly <redacted>; Ananyev,
Konstantin [off-list ref]; Richardson, Bruce
[off-list ref]; Yigit, Ferruh [off-list ref]; Shelton,
Benjamin H [off-list ref]; Vangati, Narender
[off-list ref]; arybchenko@solarflare.com
Subject: Re: [dpdk-dev] [PATCH v8 04/19] ethdev: introduce device lock

05/07/2018 05:37, Zhang, Qi Z:
quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]
quoted
05/07/2018 03:38, Zhang, Qi Z:
quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]
quoted
04/07/2018 12:49, Zhang, Qi Z:
quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]
quoted
04/07/2018 03:47, Zhang, Qi Z:
quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]
quoted
03/07/2018 17:08, Zhang, Qi Z:
quoted
From: Thomas Monjalon [mailto:thomas@monjalon.net]
quoted
02/07/2018 07:44, Qi Zhang:
quoted
Introduce API rte_eth_dev_lock and
rte_eth_dev_unlock to let application lock or
unlock on specific ethdev, a locked device can't
be detached, this help applicaiton to prevent
unexpected device detaching, especially in
multi-process
envrionment.
quoted
quoted
quoted
quoted
quoted
quoted
Trying to understand: a process of an application
could try to detach a port while another process is
against this
decision.
quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
Why an application needs to be protected against itself?
I think we can regard this as a help function, it help
application to simplified
the situation when one process want to detach a device
while another one is still using it.
quoted
Application can register a callback which can do to
necessary clean up (like
stop traffic, release memory ...) before device be detached.

Yes I agree such hook can be a good idea.
[...]
quoted
quoted
quoted
quoted
After all, it is just a pre-detach hook.
quoted
Wait, how is it different of RTE_ETH_EVENT_DESTROY callback?
Perhaps we just need to improve the handling of the
DESTROY
event?
quoted
quoted
quoted
quoted
quoted
I have thought about this before.
Not like RTE_ETH_EVENT_DESTROY and other event hook, the
hook here
need to give feedback, pass or fail will impact the
following behavior, this make it special, so I separate it
from all exist rte_eth_event_type handle mechanism.

Look at _rte_eth_dev_callback_process, there is a "ret_param".
OK, that should work.
quoted
quoted
The alternative solution is we just introduce a new event
type like RTE_ETH_EVENT_PRE_DETACH and reuse all exist API
rte_eth_dev_callback_register/rte_eth_dev_callback_unregister.

I don't think we need a new event.
Let's try to use RTE_ETH_EVENT_DESTROY.
The problem is RTE_ETH_EVENT_DESTROY is used in
rte_eth_dev_release_port already.
quoted
And in PMD, rte_eth_dev_release_port is called after
dev_uninit, that mean its too late to reject a detach
You're right.

It's a real mess currently.
The right order should be to remove ethdev ports before removing
the underlying EAL device. But it's strangely not the case.

We need to separate things.
The function rte_eth_dev_close can be used to remove an ethdev
port if we add a call to rte_eth_dev_release_port.
So we could call rte_eth_dev_close in PMD remove functions.
Is "close" a good time to ask confirmation to the application?
Or should we ask confirmation a step before, on "stop"?
I think the confirmation should before any cleanup stage, it
should at the
beginning of driver->remove.

So you stop a port, even if the app policy is against detaching it?
My understanding is, stop and detach is different, we may stop a device and
reconfigure it then restart it.
quoted
but for detach, properly we will not use it, unless it be probed again.
For dev_close , it should be called after dev_stop.
so we have to like below.

If (dev->started) {
	dev_stop /* but still problem here, if traffic is ongoing */
	if (dev_close()) {
		dev_start()
		return -EBUSY.
	}
} else {
	If (dev_close())
		Return _EBUSY
}

So for me, neither rte_eth_dev_stop and rte_eth_dev_close is the right place
to check this.
quoted
But rte_eth_dev_destroy looks like a good one. We can put all the
ethdev general logic into it, and PMD specific dev_unit will be called
at last
If you want to detach a port, you need to stop it.
If one process try to detach a port, but another process decides (via callback)
that the port should not be detached, you will have stopped a port for no good
reason.
To me it is a real design issue.
Yes, so I think we still need two iterates for detach.
First iterate to get agreement on all processes.
Secondary iterate to do the detach.

But how?
quoted
quoted
quoted
Also we should not put it into rte_eth_dev_stop, because,
rte_eth_dev_stop
can invoked by application directly, in that case, we don't what any
callback be invoked.

It it the same to detach a port: it is invoked directly by application.
I thought you wanted a callback as helper for inter-process management?
quoted
quoted
quoted
So , do you mean we can remove
_rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROPY) in
rte_eth_dev_release_port
I would say we need RTE_ETH_EVENT_DESTROY to notify that the
port is really destroyed.
Maybe the right thing to do is to add a new event
RTE_ETH_EVENT_CLOSE_REQUEST or something else.
Note that we already have 2 removal events in ethdev:
	- RTE_ETH_EVENT_INTR_RMV when the port cannot be used
anymore
quoted
quoted
quoted
quoted
	- RTE_ETH_EVENT_DESTROY when the port is going to be deleted
quoted
And where is right place to call
_rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROY)?
quoted
If can't be called in rte_eth_dev_detach, because if device is
removed by
rte_eal_hotplug_remove, it will be skipped.

No, rte_eth_dev_detach and rte_eal_hotplug_remove are 2
different
things.
quoted
quoted
One is a mix of ethdev and EAL (and should be deprecated), the
other one is for the underlying device at EAL level.
quoted
probably we need to call this at the beginning of each PMD
driver->remove?,
that means, we need to change all PMD drivers?

Yes, we can call rte_eth_dev_stop and rte_eth_dev_close at the
beginning of PMD remove.
Note that there is already a helper rte_eth_dev_destroy called
in some PMD to achieve the removal, but curiously, it doesn't
call stop and
close functions.
quoted
Currently PMD implement driver->remove with different way,
rte_eth_dev_stop / rte_eth_dev_close / rte_eth_dev_destroy is not
always be invoked.
quoted
So Before we standardize what ethdev API and what sequence should
be called in driver->remove (I think this is a separate task) I
will suggest 1. Create another help function like
_rte_eth_dev_allow_to_remove, 2. the help function will call
_rte_eth_dev_callback_process(RTE_ETH_EVENT_PRE_REMOVE) and
update
quoted
quoted
ret_param which contain a reject count.
quoted
3. the help function should to invoked at beginning at
driver->remove and
driver->remove will abort if the help function failed.
quoted
But once we standardized that , we can do cleanup to merge it into
another
rte_eth_xxx API in next step.
quoted
What do you think?
No
All the problems we have today are because we preferred add more and
more functions instead of fixing the basic stuff. And it is
especially the case for all the detach crap.
So no.
Let's fix stuff first.

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help