Thread (212 messages), 9 authors, 2018-04-25

Re: [PATCH v2 2/6] ethdev: add port ownership

From: Ananyev, Konstantin <hidden>
Date: 2018-01-17 17:01:14

-----Original Message-----
From: Neil Horman [mailto:nhorman@tuxdriver.com]
Sent: Wednesday, January 17, 2018 2:00 PM
To: Matan Azrad <redacted>
Cc: Ananyev, Konstantin <redacted>; Thomas Monjalon <redacted>; Gaetan Rivet
[off-list ref]; Wu, Jingjing [off-list ref]; dev@dpdk.org; Richardson, Bruce [off-list ref]
Subject: Re: [PATCH v2 2/6] ethdev: add port ownership

On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
Hi Konstantin
From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
Hi Matan,
Hi Konstantin

From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
Hi Matan,
Hi Konstantin
From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
Hi Matan,
Hi Konstantin
From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
Hi Matan,
Hi Konstantin
From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
Hi Matan,
Hi Konstantin
From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
Hi Matan,
Hi Konstantin
From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
Hi Matan,
 <snip>
It is good to see that now scanning/updating rte_eth_dev_data[] is lock
protected, but it might be not very plausible to protect both data[] and
next_owner_id using the same lock.
I guess you mean the owner structure in rte_eth_dev_data[port_id].
The next_owner_id is read by the ownership APIs (for owner validation), so it
makes sense to use the same lock.
Actually, why not?
Well, to me next_owner_id and rte_eth_dev_data[] are not directly related.
You may create a new owner_id, but that doesn't mean you would update
rte_eth_dev_data[] immediately.
And vice versa - you might just want to update rte_eth_dev_data[].name or
.owner_id.
It is not very good coding practice to use the same lock for non-related
data structures.
I see the relation as follows:
Since synchronizing the ownership mechanism is ethdev's responsibility, we
must protect against user mistakes as much as we can by using the same lock.
So, if the user tries to set an invalid owner (exactly the ID which is
currently being allocated), we can protect against it.

Hmm, not sure why you can't do the same checking with a
different lock or an atomic variable?
The set ownership API is protected by the ownership lock and
checks the owner ID validity by reading the next owner ID.
So, the owner ID allocation and the set API should use the
same atomic mechanism.

Sure, but all you are doing to check validity is checking
that owner_id > 0 && owner_id < next_owner_id, right?
As you don't allow owner_id overlap (16/3248 bits) you can
safely do the same check with just atomic_get(&next_owner_id).
It will not protect it, scenario:
- current next_id is X.
- call set ownership of port A with owner id X by thread 0 (by user mistake).
- context switch
- allocate new id by thread 1 and get X and change next_id to X+1 atomically.
- context switch
- Thread 0 validates X by atomic_read and succeeds in taking ownership.
- The system loses the port (or it will be managed by two entities) - crash.

Ok, and how will using a lock protect you in such a scenario?
The owner set API validation by thread 0 should fail because the owner
validation is included in the protected section.

Then your validation function would fail even if you use atomic
ops instead of a lock.
No.
With atomics this specific scenario will cause the validation to pass.
Can you explain to me how?

static int
rte_eth_is_valid_owner_id(uint16_t owner_id)
{
	/* Snapshot the counter once; clamp it to the uint16_t ID range. */
	int32_t cur_owner_id = RTE_MIN(rte_atomic32_read(&next_owner_id),
				       UINT16_MAX);

	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
		return 0;
	}
	return 1;
}

Let's say your next_owner_id == X, and you invoke
rte_eth_is_valid_owner_id(owner_id=X+1) - it would fail.
Explanation:
The scenario with locks:
next_owner_id = X.
Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
Context switch.
Thread 1 calls owner_new and blocks on the lock.
Context switch.
Thread 0 does the owner id validation and fails (Y>=X), unlocks the lock and returns failure to the user.
Context switch.
Thread 1 takes the lock and updates X to X+1, then unlocks the lock.
Everything is OK!

The same scenario with atomics:
next_owner_id = X.
Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
Context switch.
Thread 1 calls owner_new and changes X to X+1 (atomically).
Context switch.
Thread 0 does the owner id validation and succeeds (Y<(atomic)X+1), unlocks the lock and returns success to the user.
Problem!
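
For illustration only, the lock-based variant of the scenario above could be
sketched roughly as follows. This is not the patch under review: every
sketch_* name is hypothetical, and a small C11 atomic_flag spinlock stands in
for whatever lock ethdev would actually use. The point it tries to show is
that allocation and validation share one critical section, so next_owner_id
cannot move while a set request is being validated.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_OWNER  0

/* Hypothetical stand-in for the ownership lock (simple spinlock). */
static atomic_flag sketch_owner_lock = ATOMIC_FLAG_INIT;
static uint16_t sketch_next_owner_id = 1;             /* next ID to hand out */
static uint16_t sketch_port_owner[SKETCH_MAX_PORTS];  /* 0 means no owner */

static void sketch_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_owner_lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void sketch_unlock(void)
{
	atomic_flag_clear_explicit(&sketch_owner_lock, memory_order_release);
}

/* Caller must hold the lock, so next_owner_id is stable during the check. */
static int sketch_is_valid_owner_id(uint16_t owner_id)
{
	return owner_id != SKETCH_NO_OWNER && owner_id < sketch_next_owner_id;
}

/* Allocate a fresh owner ID under the same lock that guards validation. */
int sketch_owner_new(uint16_t *owner_id)
{
	sketch_lock();
	*owner_id = sketch_next_owner_id++;
	sketch_unlock();
	return 0;
}

/* Validate the ID and take the port inside one critical section. */
int sketch_owner_set(uint16_t port_id, uint16_t owner_id)
{
	int ret = -1;

	assert(port_id < SKETCH_MAX_PORTS);
	sketch_lock();
	if (sketch_is_valid_owner_id(owner_id) &&
	    sketch_port_owner[port_id] == SKETCH_NO_OWNER) {
		sketch_port_owner[port_id] = owner_id;
		ret = 0;
	}
	sketch_unlock();
	return ret;
}
```

Note that this only covers the interleaving described above. If another
thread completes sketch_owner_new() before the setter takes the lock, an ID
that was invalid when the caller picked it can still pass validation.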

Matan is correct here, there is no way to perform parallel set operations using
just an atomic variable here, because multiple reads of next_owner_id need to
be performed while it is stable.  That is to say rte_eth_next_owner_id must be
compared to RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.  If
you were to only use an atomic_read on such a variable, it could be incremented
by the owner_new function between the checks and an invalid owner value could
become valid because a third thread incremented the next value.  The state of
next_owner_id must be kept stable during any validity checks.
It could still be incremented between the checks - if, let's say, a different
thread invokes owner_new, grabs the lock, updates the counter and releases the
lock - all that before the check.
But ok, there is probably no point in arguing about that one any longer -
let's keep the lock here, nothing will be broken with it for sure.
That said, I really have to wonder why ownership ids are needed here at
all.  It seems this design could be much simpler with the addition of a per-port
lock (and optional ownership record).  The API could consist of four
operations:

ownership_set
ownership_tryset
ownership_release
ownership_get
Ok, but how to distinguish who is the current owner of the port?
To make sure that only owner is allowed to perform control ops?
Konstantin
The first call simply tries to take the per-port lock (blocking if it's already
locked)

The second call is a non-blocking version of the first

The third unlocks the port, allowing others to take ownership

The fourth returns whatever ownership record you want to encode with the lock.

The addition of all this id checking seems a bit overcomplicated.

Neil
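
As a rough sketch of the four-call, per-port lock API proposed above (an
assumption-laden illustration, not an existing ethdev interface): the
sketch_* names, the 64-bit ownership record and the atomic_flag used as the
per-port lock are all hypothetical choices made here to keep the example
self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_RECORD 0

struct sketch_port {
	atomic_flag lock;              /* set for as long as the port is owned */
	_Atomic uint64_t owner_record; /* optional record chosen by the owner */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Put every port into the "unowned" state. */
void sketch_ports_init(void)
{
	for (int i = 0; i < SKETCH_MAX_PORTS; i++) {
		atomic_flag_clear(&sketch_ports[i].lock);
		atomic_store(&sketch_ports[i].owner_record, SKETCH_NO_RECORD);
	}
}

/* Non-blocking: returns 0 on success, -1 if the port is already owned. */
int sketch_ownership_tryset(uint16_t port_id, uint64_t record)
{
	assert(port_id < SKETCH_MAX_PORTS);
	if (atomic_flag_test_and_set(&sketch_ports[port_id].lock))
		return -1;
	atomic_store(&sketch_ports[port_id].owner_record, record);
	return 0;
}

/* Blocking: spins until the current owner releases the port. */
void sketch_ownership_set(uint16_t port_id, uint64_t record)
{
	while (sketch_ownership_tryset(port_id, record) != 0)
		; /* spin */
}

/* Drop ownership so that others can take the port. */
void sketch_ownership_release(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	atomic_store(&sketch_ports[port_id].owner_record, SKETCH_NO_RECORD);
	atomic_flag_clear(&sketch_ports[port_id].lock);
}

/* Return whatever record the current owner stored (0 if unowned). */
uint64_t sketch_ownership_get(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_load(&sketch_ports[port_id].owner_record);
}
```

In this sketch nothing stops a non-owner from calling
sketch_ownership_release(), and control ops are not checked against the
record, which is the gap the question about distinguishing the current owner
points at.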