Thread (42 messages) 42 messages, 8 authors, 2017-02-20

Re: [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

From: Iremonger, Bernard <hidden>
Date: 2017-02-16 11:39:15

Hi Ferruh,
-----Original Message-----
From: Richardson, Bruce
Sent: Thursday, February 16, 2017 9:14 AM
To: Yigit, Ferruh <redacted>
Cc: Thomas Monjalon <redacted>; Iremonger, Bernard
[off-list ref]; Ananyev, Konstantin
[off-list ref]; Doherty, Declan
[off-list ref]; DPDK [off-list ref]
Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
burst functions

On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
quoted
On 6/16/2016 7:38 PM, thomas.monjalon at 6wind.com (Thomas Monjalon)
wrote:
quoted
quoted
2016-06-16 16:41, Iremonger, Bernard:
quoted
Hi Thomas,
<snip>
quoted
2016-06-16 15:32, Bruce Richardson:
quoted
On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard
wrote:
quoted
quoted
quoted
quoted
quoted
quoted
quoted
Why does this particular PMD need spinlocks when doing RX and
TX, while other device types do not? How is adding/removing
devices from a bonded device different to other control
operations that can be done on physical PMDs? Is this not
similar to say bringing down or hotplugging out a physical port
just before an RX or TX
operation takes place?
quoted
quoted
quoted
For all other PMDs we rely on the app to synchronise control
and data plane operation - why not here?

/Bruce
This issue arose during VM live migration testing.
For VM live migration it is necessary (while traffic is running)
to be able to
remove a bonded slave device, stop it, close it and detach it.
quoted
quoted
It a slave device is removed from a bonded device while traffic
is running
a segmentation fault may occur in the rx/tx burst function. The
spinlock has been added to prevent this occurring.
quoted
quoted
The bonding device already uses a spinlock to synchronise
between the
add and remove functionality and the
slave_link_status_change_monitor code.
quoted
quoted
Previously testpmd did not allow, stop, close or detach of PMD
while traffic was running. Testpmd has been modified with the
following patchset

http://dpdk.org/dev/patchwork/patch/13472/

It now allows stop, close and detach of a PMD provided in it is
not
forwarding and is not a slave of bonded PMD.
quoted
quoted
I will admit to not being fully convinced, but if nobody else has
any serious objections, and since this patch has been reviewed
and acked, I'm ok to merge it in. I'll do so shortly.
Please hold on.
Seeing locks introduced in the Rx/Tx path is an alert.
We clearly need a design document to explain where locks can be
used and what are the responsibility of the control plane.
If everybody agrees in this document that DPDK can have some locks
in the fast path, then OK to merge it.

So I would say NACK for 16.07 and maybe postpone to 16.11.
Looking at the documentation for the bonding PMD.
http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_li
quoted
quoted
quoted
b.html

In section 10.2 it states the following:

Bonded devices support the dynamical addition and removal of slave
devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove
APIs.
quoted
quoted
quoted
If a slave device is added or removed while traffic is running, there is the
possibility of a segmentation fault in the rx/tx burst functions. This is most
likely to occur in the round robin bonding mode.
quoted
quoted
quoted
This patch set fixes what appears to be a bug in the bonding PMD.
It can be fixed by removing this statement in the doc.

One of the design principle of DPDK is to avoid locks.
quoted
Performance measurements have been made with this patch set
applied and without the patches applied using 64 byte packets.
quoted
quoted
quoted
With the patches applied the following drop in performance was
observed:
quoted
quoted
quoted
% drop for fwd+io:	0.16%
% drop for fwd+mac:	0.39%

This patch set has been reviewed and ack'ed, so I think it should
be applied in 16.07
I understand your point of view and I gave mine.
Now we need more opinions from others.
Hi,

These patches are sitting in the patchwork for a long time. Discussion
never concluded and patches kept deferred each release.

I think we should give a decision about them:

1- We can merge them in this release, they are fixing a valid problem,
and patches are already acked.

2- We can reject them, if not having them for more than six months not
caused a problem, perhaps they are not really that required. And if
somebody needs them in the future, we can resurrect them from
patchwork.
quoted
I vote for option 2, any comments?
+1 on option 2. There are obviously not badly needed if nobody is asking
for them for over six months.

	/Bruce
I am ok with option 2, provided they can be retrieved if needed.

Regards,

Bernard.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help