Thread (155 messages) 155 messages, 17 authors, 2017-07-21

Re: [RFC 0/8] mbuf: structure reorganization

From: Olivier Matz <hidden>
Date: 2017-02-16 13:48:10

Hi Konstantin,

Thanks for the feedback.
Comments inline.


On Mon, 6 Feb 2017 18:41:27 +0000, "Ananyev, Konstantin"
[off-list ref] wrote:
Hi Olivier,
Looks good in general, some comments from me below.
Thanks
Konstantin
quoted
The main changes are:
- reorder structure to increase vector performance on some non-ia
  platforms.
- add a 64bits timestamp field in the 1st cache line  
Wonder why it deserves to be in first cache line?
How it differs from seqn below (pure SW stuff right now).
In case the timestamp is set from a NIC value, it is set in the Rx
path. So that's why I think it deserve to be located in the 1st cache
line.

As you said, the seqn is a pure sw stuff right: it is set in a lib, not
in a PMD rx path.
quoted
- m->next, m->nb_segs, and m->refcnt are always initialized for
mbufs in the pool, avoiding the need of setting m->next (located in
the 2nd cache line) in the Rx path for mono-segment packets.
- change port and nb_segs to 16 bits  
Not that I am completely against it,
but changing nb_segs to 16 bits seems like an overkill to me.
I think we can keep and extra 8bits for something more useful in
future.
In my case, I use the m->next field to chain more than 256 segments for
L4 socket buffers. It also updates nb_seg that can overflow. It's not
a big issue since at the end, nb_seg is decremented for each segment.
On the other hand, if I enable some sanity checks on mbufs, it
complains because the number of segments is not equal to nb_seg.

There is also another use case with fragmentation as discussed recently:
http://dpdk.org/dev/patchwork/patch/19819/

Of course, dealing with a long mbuf list is not that efficient,
but the application can maintain another structure to accelerate the
access to the middle/end of the list.

Finally, we have other ideas to get additional 8 bits if required in
the future, so I don't think it's really a problem.

quoted
- move seqn in the 2nd cache line

Things discussed but not done in the patchset:
- move refcnt and nb_segs to the 2nd cache line: many drivers sets
  them in the Rx path, so it could introduce a performance
regression, or  
I wonder can refcnt only be moved into the 2-nd cacheline?
As I understand thanks to other change (from above) m->refcnt 
will already be initialized, so RX code don't need to touch it.
Though yes, it still would require changes in all PMDs.
Yes, I agree, some fields could be moved in the 2nd cache line once all
PMDs stop to write them in RX path. I propose to issue some guidelines
to PMD maintainers at the same time the patchset is pushed. Then we can
consider changing it in a future version, in case we need more room in
the 1st mbuf cache line.
quoted
  it would require to change all the drivers, which is not an easy
task.
- remove the m->port field: too much impact on many examples and
libraries, and some people highlighted they are using it.  
Ok, but can it be moved into the second cache-line?
I think no: it is set by the PMDs in RX path, it would impact
performance.

quoted
- moving m->next in the 1st cache line: there is not enough room,
and having it set to NULL for unused mbuf should remove the need
for it.
- merge seqn and timestamp together in a union: we could imagine
use cases were both are activated. There is no flag indicating the
presence of seqn, so it looks preferable to keep them separated for
now.

I made some basic performance tests (ixgbe) and see no regression,
but the patchset requires more testing.

[1] http://dpdk.org/ml/archives/dev/2016-October/049338.html
  
By the way, additional performance tests on this patchset from PMD
vendors would be helpful.


Olivier
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help