Re: Optimizing instruction-cache, more packets at each stage

From: Eric Dumazet <hidden>
Date: 2016-01-21 17:48:39

On Thu, 2016-01-21 at 08:38 -0800, Tom Herbert wrote:

> Sure, but the receive path is parallelized.

This is true for multiqueue processing, assuming you can dedicate many
cores to process RX.

> Improving parallelism has
> continuously shown to have much more impact than attempting to
> optimize for cache misses. The primary goal is not to drive 100Gbps
> with 64 packets from a single CPU. It is one benchmark of many we
> should look at to measure efficiency of the data path, but I've yet to
> see any real workload that requires that...
>
> Regardless of anything, we need to load packet headers into CPU cache
> to do protocol processing. I'm not sure I see how trying to defer that
> as long as possible helps except in cases where the packet is crossing
> CPU cache boundaries and can eliminate cache misses completely (not
> just move them around from one function to another).

Note that some user-space applications use multiple cores (or
hyperthreads) to implement a pipeline on a single RX queue.

One thread handles one stage (draining the device RX queue) and
prefetches data into the shared L1/L2 (and/or the shared L3 for
pipelines with more than 2 threads).

The second thread processes packets whose headers are already in L1/L2.

This way, the ~100 ns penalty (or even more if you also consider skb
allocations) of bringing packet headers into cache does not hurt PPS.
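
To make the two-stage idea concrete, here is a minimal user-space sketch, not taken from any real application. Every name in it (struct pkt, struct ring, run_pipeline(), RING_SIZE) is hypothetical, the "device RX queue" is just an array in memory, and __builtin_prefetch is a GCC/Clang builtin. Thread pinning is left out: the prefetch only helps the second stage if both threads really share a cache level (hyperthread siblings for L1/L2, the same package for L3), which this sketch does not enforce.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256   /* hypothetical ring depth */
#define HDR_LEN   64    /* assume headers fit in one cache line */

struct pkt { uint8_t hdr[HDR_LEN]; };

/* Single-producer/single-consumer ring passing packet pointers. */
struct ring {
	struct pkt *slot[RING_SIZE];
	_Atomic size_t head;    /* advanced by stage 1 only */
	_Atomic size_t tail;    /* advanced by stage 2 only */
};

struct pipeline {
	struct ring ring;
	struct pkt *rx;         /* stand-in for the device RX queue */
	size_t n;               /* number of packets to process */
	uint64_t sum;           /* toy result produced by stage 2 */
};

/* Stage 1: drain RX, start the header load early, hand the packet over. */
static void *stage1_rx_drain(void *arg)
{
	struct pipeline *p = arg;

	for (size_t i = 0; i < p->n; i++) {
		struct pkt *pkt = &p->rx[i];
		size_t head = atomic_load_explicit(&p->ring.head,
						   memory_order_relaxed);

		__builtin_prefetch(pkt->hdr, 0, 3); /* read, keep in cache */

		/* Ring full: wait for stage 2 (a real pipeline would spin). */
		while (head - atomic_load_explicit(&p->ring.tail,
				memory_order_acquire) == RING_SIZE)
			sched_yield();

		p->ring.slot[head % RING_SIZE] = pkt;
		atomic_store_explicit(&p->ring.head, head + 1,
				      memory_order_release);
	}
	return NULL;
}

/* Stage 2: "protocol processing" on headers that should already be cached. */
static void *stage2_process(void *arg)
{
	struct pipeline *p = arg;
	uint64_t sum = 0;

	for (size_t done = 0; done < p->n; done++) {
		size_t tail = atomic_load_explicit(&p->ring.tail,
						   memory_order_relaxed);

		/* Ring empty: wait for stage 1. */
		while (atomic_load_explicit(&p->ring.head,
				memory_order_acquire) == tail)
			sched_yield();

		sum += p->ring.slot[tail % RING_SIZE]->hdr[0];
		atomic_store_explicit(&p->ring.tail, tail + 1,
				      memory_order_release);
	}
	p->sum = sum;
	return NULL;
}

/* Run both stages over n packets; returns the sum of each hdr[0]. */
uint64_t run_pipeline(struct pkt *rx, size_t n)
{
	struct pipeline p = { .rx = rx, .n = n };
	pthread_t t1, t2;

	assert(n == 0 || rx != NULL);
	if (pthread_create(&t1, NULL, stage1_rx_drain, &p) != 0)
		return 0;
	if (pthread_create(&t2, NULL, stage2_process, &p) != 0) {
		pthread_join(t1, NULL);
		return 0;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return p.sum;
}
```

Build with something like `cc -std=c11 -pthread`. The point is only the split: the thread that touches the RX queue issues the prefetch, and the thread doing the per-packet work consumes from the ring afterwards, so the header miss overlaps with other work instead of stalling the processing stage.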
