Re: qdisc spin lock

From: Michael Ma <hidden>
Date: 2016-04-21 05:51:08

2016-04-20 15:34 GMT-07:00 Eric Dumazet [off-list ref]:

On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:

2016-04-08 7:19 GMT-07:00 Eric Dumazet [off-list ref]:

On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:

I didn't really know that multiple qdiscs can be isolated using MQ so
that each txq can be associated with a particular qdisc. Also we don't
really have multiple interfaces...

With this MQ solution we'll still need to assign transmit queues to
different classes by doing some math on the bandwidth limit if I
understand correctly, which seems to be less convenient compared with
a solution purely within HTB.

I assume that with this solution I can still share a qdisc among
multiple transmit queues - please let me know if this is not the case.

Note that this MQ + HTB thing works well, unless you use a bonding
device. (Or you need the MQ+HTB on the slaves, with no way of sharing
tokens between the slaves)

Actually MQ+HTB works well for small packets - for example, a flow of
512-byte packets can be throttled by HTB using one txq without being
affected by other flows with small packets. However, I found that with
this solution large packets (10k for example) only achieve very limited
bandwidth. In my test I used MQ to assign one txq to an HTB qdisc that
sets the rate at 1Gbit/s. 512-byte packets can achieve the ceiling rate
using 30 threads, but sending 10k packets using 10 threads achieves
only 10 Mbit/s with the same TC configuration. If I increase burst and
cburst of HTB to some extremely large value (like 50MB), the ceiling
rate can be hit.
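
As a rough back-of-envelope check on the figures above, the sketch below
works out what each packet size means in time on the wire and in
packets per second. It is only arithmetic, not a diagnosis. It assumes
"10k" means 10,000 bytes and "50MB" means 50 * 10^6 bytes, and it
ignores protocol headers; the function names are made up for
illustration.

```python
# Back-of-envelope arithmetic for the rates quoted in the message.
# Assumptions (not from the original mail): "10k" = 10,000 bytes,
# "50MB" = 50 * 10**6 bytes, no header overhead.

RATE_BPS = 1_000_000_000  # HTB rate of 1 Gbit/s, in bits per second


def wire_time_us(packet_bytes, rate_bps):
    """Microseconds needed to send one packet at the given bit rate."""
    return packet_bytes * 8 / rate_bps * 1e6


def packets_per_second(packet_bytes, rate_bps):
    """Packets per second needed to sustain the given bit rate."""
    return rate_bps / (packet_bytes * 8)


def burst_duration_s(burst_bytes, rate_bps):
    """Seconds of traffic at rate_bps that a burst allowance covers."""
    return burst_bytes * 8 / rate_bps


if __name__ == "__main__":
    for size in (512, 10_000):
        print(f"{size} bytes at 1 Gbit/s: "
              f"{wire_time_us(size, RATE_BPS):.3f} us per packet, "
              f"{packets_per_second(size, RATE_BPS):.0f} packets/s")
    # The observed 10 Mbit/s with 10,000-byte packets, as a packet rate.
    print(f"10 Mbit/s in 10,000-byte packets: "
          f"{packets_per_second(10_000, 10_000_000):.0f} packets/s")
    print(f"50 MB burst at 1 Gbit/s: "
          f"{burst_duration_s(50_000_000, RATE_BPS):.1f} s")
```

Expressing the observed throughput as a packet rate can make it easier
to compare against timer or watchdog intervals when reading traces.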

The strange thing is that I don't see this problem when using HTB as
the root. So the number of txqs seems to be a factor here - however it's
really hard to understand why it would only affect larger packets. Is
this a known issue? Any suggestions on how to investigate the issue
further? Profiling shows that the CPU utilization is pretty low.

You could try

perf record -a -g -e skb:kfree_skb sleep 5
perf report

So that you see where the packets are dropped.

Chances are that your UDP sockets' SO_SNDBUF is too big, and packets are
dropped at qdisc enqueue time, instead of having backpressure.
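
To see what send buffer size a socket actually has, a minimal sketch
like the following reads SO_SNDBUF back with getsockopt(). It only
inspects the default on a freshly created socket; the helper name is
hypothetical, and a real test program may set its own value with
setsockopt().

```python
import socket


def send_buffer_bytes(sock_type=socket.SOCK_DGRAM):
    """Return SO_SNDBUF, in bytes, for a new IPv4 socket of sock_type.

    Hypothetical helper for illustration; it reports the system default,
    not the value used by any particular application.
    """
    with socket.socket(socket.AF_INET, sock_type) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    print("UDP SO_SNDBUF:", send_buffer_bytes(socket.SOCK_DGRAM))
    print("TCP SO_SNDBUF:", send_buffer_bytes(socket.SOCK_STREAM))
```

On Linux the value reported by getsockopt() is double what was requested
through setsockopt(), because the kernel reserves space for bookkeeping
overhead.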

Thanks for the hint - how should I read the perf report? Also we're
using TCP sockets in this test - the TCP window size is set to 70kB.

-  35.88%             init  [kernel.kallsyms]  [k] intel_idle
     intel_idle
-  15.83%          strings  libc-2.5.so        [.] __GI___connect_internal
   - __GI___connect_internal
      - 50.00% get_mapping
           __nscd_get_map_ref
        50.00% __nscd_open_socket
-  13.19%          strings  libc-2.5.so        [.] __GI___libc_recvmsg
   - __GI___libc_recvmsg
      + 64.52% getifaddrs
      + 35.48% __check_pf
-  10.55%          strings  libc-2.5.so        [.] __sendto_nocancel
   - __sendto_nocancel
        100.00% 0
