Thread: 94 messages, 17 authors, 2008-11-04

Re: [tbench regression fixes]: digging out smelly deadmen.

From: Mike Galbraith <hidden>
Date: 2008-10-11 18:14:28
Also in: lkml

On Sat, 2008-10-11 at 16:39 +0200, Peter Zijlstra wrote:
> That said, we can probably still avoid the division for the top level
> stuff, because the sum of the top level weights is still invariant
> between all tasks.

Less math would be nice of course...

> I'll have a stab at doing so... I initially didn't do this because my
> first try gave some real ugly code, but we'll see - these numbers are a
> very convincing reason to try again.

...but the numbers I get on Q6600 don't pin the tail on the math donkey.

Update to UP test log.

2.6.27-final-up
ring-test   - 1.193 us/cycle  = 838 KHz  (gcc-4.3)
tbench      - 337.377 MB/sec           tso/gso on
tbench      - 340.362 MB/sec           tso/gso off
netperf     - 120751.30 rr/s           tso/gso on
netperf     - 121293.48 rr/s           tso/gso off

2.6.27-final-up
patches/revert_weight_and_asym_stuff.diff
ring-test   - 1.133 us/cycle  = 882 KHz  (gcc-4.3)
tbench      - 340.481 MB/sec           tso/gso on
tbench      - 343.472 MB/sec           tso/gso off
netperf     - 119486.14 rr/s           tso/gso on
netperf     - 121035.56 rr/s           tso/gso off

2.6.28-up
ring-test   - 1.149 us/cycle  = 870 KHz  (gcc-4.3)
tbench      - 343.681 MB/sec           tso/gso off
netperf     - 122812.54 rr/s           tso/gso off
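
For readers checking the ring-test lines: the KHz figure looks like the reciprocal of the per-cycle time, with the fraction dropped rather than rounded. That reading is an assumption inferred from the three data points above, and the helper name below is made up for illustration. A minimal sketch:

```python
import math

def cycle_us_to_khz(us_per_cycle: float) -> int:
    """Convert microseconds per cycle to whole kHz.

    One cycle per microsecond is 1 MHz, i.e. 1000 kHz.
    Truncation (not rounding) is an assumption chosen to match the log.
    """
    return math.floor(1000.0 / us_per_cycle)

# Per-cycle times copied from the UP log above.
for us in (1.193, 1.133, 1.149):
    print(f"{us:.3f} us/cycle = {cycle_us_to_khz(us)} KHz")
```

With rounding instead of truncation, 1.133 us/cycle would come out as 883 KHz rather than the 882 KHz in the log.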

My SMP log, updated to account for TSO/GSO monkey-wrench.

(<bleep> truckload of time <bleep> wasted chasing unbisectable
<bleepity-bleep> tso gizmo. <bleep!>)

SMP config, same as UP kernels tested, except SMP.

tbench -t 60 4 localhost followed by four 60 sec netperf
TCP_RR pairs, each pair on its own core of my Q6600.

2.6.22.19

Throughput 1250.73 MB/sec 4 procs                  1.00

16384  87380  1        1       60.01    111272.55  1.00
16384  87380  1        1       60.00    104689.58
16384  87380  1        1       60.00    110733.05
16384  87380  1        1       60.00    110748.88

2.6.22.19-cfs-v24.1

Throughput 1213.21 MB/sec 4 procs                  .970

16384  87380  1        1       60.01    108569.27  .992
16384  87380  1        1       60.01    108541.04
16384  87380  1        1       60.00    108579.63
16384  87380  1        1       60.01    108519.09

2.6.23.17

Throughput 1200.46 MB/sec 4 procs                  .959

16384  87380  1        1       60.01    95987.66   .866
16384  87380  1        1       60.01    92819.98
16384  87380  1        1       60.01    95454.00
16384  87380  1        1       60.01    94834.84

2.6.23.17-cfs-v24.1

Throughput 1238.68 MB/sec 4 procs                  .990

16384  87380  1        1       60.01    105871.52  .969
16384  87380  1        1       60.01    105813.11
16384  87380  1        1       60.01    106106.31
16384  87380  1        1       60.01    106310.20

2.6.24.7

Throughput 1204 MB/sec 4 procs                     .962

16384  87380  1        1       60.00    99599.27   .910
16384  87380  1        1       60.00    99439.95
16384  87380  1        1       60.00    99556.38
16384  87380  1        1       60.00    99500.45

2.6.25.17

Throughput 1223.16 MB/sec 4 procs                  .977

16384  87380  1        1       60.00    101768.95  .930
16384  87380  1        1       60.00    101888.46
16384  87380  1        1       60.01    101608.21
16384  87380  1        1       60.01    101833.05

2.6.26.5

Throughput 1183.47 MB/sec 4 procs                  .945

16384  87380  1        1       60.00    100837.12  .922
16384  87380  1        1       60.00    101230.12
16384  87380  1        1       60.00    100868.45
16384  87380  1        1       60.00    100491.41

numbers above here are gcc-4.1, below gcc-4.3

2.6.26.6

Throughput 1177.18 MB/sec 4 procs

16384  87380  1        1       60.00    100896.10
16384  87380  1        1       60.00    100028.16
16384  87380  1        1       60.00    101729.44
16384  87380  1        1       60.01    100341.26

TSO/GSO off

2.6.27-final

Throughput 1177.39 MB/sec 4 procs

16384  87380  1        1       60.00    98830.65
16384  87380  1        1       60.00    98722.47
16384  87380  1        1       60.00    98565.17
16384  87380  1        1       60.00    98633.03

2.6.27-final
patches/revert_weight_and_asym_stuff.diff

Throughput 1167.67 MB/sec 4 procs

16384  87380  1        1       60.00    97003.05
16384  87380  1        1       60.00    96758.42
16384  87380  1        1       60.00    96432.01
16384  87380  1        1       60.00    97060.98

2.6.28.git

Throughput 1173.14 MB/sec 4 procs

16384  87380  1        1       60.00    98449.33
16384  87380  1        1       60.00    98484.92
16384  87380  1        1       60.00    98657.98
16384  87380  1        1       60.00    98467.39

