Thread: 63 messages, 7 authors, 2021-07-29

Re: [RFC v2 00/34] SLUB: reduce irq disabled scope and make it RT compatible

From: Vlastimil Babka <hidden>
Date: 2021-07-18 07:42:44
Also in: lkml

On 7/3/21 9:24 AM, Mike Galbraith wrote:
On Fri, 2021-07-02 at 20:29 +0200, Sebastian Andrzej Siewior wrote:
I replaced my slub changes with slub-local-lock-v2r3.
I haven't seen any complaints from lockdep or so which is good. Then I
did this with RT enabled (and no debug):
Below is some raw hackbench data from my little i4790 desktop box.  It
says we'll definitely still want list_lock to be raw.

Hi Mike, thanks a lot for the testing, and sorry for the late reply.

Instead of making list_lock raw, did you try leaving out the last patch
(the local lock one), as I suggested in my reply to bigeasy? I think
that would reduce the RT-specific overhead more than a raw list_lock
does, the result should still be RT compatible, and it would also
sidestep the bugs you found there (which I'll look into).

Thanks,
Vlastimil

It also appears to be saying that there's something RT specific to
stare at in addition to the list_lock business, but add a pinch of salt
to that due to the config of the virgin(ish) tip tree being much
lighter than the enterprise(ish) config of the tip-rt tree.

perf stat -r10 hackbench -s4096 -l500
full warmup, record, repeat twice for elapsed

5.13.0.g60ab3ed-tip-rt
          8,898.51 msec task-clock                #    7.525 CPUs utilized            ( +-  0.33% )
           368,922      context-switches          #    0.041 M/sec                    ( +-  5.20% )
            42,281      cpu-migrations            #    0.005 M/sec                    ( +-  5.28% )
            13,180      page-faults               #    0.001 M/sec                    ( +-  0.70% )
    33,343,378,867      cycles                    #    3.747 GHz                      ( +-  0.30% )
    21,656,783,887      instructions              #    0.65  insn per cycle           ( +-  0.67% )
     4,408,569,663      branches                  #  495.428 M/sec                    ( +-  0.73% )
        12,040,125      branch-misses             #    0.27% of all branches          ( +-  2.93% )

           1.18260 +- 0.00473 seconds time elapsed  ( +-  0.40% )
           1.19018 +- 0.00441 seconds time elapsed  ( +-  0.37% ) (repeat)
           1.18260 +- 0.00473 seconds time elapsed  ( +-  0.40% ) (repeat)

5.13.0.g60ab3ed-tip-rt +slub-local-lock-v2r3 list_lock=raw_spinlock_t
          9,642.00 msec task-clock                #    7.521 CPUs utilized            ( +-  0.46% )
           462,091      context-switches          #    0.048 M/sec                    ( +-  4.79% )
            44,411      cpu-migrations            #    0.005 M/sec                    ( +-  4.34% )
            12,980      page-faults               #    0.001 M/sec                    ( +-  0.43% )
    36,098,859,429      cycles                    #    3.744 GHz                      ( +-  0.44% )
    25,462,853,462      instructions              #    0.71  insn per cycle           ( +-  0.50% )
     5,260,898,360      branches                  #  545.623 M/sec                    ( +-  0.52% )
        16,088,686      branch-misses             #    0.31% of all branches          ( +-  2.02% )

           1.28207 +- 0.00568 seconds time elapsed  ( +-  0.44% )
           1.28744 +- 0.00713 seconds time elapsed  ( +-  0.55% ) (repeat)
           1.28085 +- 0.00850 seconds time elapsed  ( +-  0.66% ) (repeat)

5.13.0.g60ab3ed-tip-rt +slub-local-lock-v2r3 list_lock=spinlock_t
         10,004.89 msec task-clock                #    6.029 CPUs utilized            ( +-  1.37% )
           654,311      context-switches          #    0.065 M/sec                    ( +-  5.16% )
           211,070      cpu-migrations            #    0.021 M/sec                    ( +-  1.38% )
            13,262      page-faults               #    0.001 M/sec                    ( +-  0.79% )
    36,585,914,931      cycles                    #    3.657 GHz                      ( +-  1.35% )
    27,682,240,511      instructions              #    0.76  insn per cycle           ( +-  1.06% )
     5,766,064,432      branches                  #  576.325 M/sec                    ( +-  1.11% )
        24,269,069      branch-misses             #    0.42% of all branches          ( +-  2.03% )

            1.6595 +- 0.0116 seconds time elapsed  ( +-  0.70% )
            1.6270 +- 0.0180 seconds time elapsed  ( +-  1.11% ) (repeat)
            1.6213 +- 0.0150 seconds time elapsed  ( +-  0.93% ) (repeat)
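
To put the three RT runs above side by side, the following is a minimal Python sketch that computes the change in elapsed time and cpu-migrations relative to the plain tip-rt run. It uses only the first elapsed sample of each run, copied from the output above. The dictionary layout and the helper name `pct_change` are illustrative assumptions, not part of perf or hackbench.

```python
# Figures copied from the perf stat output above; only the first
# "seconds time elapsed" sample of each run is used.
runs = {
    "tip-rt": (1.18260, 42_281),
    "tip-rt +v2r3, list_lock=raw_spinlock_t": (1.28207, 44_411),
    "tip-rt +v2r3, list_lock=spinlock_t": (1.6595, 211_070),
}


def pct_change(new, base):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


base_elapsed, base_migrations = runs["tip-rt"]

# Map each run name to (elapsed change %, cpu-migrations change %).
changes = {
    name: (pct_change(elapsed, base_elapsed), pct_change(migr, base_migrations))
    for name, (elapsed, migr) in runs.items()
}

for name, (d_elapsed, d_migr) in changes.items():
    print(f"{name:42s} elapsed {d_elapsed:+6.1f}%  cpu-migrations {d_migr:+7.1f}%")
```

On these figures the raw_spinlock_t variant costs roughly 8% in elapsed time, while the spinlock_t variant costs roughly 40% and shows about five times the cpu-migrations, which is in line with the remark that list_lock should stay raw.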

virgin(ish) tip
5.13.0.g60ab3ed-tip
          7,320.67 msec task-clock                #    7.792 CPUs utilized            ( +-  0.31% )
           221,215      context-switches          #    0.030 M/sec                    ( +-  3.97% )
            16,234      cpu-migrations            #    0.002 M/sec                    ( +-  4.07% )
            13,233      page-faults               #    0.002 M/sec                    ( +-  0.91% )
    27,592,205,252      cycles                    #    3.769 GHz                      ( +-  0.32% )
     8,309,495,040      instructions              #    0.30  insn per cycle           ( +-  0.37% )
     1,555,210,607      branches                  #  212.441 M/sec                    ( +-  0.42% )
         5,484,209      branch-misses             #    0.35% of all branches          ( +-  2.13% )

           0.93949 +- 0.00423 seconds time elapsed  ( +-  0.45% )
           0.94608 +- 0.00384 seconds time elapsed  ( +-  0.41% ) (repeat)
           0.94422 +- 0.00410 seconds time elapsed  ( +-  0.43% ) (repeat)

5.13.0.g60ab3ed-tip +slub-local-lock-v2r3
          7,343.57 msec task-clock                #    7.776 CPUs utilized            ( +-  0.44% )
           223,044      context-switches          #    0.030 M/sec                    ( +-  3.02% )
            16,057      cpu-migrations            #    0.002 M/sec                    ( +-  4.03% )
            13,164      page-faults               #    0.002 M/sec                    ( +-  0.97% )
    27,684,906,017      cycles                    #    3.770 GHz                      ( +-  0.45% )
     8,323,273,871      instructions              #    0.30  insn per cycle           ( +-  0.28% )
     1,556,106,680      branches                  #  211.901 M/sec                    ( +-  0.31% )
         5,463,468      branch-misses             #    0.35% of all branches          ( +-  1.33% )

           0.94440 +- 0.00352 seconds time elapsed  ( +-  0.37% )
           0.94830 +- 0.00228 seconds time elapsed  ( +-  0.24% ) (repeat)
           0.93813 +- 0.00440 seconds time elapsed  ( +-  0.47% ) (repeat)
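
As a rough cross-check of the non-RT numbers and of the RT versus non-RT gap, here is a small Python sketch. It recomputes perf's "insn per cycle" column as instructions divided by cycles, compares the instruction counts of the two unpatched trees, and computes the elapsed-time change from applying the series on the non-RT tree. All figures are copied from the output above; the variable and function names are illustrative assumptions.

```python
# Counters copied from the perf stat output above (unpatched trees).
tip_rt = {"cycles": 33_343_378_867, "instructions": 21_656_783_887}
tip = {"cycles": 27_592_205_252, "instructions": 8_309_495_040}


def insn_per_cycle(counters):
    """Instructions retired per cycle, as shown in perf's comment column."""
    return counters["instructions"] / counters["cycles"]


ipc_rt = insn_per_cycle(tip_rt)
ipc_tip = insn_per_cycle(tip)

# How many times more instructions the RT tree retires for the same workload.
insn_ratio = tip_rt["instructions"] / tip["instructions"]

# Non-RT elapsed time without and with the series (first sample of each run).
elapsed_tip, elapsed_tip_slub = 0.93949, 0.94440
delta_pct = (elapsed_tip_slub - elapsed_tip) / elapsed_tip * 100.0

print(f"insn per cycle: tip-rt {ipc_rt:.2f}, tip {ipc_tip:.2f}")
print(f"instructions, tip-rt / tip: {insn_ratio:.2f}x")
print(f"tip +v2r3 elapsed change: {delta_pct:+.2f}%")
```

The non-RT change comes out at about half a percent, close to the reported run-to-run variation of 0.45%. The RT tree retires about 2.6 times as many instructions, though the two trees were built with different configs, so that ratio deserves the pinch of salt mentioned earlier.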
  