Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists

From: Andi Kleen <hidden>
Date: 2015-10-05 21:20:47
Also in: linux-mm

> My only problem left, is I want a perf measurement that pinpoint these
> kind of spots.  The difference in L1-icache-load-misses were significant
> (1,278,276 vs 2,719,158).  I tried to somehow perf record this with
> different perf events without being able to pinpoint the location (even
> though I know the spot now).  Even tried Andi's ocperf.py... maybe he
> will know what event I should try?

Run pmu-tools toplev.py -l3 with --show-sample. It tells you what the
bottleneck is and, if there is a suitable event, what to sample for. It
even prints the command line.

https://github.com/andikleen/pmu-tools/wiki/toplev-manual#sampling-with-toplev

However, frontend issues are difficult to sample, as they happen very far
away from instruction retirement, where the sampling happens. So you may
have large skid, and the sampling points may be far away from the real
cause. Skylake has new special FRONTEND_* PEBS events for this, but before
Skylake it was often difficult.

BTW, if your main goal is icache, I wrote a gcc patch to help the kernel
by enabling function splitting. Apply the patch in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890 to gcc 5,
make sure 9bebe9e5b0f (now in mainline) is applied, and build with
-freorder-blocks-and-partition. That will split all functions into
statically predicted hot and cold parts and generally relieves
icache pressure. Any testing of this on your workload is welcome.
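
To make "statically predicted hot and cold parts" concrete, here is a
minimal userspace sketch, not kernel code. The names process_packet() and
report_error() are hypothetical and exist only for illustration. The
unlikely() macro mirrors the kernel's branch hint, and the cold attribute
is a standard GCC function attribute. Whether GCC actually moves the cold
code into a separate section depends on the compiler version and options.

```c
/* Sketch only: process_packet() and report_error() are hypothetical. */
#include <stdio.h>

/* Same idea as the kernel's unlikely(): hint that the branch is rare. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Marked cold: a candidate for placement away from the hot text. */
__attribute__((cold, noinline)) int report_error(int len)
{
	fprintf(stderr, "bad length %d\n", len);
	return -1;
}

/* Hot path: the error branch is statically predicted as not taken. */
int process_packet(int len)
{
	if (unlikely(len < 0))
		return report_error(len);
	return len * 2;
}
```

Compiling this with something like
"gcc -O2 -freorder-blocks-and-partition -c example.c" and looking at the
section headers with "objdump -h example.o" should show whether a separate
cold text section was emitted. That is an expectation, not something
measured here.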

-Andi
