Thread (50 messages), 7 authors, 2015-11-05

Re: [MM PATCH V4 6/6] slub: optimize bulk slowpath free by detached freelist

From: Joonsoo Kim <hidden>
Date: 2015-11-05 05:09:18
Also in: linux-mm

On Wed, Oct 21, 2015 at 09:57:09AM +0200, Jesper Dangaard Brouer wrote:
On Wed, 14 Oct 2015 14:15:25 +0900
Joonsoo Kim [off-list ref] wrote:
quoted
On Tue, Sep 29, 2015 at 05:48:26PM +0200, Jesper Dangaard Brouer wrote:
quoted
This change focuses on improving the speed of object freeing in the
"slowpath" of kmem_cache_free_bulk.

The calls slab_free (fastpath) and __slab_free (slowpath) have been
extended with support for bulk free, which amortizes the overhead of
the (locked) cmpxchg_double.
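
The amortization can be pictured with a small userspace sketch. It is
not the SLUB code: struct obj, shared_head and splice_chain() are
hypothetical names, and a plain C11 compare-and-swap on a single
pointer stands in for the kernel's cmpxchg_double (which also updates
the slab counters). The point is only that a pre-linked chain of N
objects is published with one atomic operation instead of N.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical free object: the next pointer lives inside the object. */
struct obj {
	struct obj *next;
};

/* Shared freelist head; the only location that needs synchronization. */
static _Atomic(struct obj *) shared_head;

/*
 * Splice an already linked chain (first ... last) onto the shared list.
 * The chain was linked privately, so only the final head update is
 * atomic.  Returns the number of CAS attempts that were needed.
 */
static int splice_chain(struct obj *first, struct obj *last)
{
	struct obj *old = atomic_load(&shared_head);
	int attempts = 0;

	assert(first && last);

	do {
		/* Plain store: nobody else can see the chain yet. */
		last->next = old;
		attempts++;
		/* On failure, 'old' is reloaded with the current head. */
	} while (!atomic_compare_exchange_weak(&shared_head, &old, first));

	return attempts;
}
```

Freeing the same objects one at a time would pay for at least one
such atomic operation per object; here the cost is paid once per
chain, however many objects the chain holds.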

To use the new bulking feature, we build what I call a detached
freelist.  The detached freelist takes advantage of three properties:

 1) the free function call owns the object that is about to be freed,
    thus writing into this memory is synchronization-free.

 2) many freelists can co-exist side-by-side in the same slab-page
    each with a separate head pointer.

 3) it is the visibility of the head pointer that needs synchronization.

Given these properties, the brilliant part is that the detached
freelist can be constructed without any need for synchronization.  The
freelist is constructed directly in the page objects, without any
synchronization needed.  The detached freelist is allocated on the
stack of the function call kmem_cache_free_bulk.  Thus, the freelist
head pointer is not visible to other CPUs.

All objects in a SLUB freelist must belong to the same slab-page.
Thus, constructing the detached freelist is about matching objects
that belong to the same slab-page.  The bulk free array is scanned in
a progressive manner with a limited look-ahead facility.
[...]
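
The construction described above can be sketched in userspace C. This
is an illustration under stated assumptions, not the patch itself: the
names page_of(), set_next(), get_next() and SKETCH_PAGE_SIZE are
hypothetical, slab-pages are modelled as 4096-byte aligned regions
(the kernel would look up the page with virt_to_head_page()), and the
lookahead limit of 3 is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: one slab-page is one 4096-byte region. */
#define SKETCH_PAGE_SIZE 4096u

struct detached_freelist {
	uintptr_t page;	/* slab-page all linked objects belong to */
	void *freelist;	/* head pointer, lives on the caller's stack */
	void *tail;	/* last object of the list */
	int cnt;	/* number of linked objects */
};

static uintptr_t page_of(const void *obj)
{
	return (uintptr_t)obj & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* The caller owns the object, so writing into it needs no locking. */
static void set_next(void *obj, void *next)
{
	memcpy(obj, &next, sizeof(next));
}

static void *get_next(const void *obj)
{
	void *next;

	memcpy(&next, obj, sizeof(next));
	return next;
}

/*
 * Scan p[0..size) from the end and link every object that shares a
 * page with the last unprocessed one.  Linked entries are cleared in
 * the array.  Returns how many leading entries still need a later
 * pass (0 when nothing was skipped).
 */
static size_t build_detached_freelist(void **p, size_t size,
				      struct detached_freelist *df)
{
	size_t first_skipped = 0;
	int lookahead = 3;
	void *obj = NULL;

	assert(df);
	df->page = 0;
	df->freelist = df->tail = NULL;
	df->cnt = 0;

	/* Find the last entry that has not been processed yet. */
	while (size && !(obj = p[--size]))
		;
	if (!obj)
		return 0;

	/* Start a new list with that object. */
	set_next(obj, NULL);
	df->page = page_of(obj);
	df->freelist = df->tail = obj;
	df->cnt = 1;
	p[size] = NULL;

	while (size) {
		obj = p[--size];
		if (!obj)
			continue;	/* already processed */

		if (page_of(obj) == df->page) {
			set_next(obj, df->freelist);
			df->freelist = obj;
			df->cnt++;
			p[size] = NULL;
			continue;
		}

		/* Different page: give up after a few mismatches. */
		if (!--lookahead)
			break;
		if (!first_skipped)
			first_skipped = size + 1;
	}

	return first_skipped;
}
```

A caller would loop, passing the returned value back in as the new
size and handing each completed list to the slowpath free, until the
function returns 0.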

quoted
Hello, Jesper.

AFAIK, it is uncommon to clear the pointers to objects in the argument
array. At the least, it would be better to comment on it somewhere.
In this case, I think clearing the array is a good thing, as
using/referencing objects after they have been free'ed is a bug (which
can be hard to detect).
Okay.
quoted
Or, how about removing the lookahead facility? Does it have any real benefit?
In my earlier patch series I had a version with and without the lookahead
facility, just so I could benchmark the difference.  With Alex's help
we/I tuned the code with the lookahead feature to be just as fast.
Thus, I merged the two patches. (I also tested the worst case [1].)

I do wonder if the lookahead has any real benefit.  In micro
benchmarking it might be "just-as-fast", but I do suspect (just from the
code size increase) that it can affect real use-cases... Should we remove it?

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test03.c
If it were not implemented yet, I would say to start with the simple
one first. But now we already have a well-implemented one, so we don't
need to remove it. :)

Thanks.