Thread: 59 messages, 9 authors, 2016-03-23

Suspicious error for CMA stress test

From: Joonsoo Kim <hidden>
Date: 2016-03-14 14:10:45
Also in: linux-mm, lkml

2016-03-14 21:30 GMT+09:00 Vlastimil Babka [off-list ref]:
On 03/14/2016 08:18 AM, Joonsoo Kim wrote:
quoted
On Mon, Mar 14, 2016 at 08:06:16AM +0100, Vlastimil Babka wrote:
quoted
On 03/14/2016 07:49 AM, Joonsoo Kim wrote:
quoted
On Fri, Mar 11, 2016 at 06:07:40PM +0100, Vlastimil Babka wrote:
quoted
On 03/11/2016 04:00 PM, Joonsoo Kim wrote:

How about something like this? Just an idea, probably buggy
(off-by-one etc.).
Should keep the cost away from the <pageblock_order iterations at the
expense of the relatively fewer >pageblock_order iterations.

Hmm... I tested this and found that its code size is a little bit
larger than mine. I'm not sure exactly why this happens, but I guess it
is related to compiler optimization. In this case, I'm in favor of my
implementation because it looks like a better abstraction. It adds one
unlikely branch to the merge loop, but the compiler would optimize it to
check it once.

I would be surprised if compiler optimized that to check it once, as
order increases with each loop iteration. But maybe it's smart
enough to do something like I did by hand? Guess I'll check the
disassembly.

Okay. I used the following slightly optimized version, and I needed to
add 'max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1)'
to yours. Please consider it, too.
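
A minimal sketch of the idea behind that max_order clamp, assuming a toy
model rather than the actual patch: merging below the pageblock order runs
without any extra check, and only the few iterations at or above it pay for
one. Everything here (TOY_MAX_ORDER, TOY_PAGEBLOCK_ORDER, toy_merge(),
toy_extra_check_ok(), slow_checks) is hypothetical and exists only for
illustration; the constants are assumed values, not taken from any config.

```c
#include <assert.h>

/* Toy constants, assumed for illustration; real values depend on the config. */
#define TOY_MAX_ORDER        11u
#define TOY_PAGEBLOCK_ORDER   9u

/* Counts how often the hypothetical slow-path check runs. */
static unsigned int slow_checks;

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Hypothetical stand-in for the extra check; it always allows merging. */
static int toy_extra_check_ok(void)
{
	slow_checks++;
	return 1;
}

/*
 * Toy merge loop: pretend the buddy is free at every order below
 * buddy_free_below, and return the order at which merging stops.
 */
static unsigned int toy_merge(unsigned int order, unsigned int buddy_free_below)
{
	/* Same shape as the clamp quoted above, with toy constants. */
	unsigned int max_order = min_uint(TOY_MAX_ORDER, TOY_PAGEBLOCK_ORDER + 1);

	assert(order < TOY_MAX_ORDER);

	for (;;) {
		/* Fast phase: no extra check while order < max_order - 1. */
		while (order < max_order - 1) {
			if (order >= buddy_free_below)
				return order;
			order++;
		}
		if (max_order >= TOY_MAX_ORDER)
			return order;
		/* Slow phase: order >= TOY_PAGEBLOCK_ORDER here. */
		if (!toy_extra_check_ok())
			return order;
		max_order++;
	}
}
```

In this toy model, toy_merge(0, 3) stops at order 3 without ever reaching
the slow phase, while a page whose buddies are always free goes through the
extra check only once it has reached the pageblock order.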

Hmm, so this is bloat-o-meter on x86_64, gcc 5.3.1. CONFIG_CMA=y

next-20160310 vs my patch (with added min_t as you pointed out):
add/remove: 0/0 grow/shrink: 1/1 up/down: 69/-5 (64)
function                                     old     new   delta
free_one_page                                833     902     +69
free_pcppages_bulk                          1333    1328      -5

next-20160310 vs your patch:
add/remove: 0/0 grow/shrink: 2/0 up/down: 577/0 (577)
function                                     old     new   delta
free_one_page                                833    1187    +354
free_pcppages_bulk                          1333    1556    +223

my patch vs your patch:
add/remove: 0/0 grow/shrink: 2/0 up/down: 513/0 (513)
function                                     old     new   delta
free_one_page                                902    1187    +285
free_pcppages_bulk                          1328    1556    +228

The increase of your version is surprising, wonder what the compiler did.
Otherwise I would like simpler/maintainable version, but this is crazy.
Can you post your results? I wonder if your compiler e.g. decided to stop
inlining page_is_buddy() or something.

Now I see why this happens. I had CONFIG_DEBUG_PAGEALLOC enabled,
and it makes a difference.

I tested on x86_64, gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4.

With CONFIG_CMA + CONFIG_DEBUG_PAGEALLOC
./scripts/bloat-o-meter page_alloc_base.o page_alloc_vlastimil_orig.o
add/remove: 0/0 grow/shrink: 2/0 up/down: 510/0 (510)
function                                     old     new   delta
free_one_page                               1050    1334    +284
free_pcppages_bulk                          1396    1622    +226

./scripts/bloat-o-meter page_alloc_base.o page_alloc_mine.o
add/remove: 0/0 grow/shrink: 2/0 up/down: 351/0 (351)
function                                     old     new   delta
free_one_page                               1050    1230    +180
free_pcppages_bulk                          1396    1567    +171


With CONFIG_CMA + !CONFIG_DEBUG_PAGEALLOC
(pa_b is base, pa_v is yours and pa_m is mine)

./scripts/bloat-o-meter pa_b.o pa_v.o
add/remove: 0/0 grow/shrink: 1/1 up/down: 88/-23 (65)
function                                     old     new   delta
free_one_page                                761     849     +88
free_pcppages_bulk                          1117    1094     -23

./scripts/bloat-o-meter pa_b.o pa_m.o
add/remove: 0/0 grow/shrink: 2/0 up/down: 329/0 (329)
function                                     old     new   delta
free_one_page                                761    1031    +270
free_pcppages_bulk                          1117    1176     +59
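
For reference, the summary line of each report is just arithmetic over the
per-function rows: "up" sums the positive deltas, "down" sums the negative
ones, and the number in parentheses is the net. A small sketch that redoes
this for the two !CONFIG_DEBUG_PAGEALLOC tables above; the struct and
function names are made up for illustration and this is not how
scripts/bloat-o-meter itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a size comparison: symbol name, old size, new size (bytes). */
struct sym_size {
	const char *name;
	long old_size;
	long new_size;
};

/* Totals as they appear in the "up/down: X/Y (Z)" summary line. */
struct bloat_summary {
	long up;
	long down;
	long net;
};

static struct bloat_summary summarize(const struct sym_size *syms, size_t n)
{
	struct bloat_summary s = { 0, 0, 0 };

	assert(syms != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		long delta = syms[i].new_size - syms[i].old_size;

		if (delta > 0)
			s.up += delta;
		else
			s.down += delta;
		s.net += delta;
	}
	return s;
}

/* pa_b.o vs pa_v.o, numbers copied from the table above. */
static const struct sym_size pa_v_rows[] = {
	{ "free_one_page",       761,  849 },
	{ "free_pcppages_bulk", 1117, 1094 },
};

/* pa_b.o vs pa_m.o, numbers copied from the table above. */
static const struct sym_size pa_m_rows[] = {
	{ "free_one_page",       761, 1031 },
	{ "free_pcppages_bulk", 1117, 1176 },
};
```

Summarizing pa_v_rows gives up/down 88/-23 with a net of 65, and pa_m_rows
gives 329/0 with a net of 329, matching the reported summary lines.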

There is still a difference, but it is smaller than before.
Maybe we are still using different configurations. Could you
check whether CONFIG_DEBUG_VM is enabled? In my case, it's not
enabled. And do you think this bloat isn't acceptable?

Thanks.