Re: Possible leak during reshaping layout

From: Kenny Root <hidden>
Date: 2014-07-21 15:16:43

On Mon, Jul 21, 2014 at 05:26:51PM +1000, NeilBrown wrote:

On Sat, 19 Jul 2014 22:27:00 -0700 Kenny Root [off-list ref] wrote:

quoted

I may have stumbled into a kernel memory leak during reshaping of a RAID 10
from offset to near layout:

...

quoted

      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
    60511744 60511219  29%    0.25K 2183366       32  17466928K kmalloc-256
    193408  82391  42%    0.06K   3022       64     12088K kmalloc-64
    154880 129949  83%    0.03K   1210      128      4840K kmalloc-32
    154624 152783  98%    0.01K    302      512      1208K kmalloc-8
    144160 143412  99%    0.02K    848      170      3392K fsnotify_event_holder
    125103  34053  27%    0.08K   2453       51      9812K selinux_inode_security

This is very suspicious.
As you might imagine, it is not possible for a slab to use more memory than
is physically available.
It claims there are 60511219 active objects out of a total of 60511744.
I calculate that as 99.9991324%, but it suggests 29%.

If there were 32 OBJ/SLAB, then the slabs must be 8K.  This is possible, but
they are 4K on my machine, and all the other slabs you listed are too.
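The arithmetic above can be rechecked with a short script. This is only a sketch: the helper name `slab_stats` is made up for illustration, and the inputs are copied from the kmalloc-256 row of the slabtop output quoted earlier.

```python
def slab_stats(active, total, obj_size, objs_per_slab, num_slabs):
    """Recompute the slabtop columns from raw counts (hypothetical helper)."""
    use_pct = 100.0 * active / total            # share of objects in use
    slab_bytes = obj_size * objs_per_slab       # bytes needed per slab
    cache_kib = num_slabs * slab_bytes // 1024  # total cache size in KiB
    return use_pct, slab_bytes, cache_kib


# Values from the kmalloc-256 row: 0.25K objects, 32 per slab.
use_pct, slab_bytes, cache_kib = slab_stats(
    active=60511219, total=60511744,
    obj_size=256, objs_per_slab=32, num_slabs=2183366,
)
print(f"{use_pct:.4f}% active, {slab_bytes} bytes per slab, {cache_kib}K total")
# prints: 99.9991% active, 8192 bytes per slab, 17466928K total
```

The recomputed cache size matches the CACHE SIZE column, so the object and slab counts are consistent with 8K slabs; only the USE percentage disagrees with the counts.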

I've tried a similar reshape on 3.16-rc3 and there is no similar leak.

The only patch since 3.13 that could possibly be relevant is

commit cc13b1d1500656a20e41960668f3392dda9fa6e2
Author: NeilBrown [off-list ref]
Date:   Mon May 5 13:34:37 2014 +1000

    md/raid10: call wait_barrier() for each request submitted.

That might fix a leak.  However the leak it might fix was introduced in
3.14-rc1:
    commit 20d0189b1012a37d2533a87fb451f7852f2418d1
        block: Introduce new bio_split()

So unless Fedora backported one of those but not the other I don't see how
this can be caused by RAID10.

What does /proc/slabinfo contain?  Maybe "slabtop" is presenting it poorly.

I had to restart the machine shortly after this, because it became
pretty unresponsive. However, it still has about a gigabyte of memory
hanging around after the reshape finished:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
5184320 5183608  99%    0.25K 162010       32   1296080K kmalloc-256

Here are the kmallocs from slabinfo at the same time:

slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
...
kmalloc-8192         152    152   8192    4    8 : tunables    0    0    0 : slabdata     38     38      0
kmalloc-4096         830    832   4096    8    8 : tunables    0    0    0 : slabdata    104    104      0
kmalloc-2048        1078   1184   2048   16    8 : tunables    0    0    0 : slabdata     74     74      0
kmalloc-1024        2704   2752   1024   32    8 : tunables    0    0    0 : slabdata     86     86      0
kmalloc-512         4176   4288    512   32    4 : tunables    0    0    0 : slabdata    134    134      0
kmalloc-256       5183621 5184320    256   32    2 : tunables    0    0    0 : slabdata 162010 162010      0
kmalloc-192        13157  13356    192   21    1 : tunables    0    0    0 : slabdata    636    636      0
kmalloc-128        11576  11712    128   32    1 : tunables    0    0    0 : slabdata    366    366      0
kmalloc-96         12558  12558     96   42    1 : tunables    0    0    0 : slabdata    299    299      0
kmalloc-64         99344 100672     64   64    1 : tunables    0    0    0 : slabdata   1573   1573      0
kmalloc-32        132317 135040     32  128    1 : tunables    0    0    0 : slabdata   1055   1055      0
kmalloc-16         61696  61696     16  256    1 : tunables    0    0    0 : slabdata    241    241      0
kmalloc-8          88064  88064      8  512    1 : tunables    0    0    0 : slabdata    172    172      0
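As a cross-check of slabtop against /proc/slabinfo, a line in the version 2.1 format shown above can be parsed directly. This is a minimal sketch: `parse_slabinfo_line` is a hypothetical helper, and the 4 KiB page size is an assumption rather than something read from the system.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_slabinfo_line(line):
    """Parse one data line of slabinfo 2.1 output (hypothetical helper)."""
    head = line.split(":")[0].split()
    name = head[0]
    active_objs, num_objs, objsize, objperslab, pagesperslab = map(int, head[1:6])
    # Fields after "slabdata": <active_slabs> <num_slabs> <sharedavail>
    slabdata = line.split("slabdata")[1].split()
    num_slabs = int(slabdata[1])
    return {
        "name": name,
        "use_pct": 100.0 * active_objs / num_objs,
        "slab_bytes": pagesperslab * PAGE_SIZE,
        "cache_kib": num_slabs * pagesperslab * PAGE_SIZE // 1024,
    }


line = ("kmalloc-256       5183621 5184320    256   32    2 : tunables"
        "    0    0    0 : slabdata 162010 162010      0")
info = parse_slabinfo_line(line)
print(info["name"], info["slab_bytes"], f'{info["cache_kib"]}K')
# prints: kmalloc-256 8192 1296080K
```

With two pages per slab, kmalloc-256 slabs come out at 8K each, and the total agrees with the 1296080K that slabtop reported after the reshape.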

I did try to run ftrace during the reshape to see where the allocations
were being made. One allocation callsite was in bio_alloc_bioset and the
other appeared to be beyond the range of my System.map.

I'll try to reproduce it in a VM with the Fedora kernel and then the
vanilla kernel to see if it's a problem with Fedora first.
