Re: [PATCH v7 11/12] zsmalloc: page migration support

[PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 01/12] mm: use put_page to free page instead of putback_lru_page · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 03/12] mm: balloon: use general non-lru movable page feature · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 05/12] zsmalloc: use bit_spin_lock · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 04/12] zsmalloc: keep max_object in size_class · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 07/12] zsmalloc: factor page chain functionality out · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2016-05-31
Re: [PATCH v7 11/12] zsmalloc: page migration support · Vlastimil Babka <hidden> · 2016-06-01
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2016-06-02
Re: [PATCH v7 11/12] zsmalloc: page migration support · Vlastimil Babka <hidden> · 2016-06-02
Re: [PATCH v7 11/12] zsmalloc: page migration support · Andrew Morton <akpm@linux-foundation.org> · 2016-06-01
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2016-06-02
Re: [PATCH v7 11/12] zsmalloc: page migration support · Chulmin Kim <hidden> · 2017-01-19
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-19
Re: [PATCH v7 11/12] zsmalloc: page migration support · Chulmin Kim <hidden> · 2017-01-19
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-19
Re: [PATCH v7 11/12] zsmalloc: page migration support · Chulmin Kim <hidden> · 2017-01-19
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-23
Re: [PATCH v7 11/12] zsmalloc: page migration support · Sergey Senozhatsky <hidden> · 2017-01-23
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-23
Re: [PATCH v7 11/12] zsmalloc: page migration support · Chulmin Kim <hidden> · 2017-01-25
Re: [PATCH v7 11/12] zsmalloc: page migration support · Sergey Senozhatsky <hidden> · 2017-01-25
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-25
Re: [PATCH v7 11/12] zsmalloc: page migration support · Dan Streetman <hidden> · 2017-01-26
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-01-31
Re: [PATCH v7 11/12] zsmalloc: page migration support · Dan Streetman <hidden> · 2017-01-31
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-02-01
Re: [PATCH v7 11/12] zsmalloc: page migration support · Dan Streetman <hidden> · 2017-02-01
Re: [PATCH v7 11/12] zsmalloc: page migration support · Minchan Kim <minchan@kernel.org> · 2017-02-02
[PATCH v7 09/12] zsmalloc: separate free_zspage from putback_zspage · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 08/12] zsmalloc: introduce zspage structure · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 12/12] zram: use __GFP_MOVABLE for memory allocation · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 10/12] zsmalloc: use freeobj for index · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 06/12] zsmalloc: use accessor · Minchan Kim <minchan@kernel.org> · 2016-05-31
[PATCH v7 02/12] mm: migrate: support non-lru movable page migration · Minchan Kim <minchan@kernel.org> · 2016-05-31
Re: [PATCH v7 00/12] Support non-lru page migration · Andrew Morton <akpm@linux-foundation.org> · 2016-06-01
Re: [PATCH v7 00/12] Support non-lru page migration · Daniel Vetter <hidden> · 2016-06-01
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-02
Re: [PATCH v7 00/12] Support non-lru page migration · Sergey Senozhatsky <hidden> · 2016-06-15
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-15
Re: [PATCH v7 00/12] Support non-lru page migration · Sergey Senozhatsky <hidden> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Sergey Senozhatsky <hidden> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Sergey Senozhatsky <hidden> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Sergey Senozhatsky <hidden> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Minchan Kim <minchan@kernel.org> · 2016-06-16
Re: [PATCH v7 00/12] Support non-lru page migration · Joonsoo Kim <hidden> · 2016-06-17

From: Minchan Kim <minchan@kernel.org>
Date: 2017-01-23 05:22:47
Subsystem: block layer, the rest, zram compressed ram block device drvier · Maintainers: Jens Axboe, Linus Torvalds, Minchan Kim, Sergey Senozhatsky

Hi Chulmin,

On Thu, Jan 19, 2017 at 03:16:11AM -0500, Chulmin Kim wrote:

On 01/19/2017 01:21 AM, Minchan Kim wrote:

quoted

On Wed, Jan 18, 2017 at 10:39:15PM -0500, Chulmin Kim wrote:

quoted

On 01/18/2017 09:44 PM, Minchan Kim wrote:

quoted

Hello Chulmin,

On Wed, Jan 18, 2017 at 07:13:21PM -0500, Chulmin Kim wrote:

quoted

Hello. Minchan, and all zsmalloc guys.

I have a quick question.
Is zsmalloc considering memory barrier things correctly?

AFAIK, in ARM64,
zsmalloc relies on dmb operation in bit_spin_unlock only.
(It seems that dmb operations in spinlock functions are being prepared,
but let is be aside as it is not merged yet.)

If I am correct,
migrating a page in a zspage filled with free objs
may cause the corruption cause bit_spin_unlock will not be executed at all.

I am not sure this is enough memory barrier for zsmalloc operations.

Can you enlighten me?

Do you mean bit_spin_unlock is broken or zsmalloc locking scheme broken?
Could you please describe what you are concerning in detail?
It would be very helpful if you say it with a example!

Sorry for ambiguous expressions. :)

Recently,
I found multiple zsmalloc corruption cases which have garbage idx values in
in zspage->freeobj. (not ffffffff (-1) value.)

Honestly, I have no clue yet.

I suspect the case when zspage migrate a zs sub page filled with free
objects (so that never calls unpin_tag() which has memory barrier).


Assume the page (zs subpage) being migrated has no allocated zs object.

S : zs subpage
D : free page


CPU A : zs_page_migrate()		CPU B : zs_malloc()
---------------------			-----------------------------


migrate_write_lock()
spin_lock()

memcpy(D, S, PAGE_SIZE)   -> (1)
replace_sub_page()

putback_zspage()
spin_unlock()
migrate_write_unlock()
				
				spin_lock()
				obj_malloc()
				--> (2-a) allocate obj in D
				--> (2-b) set freeobj using
    						the first 8 bytes of
						the allocated obj
				record_obj()
				spin_unlock



I think the locking has no problem, but memory ordering.
I doubt whether (2-b) in CPU B really loads the data stored by (1).

If it doesn't, set_freeobj in (2-b) will corrupt zspage->freeobj.
After then, we will see corrupted object sooner or later.

Thanks for the example.
When I cannot understand what you are pointing out.

In above example, two CPU use same spin_lock of a class so store op
by memcpy in the critical section should be visible by CPU B.

Am I missing your point?


No, you are right.
I just pointed it prematurely after only checking that arm64's spinlock
seems not issue "dmb" operation explicitly.
I am the one missed the basics.

Anyway, I will let you know the situation when it gets more clear.

Yeb, Thanks.

Perhaps, did you tried flush page before the writing?
I think arm64 have no d-cache alising problem but worth to try it.
Who knows :)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 46da1c4..a3a5520 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c

@@ -612,6 +612,8 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	unsigned long element;
 
 	page = bvec->bv_page;
+	flush_dcache_page(page);
+
 	if (is_partial_io(bvec)) {
 		/*
 		 * This is a partial IO. We need to read the full page

--

To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help