Thread (49 messages) 49 messages, 8 authors, 2017-02-02

Re: [PATCH v7 11/12] zsmalloc: page migration support

From: Minchan Kim <minchan@kernel.org>
Date: 2017-01-23 05:22:47
Subsystem: block layer, the rest, zram compressed ram block device drvier · Maintainers: Jens Axboe, Linus Torvalds, Minchan Kim, Sergey Senozhatsky

Hi Chulmin,

On Thu, Jan 19, 2017 at 03:16:11AM -0500, Chulmin Kim wrote:
On 01/19/2017 01:21 AM, Minchan Kim wrote:
quoted
On Wed, Jan 18, 2017 at 10:39:15PM -0500, Chulmin Kim wrote:
quoted
On 01/18/2017 09:44 PM, Minchan Kim wrote:
quoted
Hello Chulmin,

On Wed, Jan 18, 2017 at 07:13:21PM -0500, Chulmin Kim wrote:
quoted
Hello. Minchan, and all zsmalloc guys.

I have a quick question.
Is zsmalloc considering memory barrier things correctly?

AFAIK, in ARM64,
zsmalloc relies on dmb operation in bit_spin_unlock only.
(It seems that dmb operations in spinlock functions are being prepared,
but let is be aside as it is not merged yet.)

If I am correct,
migrating a page in a zspage filled with free objs
may cause the corruption cause bit_spin_unlock will not be executed at all.

I am not sure this is enough memory barrier for zsmalloc operations.

Can you enlighten me?
Do you mean bit_spin_unlock is broken or zsmalloc locking scheme broken?
Could you please describe what you are concerning in detail?
It would be very helpful if you say it with a example!
Sorry for ambiguous expressions. :)

Recently,
I found multiple zsmalloc corruption cases which have garbage idx values in
in zspage->freeobj. (not ffffffff (-1) value.)

Honestly, I have no clue yet.

I suspect the case when zspage migrate a zs sub page filled with free
objects (so that never calls unpin_tag() which has memory barrier).


Assume the page (zs subpage) being migrated has no allocated zs object.

S : zs subpage
D : free page


CPU A : zs_page_migrate()		CPU B : zs_malloc()
---------------------			-----------------------------


migrate_write_lock()
spin_lock()

memcpy(D, S, PAGE_SIZE)   -> (1)
replace_sub_page()

putback_zspage()
spin_unlock()
migrate_write_unlock()
				
				spin_lock()
				obj_malloc()
				--> (2-a) allocate obj in D
				--> (2-b) set freeobj using
    						the first 8 bytes of
						the allocated obj
				record_obj()
				spin_unlock



I think the locking has no problem, but memory ordering.
I doubt whether (2-b) in CPU B really loads the data stored by (1).

If it doesn't, set_freeobj in (2-b) will corrupt zspage->freeobj.
After then, we will see corrupted object sooner or later.
Thanks for the example.
When I cannot understand what you are pointing out.

In above example, two CPU use same spin_lock of a class so store op
by memcpy in the critical section should be visible by CPU B.

Am I missing your point?

No, you are right.
I just pointed it prematurely after only checking that arm64's spinlock
seems not issue "dmb" operation explicitly.
I am the one missed the basics.

Anyway, I will let you know the situation when it gets more clear.
Yeb, Thanks.

Perhaps, did you tried flush page before the writing?
I think arm64 have no d-cache alising problem but worth to try it.
Who knows :)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 46da1c4..a3a5520 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -612,6 +612,8 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	unsigned long element;
 
 	page = bvec->bv_page;
+	flush_dcache_page(page);
+
 	if (is_partial_io(bvec)) {
 		/*
 		 * This is a partial IO. We need to read the full page

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help