RE: frontswap/zcache: xvmalloc discussion

From: Dan Magenheimer <hidden>
Date: 2011-06-30 02:31:47

> > One neat feature of frontswap (and the underlying Transcendent
> > Memory definition) is that ANY PUT may be rejected**.  So zcache
> > could keep track of the distribution of "zsize" and if the number
> > of pages with zsize>PAGE_SIZE/2 greatly exceeds the number of pages
> > with "complementary zsize", the frontswap code in zcache can reject
> > the larger pages until balance/sanity is restored.
> >
> > Might that help?
>
> We could do that, but I imagine that would let a lot of pages through
> on most workloads.  Ideally, I'd like to find a solution that would
> capture and (efficiently) store pages that compressed to up to 80% of
> their original size.

After thinking about this a bit, I have to disagree.  Only for
workloads where the vast majority of pages have zsize>PAGE_SIZE/2
would this let a lot of pages through.  So if you are correct that
LZO is poor at compression and a large majority of pages are in
this category, some page-crossing scheme is necessary.  However,
that isn't what I've seen... the zsize of many swap pages is
quite small.
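
For concreteness, the balance-based rejection idea quoted above might
look roughly like the sketch below.  This is not zcache code: the
names (balance_accept_put, nr_large, nr_small) are hypothetical, a
4K PAGE_SIZE is assumed, "complementary zsize" is simplified to
"zsize <= PAGE_SIZE/2", and "greatly exceeds" is arbitrarily taken
to mean "more than twice as many".

```c
#include <stdbool.h>

#define PAGE_SIZE        4096u  /* assumed 4K pages */
#define IMBALANCE_FACTOR 2u     /* assumed meaning of "greatly exceeds" */

/* Hypothetical counters of accepted pages, split at PAGE_SIZE/2. */
static unsigned long nr_large;  /* zsize >  PAGE_SIZE/2 */
static unsigned long nr_small;  /* zsize <= PAGE_SIZE/2 */

/*
 * Decide whether a put of a page that compressed to zsize bytes
 * should be accepted.  Small pages are always accepted; large pages
 * are rejected while they outnumber small pages by more than
 * IMBALANCE_FACTOR to one.
 */
static bool balance_accept_put(unsigned int zsize)
{
	if (zsize > PAGE_SIZE / 2) {
		if (nr_large > IMBALANCE_FACTOR * nr_small)
			return false;   /* reject until balance is restored */
		nr_large++;
	} else {
		nr_small++;
	}
	return true;
}
```

A real implementation would also have to decrement the counters when
pages are flushed or read back, and would probably want some slack so
that the first few large pages are not rejected on an empty pool.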

So before commencing on a major compression rewrite, it might
be a good idea to measure distribution of zsize for swap pages
on a large variety of workloads.  This could probably be done
by adding a code snippet in the swap path of a normal (non-zcache)
kernel.  And if the distribution is bad, replacing LZO with a
higher-compression-but-slower algorithm might be the best answer,
since zcache is replacing VERY slow swap-device reads/writes with
reasonably fast compression/decompression.  I certainly think
that an algorithm approaching an average 50% compression ratio
should be the goal.

FWIW, I've measured the distribution of zsize (pages compressed
with frontswap) on my favorite workload (kernel "make -j2" on
mem=512M to force lots of swapping) and the mean is small, close
to 1K (PAGE_SIZE/4).  I've added some sysfs shows for both
the current and cumulative distribution (0-63 bytes, 64-127
bytes, ..., 4032-4095 bytes) for the next update.
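
The bucketing behind such a distribution can be sketched as below.
This is only an illustration of the 64-byte buckets described above,
assuming a 4K PAGE_SIZE; the array and function names are made up
for the example and are not the ones in the zcache patch, and the
sysfs plumbing is left out.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE      4096u                       /* assumed 4K pages */
#define ZSIZE_BUCKET   64u                         /* bucket width in bytes */
#define ZSIZE_NBUCKETS (PAGE_SIZE / ZSIZE_BUCKET)  /* 64 buckets */

/* Hypothetical counters: pages currently stored vs. ever stored. */
static unsigned long zsize_curr_hist[ZSIZE_NBUCKETS];
static unsigned long zsize_cumul_hist[ZSIZE_NBUCKETS];

/* Map a compressed size to its bucket: 0-63 -> 0, ..., 4032-4095 -> 63. */
static size_t zsize_to_bucket(unsigned int zsize)
{
	assert(zsize < PAGE_SIZE);
	return zsize / ZSIZE_BUCKET;
}

/* Account for a page that was just stored with the given zsize. */
static void zsize_hist_add(unsigned int zsize)
{
	size_t b = zsize_to_bucket(zsize);

	zsize_curr_hist[b]++;
	zsize_cumul_hist[b]++;
}

/* Account for a stored page going away; the cumulative count stays. */
static void zsize_hist_del(unsigned int zsize)
{
	size_t b = zsize_to_bucket(zsize);

	assert(zsize_curr_hist[b] > 0);
	zsize_curr_hist[b]--;
}
```

With this layout a mean zsize near 1K lands around bucket 16, and the
rejection cutoff of (PAGE_SIZE*7)/8 = 3584 bytes falls at the start
of bucket 56.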

I tried your program on the text of Moby Dick and the mean
was still under 1500 bytes ((3*PAGE_SIZE)/8) with a good
broad distribution for zsize.  I also tried your program on
gzip'ed Moby Dick and zcache correctly rejects most of the
pages as incompressible and does fine on other swapped pages.

So I can't reproduce what you are seeing.  Somehow you
must create and swap a set of pages with a zsize distribution
almost entirely between PAGE_SIZE/2 and (PAGE_SIZE*7)/8.
How did you do that?

FYI, I also added a sysfs settable for zv_max_page_size...
if zsize exceeds it, the page is rejected.  It defaults to
(PAGE_SIZE*7)/8, which was the non-settable hardwired
value before.
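
In rough terms the check amounts to the sketch below.  Only the name
zv_max_page_size and its default come from the description above;
the helper zv_should_reject and the 4K PAGE_SIZE are assumptions for
illustration, and the sysfs store/show code is omitted.

```c
#include <stdbool.h>

#define PAGE_SIZE 4096u  /* assumed 4K pages */

/* Settable threshold; the default (PAGE_SIZE*7)/8 is 3584 bytes. */
static unsigned int zv_max_page_size = (PAGE_SIZE * 7) / 8;

/* Hypothetical helper: a page is rejected if its zsize exceeds the limit. */
static bool zv_should_reject(unsigned int zsize)
{
	return zsize > zv_max_page_size;
}
```

Lowering the threshold trades stored pages for density: at
PAGE_SIZE/2, for example, every page that compresses to more than
2048 bytes would go to the swap device instead.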

Dan

