Thread (14 messages) 14 messages, 3 authors, 2021-07-05

Re: [PATCH RFC 0/8] btrfs: experimental compression support for subpage

From: Qu Wenruo <hidden>
Date: 2021-06-24 01:10:16


On 2021/6/24 上午5:23, David Sterba wrote:
On Wed, Jun 23, 2021 at 01:55:21PM +0800, Qu Wenruo wrote:
quoted
This patchset is based on my previously submitted compression refactor,
which is further based on subpage patchset.

It can be fetched from the following repo:
https://github.com/adam900710/linux/tree/compression

It only has been lightly tested, no stress test/full fstests run yet.
The reason for such a unstable patchset is to get feedback on how to
continue the development, aka the feedback for the last patch.

The first 7 patches are mostly cleanups to make compression path to be
more subpage compatible.

The RFC is for the last patch, which will enable subpage compression in
a pretty controversial way.

The reason is copied from that patch:
This mechanism allows btrfs to continue handling next ranges, without
waiting for the time consuming compression.

But this has a problem for subpage case, as we could have the following
delalloc range for a page:

0		32K		64K
|	|///////|	|///////|
		\- A		\- B

In above case, if we pass both range to cow_file_range_async(), both
range A and range B will try to unlock the full page [0, 64K).

And which finishes later than the other range will try to do other page
operatioins like end_page_writeback() on a unlocked page, triggering VM
layer BUG_ON().

Currently I don't have any perfect solution to this, but two
workarounds:

- Only allow compression for fully page aligned range

   This is what I did in this patch.
   By this, the compressed range can exclusively lock the first page
   (and other pages), so they are completely safe to do whatever they
   want.
   The problem is, we will not compress a lot of small writes.
   This is especially problematic as our target page size is 64K, not
   a small size.

- Make cow_file_range_async() to behave like cow_file_range() for
   subpage

   This needs several behavier change, and are all subpage specific:
   * Skip the first page of the range when finished
     Just like cow_file_range()
   * Have a way to wait for the async_cow to finish before handling the
     next delalloc range

   The biggest problem here is the performance impact.
   Although by this we can compress all sector aligned ranges, we will
   waste time waiting for the async_cow to finish.
   This is completely denying the meaning of "async" part.
   Now to mention there are tons of code needs to be changed.

Thus I choose the current way to only compress ranges which is fully
page aligned.
The cost is we will skip a lot of small writes for 64K page size.
Any feedback on this part would be pretty helpful.
As another step, progressing towards full subpage support I think that
the limiting compression to the full 64k page is acceptable. Better than
nothing and without intrusive changes described above.

There still might be some odd case even with the whole page, I'm not
sure how exactly it would work with 4k/64k but there are some checks
inside compression anyway if the size is getting smaller and then
bailing out uncompressed anyway. So in general fallback to uncompressed
data happens already.
Well well well, just come by a new shinny and maybe perfect idea to make
it better.

- For async cow submission
   The main problem described above can be solved by
   btrfs_subpage::writers and a change in how we run delalloc range.

   Currently we find and lock a delalloc range, then run the locked range
   immediately.
   This makes any submitted async cow can finish before we reach next
   range, thus even we increase subage::writers it makes no sense.

   But if we find and lock *all* delalloc ranges inside a page first, and
   increase subpage::writers according to all the locked range, we can
   ensure the full page won't be unlocked until all delalloc ranges are
   submitted and finished.

   By this we can allow sector-perfect compressed write.

   And for exsting delalloc range handlers like cow_file_range(), we need
   to keep a bitmap to record which range is submitted asynchronized,
   then unlock the remaining ranges manually.

- The bio submission and delalloc range
   Above change makes me thinking, why we don't submit bio when we run
   delalloc range?

   When running delalloc range, we will need to find and lock all pages
   in the range any way.
   Why we can't do the submission at the same time, as we already need to
   do the page operations anyway.

   I guess there is some performance related reason, but I'm not sure.
   Any feedback on this part will be appreciated.

Despite the better solution, I'll still push the current full page
compression, not only as a quick stop gap, but all these preparation
patches will benefit later sector-perfect compression.

Thanks,
Qu
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help