Thread (23 messages), 2 authors, 2016-10-26

Re: [PATCH v5 4/8] md/r5cache: write part of r5cache

From: NeilBrown <hidden>
Date: 2016-10-19 00:53:21

On Fri, Oct 14 2016, Song Liu wrote:
quoted
On Oct 13, 2016, at 11:53 PM, NeilBrown [off-list ref] wrote:

On Thu, Oct 13 2016, Song Liu wrote:
quoted
For RMW, the code allocates an extra page for each data block
being updated.  This is stored in r5dev->page and the old data
is read into it.  Then the prexor calculation subtracts ->page
from the parity block, and the reconstruct calculation adds the
->orig_page data back into the parity block.
What happens if the alloc_page() fails?
That will be tough, but solvable. We can
    read old data to page
    do prexor 
    read new data from journal device to page
    do xor 
    do the rest of the work. 
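
The single-page sequence above can be sketched in user space as follows.
This is only an illustration of the XOR arithmetic, not the raid5 code:
BLOCK_SIZE, xor_into() and rmw_one_page() are hypothetical names, and the
memcpy() calls stand in for the reads from the array and from the journal
device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16	/* tiny stand-in for a page (assumption) */

/* XOR src into dst, byte by byte. */
static void xor_into(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] ^= src[i];
}

/*
 * Update parity for one changed data block with a single scratch buffer.
 * old_on_disk stands in for the data read from the array, new_in_journal
 * for the data read back from the journal device.
 */
static void rmw_one_page(uint8_t *parity, const uint8_t *old_on_disk,
			 const uint8_t *new_in_journal, uint8_t *scratch)
{
	assert(parity && old_on_disk && new_in_journal && scratch);

	memcpy(scratch, old_on_disk, BLOCK_SIZE);	/* read old data to page */
	xor_into(parity, scratch, BLOCK_SIZE);		/* prexor: remove old data */
	memcpy(scratch, new_in_journal, BLOCK_SIZE);	/* read new data from journal */
	xor_into(parity, scratch, BLOCK_SIZE);		/* xor: add new data */
}
```

Because the scratch buffer is reused for both reads, no second page is
needed; the cost is the extra read from the journal device.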

Or we can force the code to rcw, which does not need an extra page.
But rcw does not always work in degraded mode. So, this is a good
reason not to do write-back in degraded mode...
Prohibiting write-back in degraded mode would not be enough to ensure
that you can always use rcw.  The array can become degraded after you
make the decision to use caching, and before you need to read old data
for rmw.

I would suggest a small (2 entry?) mempool where each entry in the
mempool holds enough pages to complete an rmw.  Only use the mempool if
an alloc_page() fails.
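
A rough user-space sketch of that fallback, to make the idea concrete.
It does not use the kernel mempool API; every name here (rmw_reserve,
rmw_set, rmw_get(), rmw_put(), PAGES_PER_RMW) is made up for
illustration, and the number of pages per rmw is an arbitrary
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_ENTRIES	2	/* the "2 entry?" suggested above */
#define PAGES_PER_RMW	4	/* hypothetical: pages one rmw needs */
#define PAGE_BYTES	4096

typedef void *(*alloc_fn)(size_t);	/* must be free()-compatible */

struct rmw_reserve {
	unsigned char page[RESERVE_ENTRIES][PAGES_PER_RMW][PAGE_BYTES];
	bool busy[RESERVE_ENTRIES];
};

struct rmw_set {
	void *page[PAGES_PER_RMW];
	int reserve_slot;	/* -1 if the pages came from alloc() */
};

/* Stand-in allocator that always fails, to simulate memory pressure. */
static void *alloc_always_fails(size_t n)
{
	(void)n;
	return NULL;
}

static void rmw_reserve_init(struct rmw_reserve *r)
{
	for (int e = 0; e < RESERVE_ENTRIES; e++)
		r->busy[e] = false;
}

/* Try the normal allocator first; only touch the reserve if it fails. */
static bool rmw_get(struct rmw_reserve *r, alloc_fn alloc, struct rmw_set *set)
{
	int got;

	set->reserve_slot = -1;
	for (got = 0; got < PAGES_PER_RMW; got++) {
		set->page[got] = alloc(PAGE_BYTES);
		if (!set->page[got])
			break;
	}
	if (got == PAGES_PER_RMW)
		return true;

	while (got--)			/* give back the partial allocation */
		free(set->page[got]);

	for (int e = 0; e < RESERVE_ENTRIES; e++) {
		if (r->busy[e])
			continue;
		r->busy[e] = true;
		set->reserve_slot = e;
		for (int p = 0; p < PAGES_PER_RMW; p++)
			set->page[p] = r->page[e][p];
		return true;
	}
	return false;			/* caller must wait for an rmw_put() */
}

static void rmw_put(struct rmw_reserve *r, struct rmw_set *set)
{
	assert(set->reserve_slot < RESERVE_ENTRIES);
	if (set->reserve_slot >= 0) {
		r->busy[set->reserve_slot] = false;
		return;
	}
	for (int p = 0; p < PAGES_PER_RMW; p++)
		free(set->page[p]);
}
```

Each reserve entry holds a complete set of pages, so whoever gets an
entry can always finish its rmw and release it; handing out pages one at
a time could leave several stripes each holding part of what they need.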
quoted
quoted
+
+void r5c_handle_cached_data_endio(struct r5conf *conf,
+	  struct stripe_head *sh, int disks, struct bio_list *return_bi)
+{
+	int i;
+
+	for (i = sh->disks; i--; ) {
+		if (test_bit(R5_InCache, &sh->dev[i].flags) &&
+		    sh->dev[i].written) {
Is it possible for R5_InCache to be set, but 'written' to be NULL ???
Yes, it is possible. A stripe may go through "write data to journal, return IO" multiple
times before parity calculation. When it comes here the second time, a dev written the
first time will have R5_InCache set, but its 'written' will be NULL.
OK, that makes sense.
So is it possible for 'written' to be set, but R5_InCache to be clear?
i.e. do we really need to test R5_InCache here?
quoted
quoted
static void r5l_io_run_stripes(struct r5l_io_unit *io)
@@ -483,7 +566,8 @@ static int r5l_log_stripe(struct r5l_log *log, struct stripe_head *sh,
	io = log->current_io;

	for (i = 0; i < sh->disks; i++) {
-		if (!test_bit(R5_Wantwrite, &sh->dev[i].flags))
+		if (!test_bit(R5_Wantwrite, &sh->dev[i].flags) &&
+		    !test_bit(R5_Wantcache, &sh->dev[i].flags))
			continue;
If you changed R5_Wantcache to R5_Wantjournal, and always set it on blocks
that were destined for the journal, then this would just be

		if (!test_bit(R5_Wantjournal, &sh->dev[i].flags))

which would make lots of sense...  Just a thought.
We set R5_Wantwrite in multiple places. If we simplify the code here, we will need to make
those places aware of the journal. I guess that is not ideal either?
Maybe...
We have so many state flags that I like to be very cautious about adding
more, and to make sure they have a very well defined meaning that
doesn't overlap with other flags too much.
The above code suggests that Wantwrite and Wantcache overlap to some
extent.

Could we discard Wantcache and just use Wantwrite combined with InCache?
Wantwrite means that the block needed to be written to the RAID.
If InCache isn't set, it also needs to be written to the cache (if the
cache is being used).
Then the above code would be
   if (!test_bit(R5_Wantwrite, &sh->dev[i].flags) ||
       test_bit(R5_InCache, &sh->dev[i].flags))
      continue;

which means "if we don't want to write this, or if it is already in the
cache, then nothing to do here".

Maybe.
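
To spell out what the Wantwrite/InCache test would do for each
combination, here is a small stand-alone sketch. DEV_WANTWRITE and
DEV_INCACHE are hypothetical plain bit masks standing in for
R5_Wantwrite and R5_InCache, and needs_journal_write() is an
illustrative helper, not a function in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-device flag bits. */
enum {
	DEV_WANTWRITE	= 1u << 0,
	DEV_INCACHE	= 1u << 1,
};

/*
 * Returns true when the block should be logged to the journal, i.e. when
 * the loop would NOT hit "continue" under the suggested test.
 */
static bool needs_journal_write(unsigned int flags)
{
	/* Only the two known bits are meaningful in this sketch. */
	assert((flags & ~(unsigned int)(DEV_WANTWRITE | DEV_INCACHE)) == 0);

	if (!(flags & DEV_WANTWRITE) || (flags & DEV_INCACHE))
		return false;	/* nothing to write, or already in the cache */
	return true;
}
```

Only the "Wantwrite set, InCache clear" combination is logged; the other
three are skipped.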
 
quoted
quoted
}

-static void r5l_wake_reclaim(struct r5l_log *log, sector_t space)
+void r5l_wake_reclaim(struct r5l_log *log, sector_t space)
Why are you removing the 'static' here?  You don't call it from any
other file.
In the next patch, it is called in raid.c.
So remove 'static' in the next patch please.


Thanks,
NeilBrown
