Re: [PATCH v2] btrfs: add goto in btrfs_defrag_file for error handling
From: David Sterba <hidden>
Date: 2021-05-17 13:03:10
On Wed, May 05, 2021 at 03:40:52PM -0700, Boris Burkov wrote:
> On Wed, May 05, 2021 at 09:26:28AM +0800, Tian Tao wrote:
> > ret is assigned -EAGAIN at line 1455 and then reassigned defrag_count at line 1547 after exiting the while loop. This causes the btrfs_defrag_file function to not return the correct value in the event of an error. This patch fixes this issue.
>
> This looks like a correct fix, in that it locally improves what it claims to improve. However, I have some questions about the style and consistency of the function as a whole as a result. I think Dave had a similar comment in his very first reply on v1.
>
> The loop has the following early exit points:
>
>   - fs unmounted
>   - cancellation
>   - swapfile/error in cluster_pages_for_defrag
>   - newer_off == (u64)-1
>   - error (ENOMEM or ENOENT) in find_new_extents
>
> To me, it is confusing that of all these, only cancellation goes to a label called "error". I would expect at least the swapfile/cluster case to also jump to error. find_new_extents is interesting, because ENOENT could be semantically special as an error and warrant a break rather than a goto error, so we ought to figure that out correctly too.
>
> If there is some good reason that only cancellation should receive this treatment, and that some early exit cases should break or goto out_ra, then I would at least name the new label "cancel" and write a comment or a note in the git commit explaining the difference.
The naming convention of the exit labels describes what happens at the label point and not the reason, as a label can be targeted from various branches that all need the same cleanup. The naming is not consistent everywhere, but at least that's the idea.
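To illustrate the convention, here is a minimal user-space sketch. It is not btrfs code: demo_ctx, demo_process and the fail_step parameter are made up for the example. The labels say what cleanup runs there (out_unlock, out_free), so a cancellation and an I/O error can share the same target while keeping their own return value.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical state for the example; not a btrfs structure. */
struct demo_ctx {
	int locked;	/* 1 while the pretend lock is held */
	int frees;	/* number of buffers released so far */
};

/*
 * fail_step selects which (made-up) failure to simulate:
 * 0 = success, 1 = failure before the lock is taken,
 * 2 = cancellation, 3 = I/O error.
 */
static int demo_process(struct demo_ctx *ctx, int fail_step)
{
	char *buf;
	int ret = 0;

	buf = malloc(64);
	if (!buf)
		return -ENOMEM;

	if (fail_step == 1) {
		/* Lock not taken yet, only the buffer needs cleanup. */
		ret = -EINVAL;
		goto out_free;
	}

	ctx->locked = 1;

	if (fail_step == 2) {
		/* Cancelled: different reason, same cleanup as below. */
		ret = -EAGAIN;
		goto out_unlock;
	}
	if (fail_step == 3) {
		ret = -EIO;
		goto out_unlock;
	}

	/* Success falls through the same cleanup with ret == 0. */
out_unlock:
	ctx->locked = 0;
out_free:
	free(buf);
	ctx->frees++;
	return ret;
}
```

Naming the label "cancel" or "error" would tie it to one caller, which stops being accurate as soon as a second branch jumps to it.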
> Thinking out loud, I suspect a way to really fix this messy function is to do something like lift the contents of the while loop into a helper function which returns the next incremental defrag_count, an error, or 0 for done.
Reading it again with the above in mind, there are two types of errors that end the defrag:

- some defrag work has been done but not the entire file was processed
- the rest, e.g. some hard errors

In the first case the optional flushing should still happen. In both cases the incompat bits should be set -- this is now missing.

I'm not sure if the whole while loop could be factored out, there's a lot of shared state with the function. The different kinds of errors would have to be reflected too but that's doable.

As this patch fixes the return value of canceled defrag, I'd take it as is and address the other issues separately.
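For the follow-up, a rough user-space sketch of how the two exit classes could be told apart if the loop body were lifted into a helper. Everything here is hypothetical (demo_defrag, demo_defrag_one_cluster, demo_defrag_file, the 16-pages-per-cluster figure), and it assumes -EAGAIN is the only "partial work" case and that the incompat bits are set unconditionally, which simplifies what the real function does.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state standing in for what the real loop shares. */
struct demo_defrag {
	int clusters_left;	/* work remaining */
	int cancel_at;		/* cancel when clusters_left equals this, -1 = never */
	int hard_fail_at;	/* hard error when clusters_left equals this, -1 = never */
	bool flushed;		/* set when the optional flush ran */
	bool incompat_set;	/* set when the incompat bits were updated */
};

/* One loop iteration: >0 pages defragged, 0 when done, <0 on error. */
static int demo_defrag_one_cluster(struct demo_defrag *d)
{
	if (d->clusters_left == 0)
		return 0;
	if (d->clusters_left == d->cancel_at)
		return -EAGAIN;
	if (d->clusters_left == d->hard_fail_at)
		return -ENOMEM;
	d->clusters_left--;
	return 16;	/* pretend each cluster covers 16 pages */
}

static int demo_defrag_file(struct demo_defrag *d)
{
	int defrag_count = 0;
	int ret;

	while ((ret = demo_defrag_one_cluster(d)) > 0)
		defrag_count += ret;

	/*
	 * Assumption: -EAGAIN means "stopped early, partial work done".
	 * Flush in that case and on normal completion, not on hard errors.
	 */
	if (defrag_count > 0 && (ret == 0 || ret == -EAGAIN))
		d->flushed = true;

	/* Both kinds of exit still update the incompat bits. */
	d->incompat_set = true;

	/* Keep the error instead of overwriting it with the count. */
	return ret < 0 ? ret : defrag_count;
}
```

The point of the helper's return convention is that the caller has a single place to classify the exit, instead of each branch in the loop picking its own label.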