Re: [GIT PULL] xfs: new code for 5.15
From: Dennis Zhou <dennis@kernel.org>
Date: 2021-09-03 18:41:03
Also in: linux-fsdevel, lkml
Hello,

On Thu, Sep 02, 2021 at 08:47:42AM -0700, Linus Torvalds wrote:
> On Tue, Aug 31, 2021 at 2:18 PM Darrick J. Wong [off-list ref] wrote:
> > As for new features: we now batch inode inactivations in percpu background threads, which sharply decreases frontend thread wait time when performing file deletions and should improve overall directory tree deletion times.
>
> So no complaints on this one, but I do have a reaction: we have a lot of these random CPU hotplug events, and XFS now added another one.
>
> I don't see that as a problem, but just the _randomness_ of these callbacks makes me go "hmm". And that "enum cpuhp_state" thing isn't exactly a thing of beauty, and just makes me think there's something nasty going on.
>
> For the new xfs usage, I really get the feeling that it's not that XFS actually cares about the CPU states, but that this is literally tied to just having percpu state allocated and active, and that maybe it would be sensible to have something more specific to that kind of use.
>
> We have other things that are very similar in nature - like the page allocator percpu caches etc, which for very similar reasons want cpu dead/online notification.
>
> I'm only throwing this out as a reaction to this - I'm not sure another interface would be good or worthwhile, but that "enum cpuhp_state" is ugly enough that I thought I'd rope in Thomas for CPU hotplug, and the percpu memory allocation people for comments.
>
> IOW, just _maybe_ we would want to have some kind of callback model for "percpu_alloc()" and it being explicitly about allocations becoming available or going away, rather than about CPU state.
>
> Comments?

I think there are 2 pieces here from percpu's side:

A) Onlining and offlining state related to a percpu alloc.
B) Freeing backing memory of an allocation wrt hot plug.

An RFC was sent out for B) in [1] and you need A) for B).

I can see percpu having a callback model for basic allocations that are independent, but for anything more complex, that subsystem would need to register with hotplug anyway. It appears percpu_counter already has hot plug support. percpu_refcount could be extended as well, but more complex initialization like the runqueues and slab related allocations would require work. In short, yes I think A) is doable/reasonable.

Freeing the backing memory for A) seems trickier. We would have to figure out a clean way to handle onlining/offlining racing with new percpu allocations (adding or removing pages for the corresponding cpu's chunk). To support A), init and onlining/offlining can be separate phases, but for B) init/freeing would have to be rolled into onlining/offlining.

Without freeing, it's not incorrect for for_each_online_cpu() to read a dead cpu's percpu values, but with freeing it is.

I guess to summarize, A) seems like it might be a good idea with init/destruction happening at allocation/freeing times. I'm a little skeptical of B) in terms of complexity. If y'all think it's a good idea I can look into it again.

[1] https://lore.kernel.org/lkml/20210601065147.53735-1-bharata@linux.ibm.com/

Thanks,
Dennis
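
To make A) a bit more concrete, here is a rough userspace model of what per-allocation online/offline callbacks could look like for a simple counter. It is only a sketch: every name in it (model_counter, model_cpu_online(), model_cpu_offline(), MODEL_NR_CPUS, and so on) is made up for illustration and none of it is an existing kernel interface. The offline callback folds the departing CPU's value into a shared total, so a reader that only walks online CPUs still sees the right sum.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* hypothetical fixed CPU count for the model */

struct model_counter;
typedef void (*model_pcpu_cb)(struct model_counter *c, unsigned int cpu);

struct model_counter {
	long global;                /* values folded in from offlined CPUs */
	long percpu[MODEL_NR_CPUS]; /* stand-in for a percpu allocation */
	bool online[MODEL_NR_CPUS];
	model_pcpu_cb on_online;    /* per-allocation callbacks, as in A) */
	model_pcpu_cb on_offline;
};

/* Online callback: start the CPU's slot from a clean state. */
static void zero_on_online(struct model_counter *c, unsigned int cpu)
{
	c->percpu[cpu] = 0;
}

/* Offline callback: fold the slot into the shared total, then clear it. */
static void fold_on_offline(struct model_counter *c, unsigned int cpu)
{
	c->global += c->percpu[cpu];
	c->percpu[cpu] = 0;
}

/* Init happens at allocation time, separately from onlining. */
static void model_counter_init(struct model_counter *c)
{
	c->global = 0;
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		c->percpu[cpu] = 0;
		c->online[cpu] = false;
	}
	c->on_online = zero_on_online;
	c->on_offline = fold_on_offline;
}

static void model_cpu_online(struct model_counter *c, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (c->online[cpu])
		return;
	c->on_online(c, cpu);
	c->online[cpu] = true;
}

static void model_cpu_offline(struct model_counter *c, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (!c->online[cpu])
		return;
	c->online[cpu] = false;
	c->on_offline(c, cpu);
}

/* Add n to the slot of an online CPU. */
static void model_counter_add(struct model_counter *c, unsigned int cpu, long n)
{
	assert(cpu < MODEL_NR_CPUS && c->online[cpu]);
	c->percpu[cpu] += n;
}

/* Sum the shared total plus the slots of online CPUs only. */
static long model_counter_sum(const struct model_counter *c)
{
	long sum = c->global;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (c->online[cpu])
			sum += c->percpu[cpu];
	return sum;
}
```

In this model the backing storage for an offlined CPU is never freed, which is the A) case: init and onlining/offlining stay separate phases. The model is single-threaded and ignores the races mentioned above, so it says nothing about how hard B) would be.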

> > Lastly, with this release, two new features have graduated to supported status: inode btree counters (for faster mounts), and support for dates beyond Y2038.
>
> Oh, I had thought Y2038 was already a non-issue for xfs. Silly me.
>
> Linus