Thread: 13 messages, 8 authors, 2021-09-06

Re: [GIT PULL] xfs: new code for 5.15

From: Dennis Zhou <dennis@kernel.org>
Date: 2021-09-03 18:41:03
Also in: linux-fsdevel, lkml

Hello,

On Thu, Sep 02, 2021 at 08:47:42AM -0700, Linus Torvalds wrote:
> On Tue, Aug 31, 2021 at 2:18 PM Darrick J. Wong wrote:
> >
> > As for new features: we now batch inode inactivations in percpu
> > background threads, which sharply decreases frontend thread wait time
> > when performing file deletions and should improve overall directory tree
> > deletion times.
>
> So no complaints on this one, but I do have a reaction: we have a lot
> of these random CPU hotplug events, and XFS now added another one.
>
> I don't see that as a problem, but just the _randomness_ of these
> callbacks makes me go "hmm". And that "enum cpuhp_state" thing isn't
> exactly a thing of beauty, and just makes me think there's something
> nasty going on.
>
> For the new xfs usage, I really get the feeling that it's not that XFS
> actually cares about the CPU states, but that this is literally tied
> to just having percpu state allocated and active, and that maybe it
> would be sensible to have something more specific to that kind of use.
>
> We have other things that are very similar in nature - like the page
> allocator percpu caches etc, which for very similar reasons want cpu
> dead/online notification.
>
> I'm only throwing this out as a reaction to this - I'm not sure
> another interface would be good or worthwhile, but that "enum
> cpuhp_state" is ugly enough that I thought I'd rope in Thomas for CPU
> hotplug, and the percpu memory allocation people for comments.
>
> IOW, just _maybe_ we would want to have some kind of callback model
> for "percpu_alloc()" and it being explicitly about allocations
> becoming available or going away, rather than about CPU state.
>
> Comments?

I think there are 2 pieces here from percpu's side:
A) Onlining and offlining state related to a percpu alloc.
B) Freeing backing memory of an allocation wrt hot plug.

An RFC was sent out for B) in [1] and you need A) for B).
I can see percpu having a callback model for basic allocations that are
independent, but for anything more complex, that subsystem would need to
register with hotplug anyway. It appears percpu_counter already has hot
plug support. percpu_refcount could be extended as well, but more
complex initialization like the runqueues and slab related allocations
would require work. In short, yes I think A) is doable/reasonable.
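
To make A) concrete, here is a minimal userspace sketch of what a per-allocation online/offline callback model might look like. It is not kernel code and not an existing percpu API: `pcpu_obj`, `pcpu_register()`, `sim_cpu_online()`, `sim_cpu_offline()` and the counter callbacks are all hypothetical names, and `NR_CPUS` is a tiny fixed value chosen for illustration. The counter folds a dead CPU's count into a global total, which is roughly the behaviour one would expect from a hotplug-aware counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* assumption: small fixed CPU count for the sketch */

struct pcpu_obj;
/* Hypothetical callback type: invoked per allocation, per CPU transition. */
typedef void (*pcpu_cb)(struct pcpu_obj *obj, int cpu);

/* Hypothetical stand-in for one percpu allocation plus its callbacks. */
struct pcpu_obj {
    long slot[NR_CPUS];   /* one value per possible CPU */
    long global;          /* counts folded in from offlined CPUs */
    pcpu_cb online;
    pcpu_cb offline;
    struct pcpu_obj *next;
};

static struct pcpu_obj *pcpu_list;     /* all registered allocations */
static bool cpu_is_online[NR_CPUS];    /* simulated online mask */

/* Register an allocation and bring it up on every CPU already online. */
void pcpu_register(struct pcpu_obj *obj, pcpu_cb online, pcpu_cb offline)
{
    obj->online = online;
    obj->offline = offline;
    obj->next = pcpu_list;
    pcpu_list = obj;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_is_online[cpu] && online)
            online(obj, cpu);
}

/* Simulated hotplug: mark the CPU online, then notify each allocation. */
void sim_cpu_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_is_online[cpu] = true;
    for (struct pcpu_obj *o = pcpu_list; o; o = o->next)
        if (o->online)
            o->online(o, cpu);
}

/* Simulated hotplug: mark the CPU offline, then notify each allocation. */
void sim_cpu_offline(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_is_online[cpu] = false;
    for (struct pcpu_obj *o = pcpu_list; o; o = o->next)
        if (o->offline)
            o->offline(o, cpu);
}

/* Counter callbacks: reset the slot on online, fold it on offline. */
void counter_online(struct pcpu_obj *obj, int cpu)
{
    obj->slot[cpu] = 0;
}

void counter_offline(struct pcpu_obj *obj, int cpu)
{
    obj->global += obj->slot[cpu];
    obj->slot[cpu] = 0;
}

/* Sum the folded total plus the slots of CPUs that are still online. */
long counter_sum(const struct pcpu_obj *obj)
{
    long sum = obj->global;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_is_online[cpu])
            sum += obj->slot[cpu];
    return sum;
}
```

In this sketch the backing storage is never released, so it only models A). An independent allocation like this counter needs nothing beyond the two callbacks, whereas anything with cross-allocation ordering requirements would presumably still have to register with hotplug directly.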

Freeing the backing memory, i.e. B), seems trickier. We would have to
figure out a clean way to handle onlining/offlining racing with new
percpu allocations (adding or removing pages for the corresponding cpu's
chunk). To support A), init and onlining/offlining can be separate
phases, but for B) init/freeing would have to be rolled into
onlining/offlining.

Without freeing, it's not incorrect for for_each_online_cpu() to read a dead
cpu's percpu values, but with freeing it is.
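
The difference between the two policies can be illustrated with a small userspace sketch. Everything here is hypothetical and only a model: `pcpu_long` stands in for one percpu variable, a `NULL` slot stands in for backing pages that were freed at offline time, and `offline_keep()` / `offline_free()` are made-up names for the two policies. In the kernel there would be no `NULL` check to fall back on, which is the point of the concern above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4 /* assumption: small fixed CPU count for the sketch */

/* Hypothetical model of one percpu variable: one heap slot per CPU.
 * A NULL slot models backing memory that has been freed. */
struct pcpu_long {
    long *slot[NR_CPUS];
};

/* Release every slot that is still backed. */
void pcpu_long_destroy(struct pcpu_long *p)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        free(p->slot[cpu]);
        p->slot[cpu] = NULL;
    }
}

/* Allocate a zeroed slot for every possible CPU. Returns 0 on success. */
int pcpu_long_init(struct pcpu_long *p)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        p->slot[cpu] = NULL;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        p->slot[cpu] = calloc(1, sizeof(long));
        if (!p->slot[cpu]) {
            pcpu_long_destroy(p);
            return -1;
        }
    }
    return 0;
}

/* Policy without freeing: the dead CPU's slot stays allocated. */
void offline_keep(struct pcpu_long *p, int cpu)
{
    (void)p;
    (void)cpu;
}

/* Policy with freeing: the dead CPU's backing memory goes away. */
void offline_free(struct pcpu_long *p, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    free(p->slot[cpu]);
    p->slot[cpu] = NULL;
}

/* Read one CPU's value; returns false if its backing memory is gone. */
bool pcpu_long_read(const struct pcpu_long *p, int cpu, long *out)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (!p->slot[cpu])
        return false;
    *out = *p->slot[cpu];
    return true;
}
```

After `offline_keep()`, a reader that races with the offline still gets a valid (if stale) value. After `offline_free()`, the same read has nothing to dereference, so readers and the offline path would need to be synchronized.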

I guess to summarize, A) seems like it might be a good idea with
init/destruction happening at allocation/freeing times. I'm a little
skeptical of B) in terms of complexity. If y'all think it's a good idea
I can look into it again.

[1] https://lore.kernel.org/lkml/20210601065147.53735-1-bharata@linux.ibm.com/

Thanks,
Dennis

> > Lastly, with this release, two new features have graduated to supported
> > status: inode btree counters (for faster mounts), and support for dates
> > beyond Y2038.
>
> Oh, I had thought Y2038 was already a non-issue for xfs. Silly me.
>
>               Linus