Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
From: Yury Norov <yury.norov@gmail.com>
Date: 2021-12-02 00:31:55
Also in:
kvm, linux-alpha, linux-crypto, linux-mips, linux-perf-users, linux-riscv, linux-s390, linuxppc-dev, lkml
On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov [off-list ref] napisał/a:quoted
On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:quoted
On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:quoted
In many cases people use bitmap_weight()-based functions like this: if (num_present_cpus() > 1) do_something(); This may take considerable amount of time on many-cpus machines because num_present_cpus() will traverse every word of underlying cpumask unconditionally. We can significantly improve on it for many real cases if stop traversing the mask as soon as we count present cpus to any number greater than 1: if (num_present_cpus_gt(1)) do_something(); To implement this idea, the series adds bitmap_weight_{eq,gt,le} functions together with corresponding wrappers in cpumask and nodemask.Having slept on it I have more structured thoughts: First, I like substituting bitmap_empty/full where possible - I think the change stands on its own, so could be split and sent as is.Ok, I can do it.quoted
I don't like the proposed API very much. One problem is that it hides the comparison operator and makes call sites less readable: bitmap_weight(...) > N becomes: bitmap_weight_gt(..., N) and: bitmap_weight(...) <= N becomes: bitmap_weight_lt(..., N+1) or: !bitmap_weight_gt(..., N) I'd rather see something resembling memcmp() API that's known enough to be easier to grasp. For above examples: bitmap_weight_cmp(..., N) > 0 bitmap_weight_cmp(..., N) <= 0 ...bitmap_weight_cmp() cannot be efficient. Consider this example: bitmap_weight_lt(1000 0000 0000 0000, 1) == false ^ stop here bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0 ^ stop here I agree that '_gt' is less verbose than '>', but the advantage of '_gt' over '>' is proportional to length of bitmap, and it means that this API should exist.Thank you for the example. Indeed, for less-than to be efficient here you would need to replace bitmap_weight_cmp(..., N) < 0 with bitmap_weight_cmp(..., N-1) <= 0
Indeed, thanks for pointing to it.
It would still be more readable, I think.
To be honest, I'm not sure that
bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
bitmap_weight(...) < N
comparing to
bitmap_weight_lt(..., N)
I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.
I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections.
Thanks,
Yury