Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible

[PATCH v3 1/2] capability: add cap_isidentical · Mateusz Guzik <hidden> · 2023-01-25
[PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-01-25
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Christian Brauner <brauner@kernel.org> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Kees Cook <hidden> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Alexander Potapenko <glider@google.com> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
RE: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · David Laight <hidden> · 2023-03-05
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Yury Norov <yury.norov@gmail.com> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Yury Norov <yury.norov@gmail.com> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · David Vernet <void@manifault.com> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Yury Norov <yury.norov@gmail.com> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Yury Norov <yury.norov@gmail.com> · 2023-03-06
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Mateusz Guzik <hidden> · 2023-03-04
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Kees Cook <hidden> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Eric Biggers <ebiggers@kernel.org> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Christian Brauner <brauner@kernel.org> · 2023-03-03
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Al Viro <viro@zeniv.linux.org.uk> · 2023-03-02
Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible · Christian Brauner <brauner@kernel.org> · 2023-03-03
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Casey Schaufler <casey@schaufler-ca.com> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Mateusz Guzik <hidden> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Mateusz Guzik <hidden> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Linus Torvalds <torvalds@linux-foundation.org> · 2023-03-01
Re: [PATCH v3 1/2] capability: add cap_isidentical · "Serge E. Hallyn" <serge@hallyn.com> · 2023-02-28
Re: [PATCH v3 1/2] capability: add cap_isidentical · Casey Schaufler <casey@schaufler-ca.com> · 2023-02-28

From: Yury Norov <yury.norov@gmail.com>
Date: 2023-03-04 05:51:48
Also in: linux-fsdevel, lkml

On Fri, Mar 03, 2023 at 07:42:36PM -0800, Linus Torvalds wrote:

ing: quoted-printable
Status: O
Content-Length: 1593
Lines: 41

On Fri, Mar 3, 2023 at 7:25 PM Yury Norov [off-list ref] wrote:

quoted

Did you enable CONFIG_FORCE_NR_CPUS? If you pick it, the kernel will
bind nr_cpu_ids to NR_CPUS at compile time, and the memset() call
should disappear.

I do not believe CONFIG_FORCE_NR_CPUS makes any sense, and I think I
told you so at the time.

At that time you was OK with CONFIG_FORCE_NR_CPUS, only suggested to
hide it behind CONFIG_EXPERT:

https://lore.kernel.org/all/Yzx4fSmmr8bh6gdl@yury-laptop/T/#m92d405527636154c3b2000e0105379170d988315 (local)

This all used to just work *without* some kind of config thing, First
removing the automatic "do the right thing", and then adding a config
option to "force" doing the right thing seems more than a bit silly to
me.

I think CONFIG_FORCE_NR_CPUS should go away, and - once more - become
just the "is the cpumask small enough to be just allocated directly"
thing.

This all was just broken. For example, as I mentioned in commit message,
cpumask_full() was broken. I know because I wrote a test. There were no
a single user for the function, and nobody complained. Now we have one
in BPF code. So if we simply revert the aa47a7c215e, it will hurt real
users.

The pre-CONFIG_FORCE_NR_CPUS cpumask machinery would work only if you
set NR_CPUS to the number that matches to the actual number of CPUs as
detected at boot time.

In your example, if you have NR_CPUS == 64, and for some reason disable
hyper threading, nr_cpumask_bits will be set to 64 at compile time, but
nr_cpu_ids will be set to 32 at boot time, assuming
CONFIG_CPUMASK_OFFSTACK is disabled.

And the following code will be broken:

cpumask_t m1, m2;

cpumask_setall(m1); // m1 is ffff ffff ffff ffff because it uses
                    // compile-time optimized nr_cpumask_bits

for_each_cpu(cpu, m1) // 32 iterations because it relied on nr_cpu_ids
        cpumask_set_cpu(cpu, m2); // m2 is ffff ffff XXXX XXXX

BUG_ON(!cpumask_equal(m1, m2)); // Bug because it will test all 64 bits

Today with CONFIG_FORCE_NR_CPUS disabled, kernel consistently relies
on boot-time defined nr_cpu_ids in functions like cpumask_equal()
with the cost of disabled runtime optimizations.

If CONFIG_FORCE_NR_CPUS is enabled, it wires nr_cpu_ids to NR_CPUS
at compile time, which allows compile-time optimization.

If CONFIG_FORCE_NR_CPUS is enabled, but actual number of CPUs doesn't
match to NR_CPUS, the kernel throws a warning at boot time - better
than nothing.

I'm not happy bothering people with a new config parameter in such a
simple case. I just don't know how to fix it better. Is there a safe
way to teach compiler to optimize against NR_CPUS other than telling
it explicitly?

Of course, the problem for others remain that distros will do that
CONFIG_CPUMASK_OFFSTACK thing, and then things will suck regardless.

I was *so* happy with our clever "you can have large cpumasks, and
we'll just allocate them off the stack" long long ago, because it
meant that we could have one single source tree where this was all
cleanly abstracted away, and we even had nice types and type safety
for it all.

That meant that we could support all the fancy SGI machines with
several thousand cores, and it all "JustWorked(tm)", and didn't make
the normal case any worse.

I didn't expect distros to then go "ooh, we want that too", and enable
it all by default, and make all our clever "you only see this
indirection if you need it" go away, and now the normal case is the
*bad* case, unless you just build your own kernel and pick sane
defaults.

Oh well.

From distro people's perspective, 'one size fits all' is the best
approach. It's hard to blame them.

Thanks,
Yury

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help