Thread: 50 messages, 11 authors, 2021-06-09

Re: [PATCH v4 00/15] Add futex2 syscalls

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: 2021-06-08 14:58:01
Also in: linux-api, lkml

On 2021-06-08 16:23:45 [+0200], Peter Zijlstra wrote:
> There's more futex users than glibc, and some of them are really hurting
> because of the NUMA issue. Oracle used to (I've no idea what they do or
> do not do these days) use sysvsem because the futex hash table was a
> massive bottleneck for them.
>
> And as Nick said, other vendors are having the same problems.

I just wanted to give a brief summary of recent events. The implementation
tglx did, with the cookie resulting in a quick lookup, did not have any
downsides except that the user-API had to change, which glibc couldn't
do. So if we are back to square one, why not start with that?

> And if you don't extend the futex to store the nid you put the waiter in
> (see all the problems above) you will have to do wakeups on all nodes,
> which is both slower than it is today, and scales possibly even worse.
>
> The whole numa-aware qspinlock saga is in part because of futex.

Sure.

> That said; if we're going to do the whole futex-vector thing, we really
> do need a new interface, because the futex multiplex monster is about to
> crumble (see the fun wrt timeouts for example).

This might have been a series of unfortunate events leading to this. The
sad part is that glibc has a comment that the kernel does not support
this and nobody bothered to change it (until recently).

> And if we're going to do a new interface, we ought to make one that can
> solve all these problems. Now, ideally glibc will bring forth some
> opinions, but if they don't want to play, we'll go back to the good old
> days of non-standard locking libraries.. we're halfway there already due
> to glibc not wanting to break with POSIX where we know POSIX was just
> dead wrong broken.
>
> See: https://github.com/dvhart/librtpi

I'm aware of that, I hacked on it, too :) This was the unfortunate
result of a ~8y old bug which was not fixed; instead, part of the code
was rewritten and a bit-spinlock was added in user-land. You may
remember the discussion regarding spins in userland…
That said, REQUEUE_PI is no longer used by glibc.

Sebastian