Re: [PATCH v3] random: use expired per-cpu timer rather than wq for mixing fast pool

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: 2022-09-28 12:06:52
Also in: lkml, stable

On 2022-09-27 12:42:33 [+0200], Jason A. Donenfeld wrote:
…

> This is an ordinary pattern done all over the kernel. However, Sherry
> noticed a 10% performance regression in qperf TCP over a 40gbps
> InfiniBand card. Quoting her message:

quoted

MT27500 Family [ConnectX-3] cards:
Infiniband device 'mlx4_0' port 1 status:

…

Looking at the mlx4 driver, it does not appear to use any NAPI handling
in its interrupt handler, so it _might_ be handling more than 1k
interrupts a second. I would still like Sherry to confirm that.

Jason, from random's point of view: deferring until 1k interrupts + 1sec
delay is not desired due to low entropy, right?
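
For readers following along, the deferral rule being asked about can be
modelled in userspace roughly as below. This is only a sketch, not the
code in drivers/char/random.c: every model_* name, the tick rate of 1000
and the reading of "1k interrupts + 1sec" as "whichever comes first" are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_HZ            1000UL /* assumed ticks per second (hypothetical) */
#define MODEL_MIX_THRESHOLD 1024U  /* the "1k interrupts" from the discussion */

/* Hypothetical stand-in for the per-CPU fast pool bookkeeping. */
struct model_fast_pool {
	unsigned long last;  /* tick at which the pool was last mixed */
	unsigned int count;  /* interrupts accumulated since then */
};

/* Wrap-safe comparison: true if tick a is strictly later than tick b. */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mix once enough interrupts arrived OR more than one second has passed. */
static bool model_should_mix(const struct model_fast_pool *p, unsigned long now)
{
	assert(p != NULL);
	if (p->count >= MODEL_MIX_THRESHOLD)
		return true;
	return model_time_after(now, p->last + MODEL_HZ);
}

/* Account for one interrupt; returns true if the deferred mix is now due. */
static bool model_add_interrupt(struct model_fast_pool *p, unsigned long now)
{
	assert(p != NULL);
	p->count++;
	return model_should_mix(p, now);
}

/* Called by the deferred mixing step once it has consumed the pool. */
static void model_mix_done(struct model_fast_pool *p, unsigned long now)
{
	assert(p != NULL);
	p->count = 0;
	p->last = now;
}
```

Under this model a device raising fewer than 1024 interrupts per second
triggers the deferred step about once a second, while a busier device
triggers it on every 1024th interrupt.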

> Rather than incur the scheduling latency from queue_work_on, we can
> instead switch to running on the next timer tick, on the same core. This
> also batches things a bit more -- once per jiffy -- which is okay now
> that mix_interrupt_randomness() can credit multiple bits at once.
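
The batching described in the quoted text can be pictured with a small
userspace model: interrupts only bump a counter, and the per-tick step
credits several bits in one go. This is a sketch under stated
assumptions, not the kernel implementation; the model_* names, the ratio
of one bit per 64 interrupts and the cap of 128 bits are all hypothetical
values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_IRQS_PER_BIT 64U  /* assumed: one bit credited per 64 interrupts */
#define MODEL_MAX_BITS     128U /* assumed upper bound on a single credit */

/* Hypothetical per-CPU state for the once-per-tick batching. */
struct model_tick_batch {
	unsigned int pending;       /* interrupts seen since the last tick */
	unsigned int credited_bits; /* total bits credited so far */
	unsigned int mixes;         /* how many times the mixing step ran */
};

/* Bits credited for a batch: count / 64, clamped to [1, MODEL_MAX_BITS]. */
static unsigned int model_credit_bits(unsigned int count)
{
	unsigned int bits = count / MODEL_IRQS_PER_BIT;

	if (bits < 1)
		bits = 1;
	if (bits > MODEL_MAX_BITS)
		bits = MODEL_MAX_BITS;
	return bits;
}

/* Interrupt path: cheap, only accounts for the event. */
static void model_irq(struct model_tick_batch *b)
{
	assert(b != NULL);
	b->pending++;
}

/* Tick path: runs at most once per tick and credits the whole batch. */
static void model_tick(struct model_tick_batch *b)
{
	assert(b != NULL);
	if (b->pending == 0)
		return; /* nothing arrived during this tick */
	b->credited_bits += model_credit_bits(b->pending);
	b->pending = 0;
	b->mixes++;
}
```

With these assumed numbers, 640 interrupts arriving within one tick lead
to a single mixing step that credits 10 bits, instead of many separate
steps.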

Hmmm. Do you see higher contention on input_pool.lock? Just asking
because if this timer callback fires on more than one CPU at the same
(aligned) tick, they all block on the same lock.

Sebastian
