Thread (10 messages) 10 messages, 3 authors, 2025-01-31

Re: [PATCH v3 1/1] powerpc: Enable dynamic preemption

From: Shrikanth Hegde <hidden>
Date: 2025-01-31 06:54:54
Also in: lkml


On 1/31/25 11:39, Christophe Leroy wrote:

Le 30/01/2025 à 21:26, Sebastian Andrzej Siewior a écrit :
quoted
On 2025-01-30 22:27:07 [+0530], Shrikanth Hegde wrote:
quoted
quoted
| #DEFINE need_irq_preemption() \
|          
(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
|
|     if (need_irq_preemption()) {

be a bit smaller/ quicker? This could be a fast path ;)
I am okay with either way. I did try both[1], there wasn't any 
significant difference,
hence chose a simpler one. May be system size, workload pattern might 
matter.

Let me do some more testing to see which one wins.
Is there any specific benchmark which might help here?
No idea. As per bean counting: preempt_model_preemptible() should
resolve in two function calls + conditional in the dynamic case. This
should be more expensive compared to a nop/ branch ;)
I did a build comparison on pmac32_defconfig + dynamic preemption, the 
static branch looks definitely more optimal:

Without static branch:

...
  294:    7c 08 02 a6     mflr    r0
  298:    90 01 00 24     stw     r0,36(r1)
  29c:    48 00 00 01     bl      29c <interrupt_exit_kernel_prepare+0x50>
             29c: R_PPC_REL24    preempt_model_full
  2a0:    2c 03 00 00     cmpwi   r3,0
  2a4:    41 82 00 a4     beq     348 <interrupt_exit_kernel_prepare+0xfc>
...
  2a8:    81 22 00 4c     lwz     r9,76(r2)
  2ac:    71 29 00 04     andi.   r9,r9,4
  2b0:    40 82 00 d4     bne     384 <interrupt_exit_kernel_prepare+0x138>
...
  2b4:    7d 20 00 a6     mfmsr   r9
  2b8:    55 29 07 fa     rlwinm  r9,r9,0,31,29
...
  348:    48 00 00 01     bl      348 <interrupt_exit_kernel_prepare+0xfc>
             348: R_PPC_REL24    preempt_model_lazy
  34c:    2c 03 00 00     cmpwi   r3,0
  350:    40 a2 ff 58     bne     2a8 <interrupt_exit_kernel_prepare+0x5c>
  354:    4b ff ff 60     b       2b4 <interrupt_exit_kernel_prepare+0x68>

With the static branch:

...
  294:    48 00 00 58     b       2ec <interrupt_exit_kernel_prepare+0xa0>
...
  298:    7d 20 00 a6     mfmsr   r9
  29c:    55 29 07 fa     rlwinm  r9,r9,0,31,29
...
  2ec:    81 22 00 4c     lwz     r9,76(r2)
  2f0:    71 29 00 04     andi.   r9,r9,4
  2f4:    41 82 ff a4     beq     298 <interrupt_exit_kernel_prepare+0x4c>
...

With the static branch it is just an unconditional branch being later 
replaced by a NOP. It must be more efficient.

So in first case we need to get and save LR, call preempt_model_full(), 
compare with 0, call preempt_model_lazy() and compare again, and at some 
point read and restore LR.

And the preempt_model_() functions are not tiny functions:

00003620 <preempt_model_full>:
     3620:    3d 20 00 00     lis     r9,0
             3622: R_PPC_ADDR16_HA    preempt_dynamic_mode
     3624:    94 21 ff f0     stwu    r1,-16(r1)
     3628:    80 69 00 00     lwz     r3,0(r9)
             362a: R_PPC_ADDR16_LO    preempt_dynamic_mode
     362c:    2c 03 ff ff     cmpwi   r3,-1
     3630:    41 82 00 18     beq     3648 <preempt_model_full+0x28>
     3634:    68 63 00 02     xori    r3,r3,2
     3638:    38 21 00 10     addi    r1,r1,16
     363c:    7c 63 00 34     cntlzw  r3,r3
     3640:    54 63 d9 7e     srwi    r3,r3,5
     3644:    4e 80 00 20     blr
...
     3648:    0f e0 00 00     twui    r0,0
     364c:    68 63 00 02     xori    r3,r3,2
     3650:    38 21 00 10     addi    r1,r1,16
     3654:    7c 63 00 34     cntlzw  r3,r3
     3658:    54 63 d9 7e     srwi    r3,r3,5
     365c:    4e 80 00 20     blr

00003660 <preempt_model_lazy>:
     3660:    3d 20 00 00     lis     r9,0
             3662: R_PPC_ADDR16_HA    preempt_dynamic_mode
     3664:    94 21 ff f0     stwu    r1,-16(r1)
     3668:    80 69 00 00     lwz     r3,0(r9)
             366a: R_PPC_ADDR16_LO    preempt_dynamic_mode
     366c:    2c 03 ff ff     cmpwi   r3,-1
     3670:    41 82 00 18     beq     3688 <preempt_model_lazy+0x28>
     3674:    68 63 00 03     xori    r3,r3,3
     3678:    38 21 00 10     addi    r1,r1,16
     367c:    7c 63 00 34     cntlzw  r3,r3
     3680:    54 63 d9 7e     srwi    r3,r3,5
     3684:    4e 80 00 20     blr
...
     3688:    0f e0 00 00     twui    r0,0
     368c:    68 63 00 03     xori    r3,r3,3
     3690:    38 21 00 10     addi    r1,r1,16
     3694:    7c 63 00 34     cntlzw  r3,r3
     3698:    54 63 d9 7e     srwi    r3,r3,5
     369c:    4e 80 00 20     blr

That's convincing. Thanks for the pointers. I will move it into static 
branch method.

That makes the code same for arm64/powerpc. Maybe its worth moving it 
into sched/core. Let me try that as well.

PS: on leave next week, so will be sending the patches after i get back.
quoted
But you would still need preempt_model_preemptible() for the !DYN case.
quoted
quoted
quoted
+           preempt_model_voluntary() ? "voluntary" :
+           preempt_model_full()      ? "full" :
+           preempt_model_lazy()      ? "lazy" :
+           "",
So intend to rework this part. I have patches stashed at
    https://eur01.safelinks.protection.outlook.com/? 
url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fbigeasy%2Fstaging.git%2Flog%2F%3Fh%3Dpreemption_string&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C553a8f640a9c4514597d08dd416c6534%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638738655988556429%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=ZfVQi1iRYaVCTrZ5vIhOQ7yAKaDnOwFi0NRjycfrI5A%3D&reserved=0

which I didn't sent yet due to the merge window. Just a heads up ;)
Makes sense. I had seen at-least two places where this code was 
there, ftrace/powerpc.
There were way more places..

You want me to remove this part?
No, just be aware.
I don't know how this will be routed I guess we merge the sched pieces
first and then I submit the other pieces via the relevant maintainer
tree. In that case please be aware that all parts get removed/ replaced
properly.

Sebastian
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help