Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API

From: Jacob Pan <hidden>
Date: 2016-01-07 19:56:46
Also in: linux-mm, linux-pm, lkml

On Wed, 18 Nov 2015 14:25:27 +0100
Petr Mladek [off-list ref] wrote:

From: Petr Mladek <pmladek@suse.com>
To: Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov [off-list ref],
    Tejun Heo [off-list ref], Ingo Molnar [off-list ref],
    Peter Zijlstra [off-list ref]
Cc: Steven Rostedt [off-list ref], "Paul E. McKenney" [off-list ref],
    Josh Triplett [off-list ref], Thomas Gleixner [off-list ref],
    Linus Torvalds [off-list ref], Jiri Kosina [off-list ref],
    Borislav Petkov [off-list ref], Michal Hocko [off-list ref],
    linux-mm@kvack.org, Vlastimil Babka [off-list ref],
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
    Petr Mladek [off-list ref], Zhang Rui [off-list ref],
    Eduardo Valentin [off-list ref], Jacob Pan [off-list ref],
    linux-pm@vger.kernel.org
Subject: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to
    kthread worker API
Date: Wed, 18 Nov 2015 14:25:27 +0100
X-Mailer: git-send-email 1.8.5.6

Kthreads are currently implemented as an infinite loop. Each
has its own variant of the checks for terminating, freezing, and
awakening. In many cases it is hard to say which state the kthread
is in, and sometimes the checks are done the wrong way.

The plan is to convert kthreads to the kthread_worker or workqueue
API. This allows the functionality to be split into separate
operations and gives the code a better structure. It also defines a
clean state in which no locks are taken, no IRQs are blocked, and the
kthread may sleep or even be safely migrated.

The kthread worker API is useful when we want a dedicated
single thread for the work. It helps to make sure that the thread is
available when needed. It also allows better control, e.g.
defining a scheduling priority.

This patch converts the intel powerclamp kthreads to the kthread
worker API because they need good control over the assigned
CPUs.

I have tested this patchset and found no obvious issues in terms of
functionality, power, or performance. I tested CPU online/offline,
suspend/resume, freeze, etc.
Power numbers are comparable too, e.g. on an IVB 8C system: inject idle
from 5 to 50% and read package power while running a CPU-bound workload.

Before:
IdlePct  Perf    RAPL   WallPower
5        256.28  16.50  0.0
10       248.86  15.64  0.0
15       209.01  14.57  0.0
20       176.17  13.88  0.0
25       161.25  13.37  0.0
30       165.62  13.38  0.0
35       150.94  12.89  0.0
40       137.45  12.47  0.0
45       123.80  11.83  0.0
50       137.59  11.80  0.0

After:

(deb_chroot)root@ubuntu-jp-nfs:~/powercap-power# ./test.py -c 5
IdlePct  Perf    RAPL   WallPower
5        266.30  16.34  0.0
10       226.32  15.27  0.0
15       195.52  14.29  0.0
20       200.96  13.98  0.0
25       174.77  13.08  0.0
30       162.05  13.04  0.0
35       166.70  12.90  0.0
40       134.78  12.12  0.0
45       128.08  11.70  0.0
50       117.74  11.74  0.0
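
For anyone who wants to compare the two runs row by row, here is a small
sketch that computes the relative change in the Perf and RAPL columns. The
numbers are copied from the two tables above; the helper names are only
illustrative, and the units of the columns are not stated here, so the script
reports percentages only. WallPower is left out because it reads 0.0 in every
row.

```python
from statistics import mean

# (IdlePct, Perf, RAPL) rows copied from the "Before" and "After" tables.
BEFORE = [
    (5, 256.28, 16.50), (10, 248.86, 15.64), (15, 209.01, 14.57),
    (20, 176.17, 13.88), (25, 161.25, 13.37), (30, 165.62, 13.38),
    (35, 150.94, 12.89), (40, 137.45, 12.47), (45, 123.80, 11.83),
    (50, 137.59, 11.80),
]
AFTER = [
    (5, 266.30, 16.34), (10, 226.32, 15.27), (15, 195.52, 14.29),
    (20, 200.96, 13.98), (25, 174.77, 13.08), (30, 162.05, 13.04),
    (35, 166.70, 12.90), (40, 134.78, 12.12), (45, 128.08, 11.70),
    (50, 117.74, 11.74),
]


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def compare(before, after):
    """Return one (idle_pct, perf_change_pct, rapl_change_pct) per row."""
    rows = []
    for (idle_b, perf_b, rapl_b), (idle_a, perf_a, rapl_a) in zip(before, after):
        if idle_b != idle_a:
            raise ValueError("tables are not aligned on IdlePct")
        rows.append((idle_b,
                     percent_change(perf_b, perf_a),
                     percent_change(rapl_b, rapl_a)))
    return rows


if __name__ == "__main__":
    for idle, d_perf, d_rapl in compare(BEFORE, AFTER):
        print(f"{idle:3d}%  Perf {d_perf:+6.1f}%  RAPL {d_rapl:+5.1f}%")
    print("mean Perf before/after:",
          round(mean(r[1] for r in BEFORE), 2),
          round(mean(r[1] for r in AFTER), 2))
```

Each table is a single run, so the per-row differences include whatever
run-to-run noise the test has.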

IMHO, the most natural way is to split one cycle into two works.
The first one does some balancing and lets the CPU work in the normal
way for some time. The second work checks what the CPU has done
in the meantime and puts it into a C-state to reach the required
idle time ratio. The delay between the two works is achieved
by the delayed kthread work.
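
The two-work cycle described above can be modelled in user space. The
following is only a toy simulation, not the driver code: the function names,
the event queue, and the assumption that the CPU is fully busy during the run
phase are all made up for illustration. It shows the arithmetic of how long
the second work has to keep the CPU idle so that idle / (run + idle) matches
the requested ratio.

```python
import heapq


def idle_time_for_ratio(run_ms, idle_ratio):
    """Idle time that makes idle / (run + idle) equal idle_ratio."""
    if not 0.0 <= idle_ratio < 1.0:
        raise ValueError("idle_ratio must be in [0, 1)")
    return run_ms * idle_ratio / (1.0 - idle_ratio)


def simulate(cycles, run_ms, idle_ratio):
    """Run `cycles` rounds of balance work followed by delayed idle work.

    Returns the idle ratio achieved over the simulated time. Assumes the
    CPU was busy for the whole run phase (hypothetical simplification).
    """
    pending = [(0.0, 0, "balance")]  # (due time, sequence number, work name)
    seq = 1
    idle_total = 0.0
    end_time = 0.0
    done = 0
    while pending and done < cycles:
        now, _, work = heapq.heappop(pending)
        if work == "balance":
            # First work: let the CPU run, queue the second work with a delay.
            heapq.heappush(pending, (now + run_ms, seq, "idle"))
        else:
            # Second work: inject idle time, then queue the first work again.
            idle_ms = idle_time_for_ratio(run_ms, idle_ratio)
            idle_total += idle_ms
            end_time = now + idle_ms
            done += 1
            heapq.heappush(pending, (end_time, seq, "balance"))
        seq += 1
    return idle_total / end_time if end_time else 0.0


if __name__ == "__main__":
    print(simulate(cycles=10, run_ms=6.0, idle_ratio=0.25))
```

The model ignores queuing latency entirely, so it says nothing about how
tight the timing of two queued works would be on a real system.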

The two works have to share some data that used to be local
variables of the single kthread function. This is achieved
by the new per-CPU struct kthread_worker_data. It might look
like a complication. On the other hand, the long original kthread
function was not nice either.

The patch tries to avoid extra init and cleanup works. All the
actions can be done outside the thread, so they are moved
to the functions that create or destroy the worker. In particular,
I checked that the timers are assigned to the right CPU.

The two works queue each other. This makes it a bit tricky to
break the cycle when we want to stop the worker. We use the global and
per-worker "clamping" variables to make sure that the re-queuing
eventually stops. We also cancel the works to make the stop faster.
Note that the canceling is not reliable because the handling
of the two variables and the queuing are not synchronized via a lock.
But it is not a big deal because it is just an optimization.
The job is stopped faster than before in most cases.
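
The stop logic above can be sketched as a single-threaded toy model. It is an
illustration only: the Worker class, the single "clamping" flag, and the
function names are hypothetical stand-ins, and because nothing runs
concurrently it does not reproduce the race itself. It only shows why a
missed cancel costs at most one more work before the re-queuing stops.

```python
from collections import deque


class Worker:
    """Toy stand-in for a kthread worker with a FIFO of pending works."""

    def __init__(self):
        self.queue = deque()
        self.clamping = True  # hypothetical stand-in for the "clamping" flags
        self.log = []

    def queue_work(self, fn):
        self.queue.append(fn)

    def cancel_all(self):
        """Best-effort cancel: drops only what is pending right now."""
        self.queue.clear()

    def run_one(self):
        """Run the next pending work; return False if nothing was pending."""
        if not self.queue:
            return False
        self.queue.popleft()(self)
        return True


def balance_work(worker):
    worker.log.append("balance")
    if worker.clamping:  # re-queue the other work only while clamping
        worker.queue_work(idle_work)


def idle_work(worker):
    worker.log.append("idle")
    if worker.clamping:
        worker.queue_work(balance_work)


def stop(worker):
    """Clear the flag first, then cancel as an optimization."""
    worker.clamping = False
    worker.cancel_all()


if __name__ == "__main__":
    w = Worker()
    w.queue_work(balance_work)
    w.run_one()
    w.clamping = False  # flag cleared, but the cancel was missed
    while w.run_one():
        pass
    print(w.log)
```

With the flag cleared and no cancel, the already queued work still runs once
and then declines to re-queue, which matches the "just an optimization"
remark about canceling.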

I am not convinced this added complexity is necessary. Here are my
concerns about breaking the cycle into two work items:
- overhead of queuing and per-CPU data, as you already mentioned.
- since we need very tight timing control, two items may limit
  our turnaround time. Wouldn't it take one extra tick for the scheduler
  to run the balance work and then add the delay, as opposed to just
  calling schedule_timeout()?
- vulnerability to future changes in how work is queued.

Jacob