Thread: 33 messages, 4 authors, 2021-07-22

Re: [RFC PATCH 00/10] perf: add workqueue library and use it in synthetic-events

From: Arnaldo Carvalho de Melo <acme@kernel.org>
Date: 2021-07-13 19:14:21
Also in: lkml

Em Tue, Jul 13, 2021 at 02:11:11PM +0200, Riccardo Mancini escreveu:
This patchset introduces a new utility library inside perf/util that
provides a work queue abstraction loosely following the kernel
workqueue API.

The workqueue abstraction is made up of two components:
 - threadpool: which takes care of managing a pool of threads. It is
   inspired by the prototype for threaded trace in perf-record from Alexey:
   https://lore.kernel.org/lkml/cover.1625227739.git.alexey.v.bayduraev@linux.intel.com/
 - workqueue: manages a shared queue and provides the workers implementation.

On top of the workqueue, a simple parallel-for utility is implemented
which is then showcased in synthetic-events.c, replacing the previous
manual pthread-created threads.

In experiments with perf bench, the new workqueue shows a higher
overhead than manually created threads, but it partitions work among
threads more effectively, yielding a better result with more threads.
Furthermore, the overhead can be tuned by changing the `work_size`
(currently 1), i.e. the number of dirents that a thread processes
before taking the lock to get its next work item.
I experimented with different sizes but, while bigger sizes reduce overhead
as expected, they do not scale as well to more threads.
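
The `work_size` trade-off can be illustrated with a minimal sketch. It is
not the code from this patchset: it assumes plain pthreads instead of the
threadpool, an integer index range instead of dirents, and every sketch_*
name is hypothetical. Each thread takes the lock once per chunk of
work_size items, so a larger work_size means fewer lock acquisitions but
coarser load balancing.

```c
/* Sketch only: all sketch_* names are hypothetical, not the patchset's API. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void (*sketch_item_fn)(int idx, void *arg);

struct sketch_pfor {
	pthread_mutex_t lock;   /* protects next */
	int next;               /* first index not yet claimed */
	int end;                /* one past the last index */
	int work_size;          /* items claimed per lock acquisition */
	sketch_item_fn fn;
	void *arg;
};

static void *sketch_worker(void *data)
{
	struct sketch_pfor *pf = data;

	for (;;) {
		int start, stop, i;

		/* Claim the next chunk of up to work_size items. */
		pthread_mutex_lock(&pf->lock);
		start = pf->next;
		stop = start + pf->work_size;
		if (stop > pf->end)
			stop = pf->end;
		pf->next = stop;
		pthread_mutex_unlock(&pf->lock);

		if (start >= stop)
			return NULL;    /* nothing left to claim */

		/* Process the chunk without holding the lock. */
		for (i = start; i < stop; i++)
			pf->fn(i, pf->arg);
	}
}

/*
 * Run fn(i, arg) for every i in [0, n) on nr_threads threads.
 * Returns 0 on success, -1 if memory or any thread could not be obtained.
 */
static int sketch_parallel_for(int n, int work_size, int nr_threads,
			       sketch_item_fn fn, void *arg)
{
	struct sketch_pfor pf = { .next = 0, .end = n, .work_size = work_size,
				  .fn = fn, .arg = arg };
	pthread_t *threads;
	int i, started = 0, err = 0;

	assert(work_size > 0 && nr_threads > 0);
	threads = calloc(nr_threads, sizeof(*threads));
	if (!threads)
		return -1;
	pthread_mutex_init(&pf.lock, NULL);

	for (i = 0; i < nr_threads; i++) {
		if (pthread_create(&threads[i], NULL, sketch_worker, &pf)) {
			err = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&pf.lock);
	free(threads);
	return err;
}

/* Example callback: count how often each index is visited. */
static void sketch_count_item(int idx, void *arg)
{
	((int *)arg)[idx]++;    /* each index is claimed by exactly one thread */
}
```

With work_size = 1 the lock is taken once per item, which corresponds to
the current setting described above; passing e.g. 8 amortizes the lock
over eight items at the cost of a less even split when items differ in
cost.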

I tried to keep the patchset as simple as possible, deferring possible
improvements and features to future work.
Naming a few:
 - in order to achieve better performance, we could consider using
   work-stealing instead of a common queue.
 - affinities in the thread pool, as in Alexey's prototype for
   perf-record. Doing so would enable reusing the same threadpool for
   different purposes (evlist open, threaded trace, synthetic threads),
   avoiding having to spin up threads multiple times.
 - resizable threadpool, e.g. for lazy spawning of threads.

@Arnaldo
Since I wanted the workqueue to provide a similar API to the Kernel's
workqueue, I followed the naming style I found there, instead of the
usual object__method style that is typically found in perf. 
Let me know if you'd like me to follow perf style instead.

You did the right thing; that is what we do with other kernel APIs: we
use list_add(), rb_first(), bitmap_weight(), hash_del(), etc.

- Arnaldo
 
Thanks,
Riccardo

Riccardo Mancini (10):
  perf workqueue: threadpool creation and destruction
  perf tests: add test for workqueue
  perf workqueue: add threadpool start and stop functions
  perf workqueue: add threadpool execute and wait functions
  perf workqueue: add sparse annotation header
  perf workqueue: introduce workqueue struct
  perf workqueue: implement worker thread and management
  perf workqueue: add queue_work and flush_workqueue functions
  perf workqueue: add utility to execute a for loop in parallel
  perf synthetic-events: use workqueue parallel_for

 tools/perf/tests/Build                 |   1 +
 tools/perf/tests/builtin-test.c        |   9 +
 tools/perf/tests/tests.h               |   3 +
 tools/perf/tests/workqueue.c           | 453 +++++++++++++++++
 tools/perf/util/Build                  |   1 +
 tools/perf/util/synthetic-events.c     | 131 +++--
 tools/perf/util/workqueue/Build        |   2 +
 tools/perf/util/workqueue/sparse.h     |  21 +
 tools/perf/util/workqueue/threadpool.c | 516 ++++++++++++++++++++
 tools/perf/util/workqueue/threadpool.h |  29 ++
 tools/perf/util/workqueue/workqueue.c  | 642 +++++++++++++++++++++++++
 tools/perf/util/workqueue/workqueue.h  |  38 ++
 12 files changed, 1771 insertions(+), 75 deletions(-)
 create mode 100644 tools/perf/tests/workqueue.c
 create mode 100644 tools/perf/util/workqueue/Build
 create mode 100644 tools/perf/util/workqueue/sparse.h
 create mode 100644 tools/perf/util/workqueue/threadpool.c
 create mode 100644 tools/perf/util/workqueue/threadpool.h
 create mode 100644 tools/perf/util/workqueue/workqueue.c
 create mode 100644 tools/perf/util/workqueue/workqueue.h

-- 
2.31.1
-- 

- Arnaldo