Thread (13 messages), 2 authors, 2022-10-17

Re: [PATCH v2] perf: Rewrite core context handling

From: Peter Zijlstra <peterz@infradead.org>
Date: 2022-10-12 12:17:11
Also in: linux-arm-kernel, linux-perf-users, linux-s390, lkml

On Wed, Oct 12, 2022 at 02:09:00PM +0530, Ravi Bangoria wrote:
quoted
@@ -3366,6 +3370,14 @@ static void perf_event_sync_stat(struct
 	}
 }
 
+#define list_for_each_entry_double(pos1, pos2, head1, head2, member)	\
+	for (pos1 = list_first_entry(head1, typeof(*pos1), member),	\
+	     pos2 = list_first_entry(head2, typeof(*pos2), member);	\
+	     !list_entry_is_head(pos1, head1, member) &&		\
+	     !list_entry_is_head(pos2, head2, member);			\
+	     pos1 = list_next_entry(pos1, member),			\
+	     pos2 = list_next_entry(pos2, member))
+
 static void perf_event_swap_task_ctx_data(struct perf_event_context *prev_ctx,
 					  struct perf_event_context *next_ctx)
 {
@@ -3374,16 +3386,9 @@ static void perf_event_swap_task_ctx_dat
 	if (!prev_ctx->nr_task_data)
 		return;
 
-	prev_epc = list_first_entry(&prev_ctx->pmu_ctx_list,
-				    struct perf_event_pmu_context,
-				    pmu_ctx_entry);
-	next_epc = list_first_entry(&next_ctx->pmu_ctx_list,
-				    struct perf_event_pmu_context,
-				    pmu_ctx_entry);
-
-	while (&prev_epc->pmu_ctx_entry != &prev_ctx->pmu_ctx_list &&
-	       &next_epc->pmu_ctx_entry != &next_ctx->pmu_ctx_list) {
-
+	list_for_each_entry_double(prev_epc, next_epc,
+				   &prev_ctx->pmu_ctx_list, &next_ctx->pmu_ctx_list,
+				   pmu_ctx_entry) {
There are more places which can use list_for_each_entry_double().
I'll fix those.
I've gone and renamed it: double_list_for_each_entry(), but yeah, didn't
look too hard for other users.
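
For readers following along, here is a rough userspace sketch of what the lockstep iterator does. It is not the kernel code: the list helpers are minimal stand-ins for <linux/list.h>, `struct item`, `count_pairs()` and `demo()` are hypothetical names used only for illustration, and `__typeof__` assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)
#define list_next_entry(pos, member) \
	container_of((pos)->member.next, __typeof__(*(pos)), member)
#define list_entry_is_head(pos, head, member) \
	(&(pos)->member == (head))

/* Walk two lists in lockstep; stop as soon as either one is exhausted. */
#define double_list_for_each_entry(pos1, pos2, head1, head2, member)	\
	for (pos1 = list_first_entry(head1, __typeof__(*pos1), member),	\
	     pos2 = list_first_entry(head2, __typeof__(*pos2), member);	\
	     !list_entry_is_head(pos1, head1, member) &&		\
	     !list_entry_is_head(pos2, head2, member);			\
	     pos1 = list_next_entry(pos1, member),			\
	     pos2 = list_next_entry(pos2, member))

/* Hypothetical element type, standing in for perf_event_pmu_context. */
struct item {
	int id;
	struct list_head entry;
};

/* Visit paired entries; return the pair count, sum the id products. */
static int count_pairs(struct list_head *a, struct list_head *b, int *sum)
{
	struct item *x, *y;
	int n = 0;

	*sum = 0;
	double_list_for_each_entry(x, y, a, b, entry) {
		*sum += x->id * y->id;
		n++;
	}
	return n;
}

/* Build a 3-entry list and a 2-entry list, then walk them together. */
static int demo(int *sum)
{
	struct list_head a, b;
	struct item ia[3] = { { .id = 1 }, { .id = 2 }, { .id = 3 } };
	struct item ib[2] = { { .id = 10 }, { .id = 20 } };
	int i;

	list_init(&a);
	list_init(&b);
	for (i = 0; i < 3; i++)
		list_add_tail(&ia[i].entry, &a);
	for (i = 0; i < 2; i++)
		list_add_tail(&ib[i].entry, &b);

	return count_pairs(&a, &b, sum);
}
```

The loop ends when the shorter list runs out, so the trailing entry of the longer list is never visited. This matches the `&&` condition of the open-coded while loop that the macro replaces in the hunk above.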
quoted
@@ -4859,7 +4879,14 @@ static void put_pmu_ctx(struct perf_even
 	if (epc->ctx) {
 		struct perf_event_context *ctx = epc->ctx;
 
-		// XXX ctx->mutex
+		/*
+		 * XXX
+		 *
+		 * lockdep_assert_held(&ctx->mutex);
+		 *
+		 * can't because of the call-site in _free_event()/put_event()
+		 * which isn't always called under ctx->mutex.
+		 */
Yes. I came across the same thing and could not figure out how to solve
it, so I just kept the XXX as is.
Yeah, I can sorta fix it, but it's ugly so there we are.
quoted
 
 		WARN_ON_ONCE(list_empty(&epc->pmu_ctx_entry));
 		raw_spin_lock_irqsave(&ctx->lock, flags);
quoted
@@ -12657,6 +12675,13 @@ perf_event_create_kernel_counter(struct
 		goto err_unlock;
 	}
 
+	pmu_ctx = find_get_pmu_context(pmu, ctx, event);
+	if (IS_ERR(pmu_ctx)) {
+		err = PTR_ERR(pmu_ctx);
+		goto err_unlock;
+	}
+	event->pmu_ctx = pmu_ctx;
We should call find_get_pmu_context() with ctx->mutex held, hence the
above perf_event_create_kernel_counter() change. Is my understanding
correct?
That's the intent yeah. But due to not always holding ctx->mutex over
put_pmu_ctx() this might be moot. I'm almost through auditing epc usage
and I think ctx->lock is sufficient, fingers crossed.
quoted
+
 	if (!task) {
 		/*
 		 * Check if the @cpu we're creating an event for is online.
quoted
@@ -12998,7 +13022,7 @@ void perf_event_free_task(struct task_st
 	struct perf_event_context *ctx;
 	struct perf_event *event, *tmp;
 
-	ctx = rcu_dereference(task->perf_event_ctxp);
+	ctx = rcu_access_pointer(task->perf_event_ctxp);
We dereference the ctx pointer, but with the mutex and lock held, and
thus rcu_access_pointer() is sufficient. Is my understanding correct?
We do not in fact hold ctx->lock here IIRC; but this is a NULL test, if
it is !NULL we know we have a reference on it and are good.