Thread (3 messages) 3 messages, 2 authors, 2025-11-26

Re: [PATCH v2] tracing: Add system trigger file to enable triggers for all the system's events

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-11-26 14:38:43
Also in: lkml

On Wed, 26 Nov 2025 15:02:51 +0900
Masami Hiramatsu (Google) [off-list ref] wrote:
quoted
This also allows to remove a trigger from all events in a subsystem (even
if it's not a subsystem trigger!).
  
I have some comments below.
BTW, it's more appropriate to simply trim the email ;-)

quoted
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2168,51 +2168,52 @@ event_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
 
 static LIST_HEAD(event_subsystems);
 
-static int subsystem_open(struct inode *inode, struct file *filp)
+struct trace_subsystem_dir *trace_get_system_dir(struct inode *inode)
 {
-	struct trace_subsystem_dir *dir = NULL, *iter_dir;
-	struct trace_array *tr = NULL, *iter_tr;
-	struct event_subsystem *system = NULL;
-	int ret;
+	struct trace_subsystem_dir *dir;
+	struct trace_array *tr = NULL;  
nit: This also no need to be initialized.
Hmm, I guess this was needed in one of the versions I had before posting.

I'll fix in v3.
quoted
 
-	if (tracing_is_disabled())
-		return -ENODEV;
+	guard(mutex)(&event_mutex);
+	guard(mutex)(&trace_types_lock);
 
 	/* Make sure the system still exists */
-	mutex_lock(&event_mutex);
-	mutex_lock(&trace_types_lock);
-	list_for_each_entry(iter_tr, &ftrace_trace_arrays, list) {
-		list_for_each_entry(iter_dir, &iter_tr->systems, list) {
-			if (iter_dir == inode->i_private) {
+	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+		list_for_each_entry(dir, &tr->systems, list) {
+			if (dir == inode->i_private) {
 				/* Don't open systems with no events */
-				tr = iter_tr;
-				dir = iter_dir;
-				if (dir->nr_events) {
-					__get_system_dir(dir);
-					system = dir->subsystem;
-				}
-				goto exit_loop;
+				if (!dir->nr_events)
+					return NULL;
+				if (__trace_array_get(tr) < 0)
+					return NULL;
+				__get_system_dir(dir);
+				return dir;
 			}
 		}
 	}
quoted
 static ssize_t event_trigger_regex_write(struct file *file,
 					 const char __user *ubuf,
 					 size_t cnt, loff_t *ppos)
 {
 	struct trace_event_file *event_file;
 	ssize_t ret;
-	char *buf __free(kfree) = NULL;
+	char *buf __free(kfree) = get_user_buf(ubuf, cnt);
 
-	if (!cnt)
+	if (!buf)
 		return 0;
 
-	if (cnt >= PAGE_SIZE)
-		return -EINVAL;
-
-	buf = memdup_user_nul(ubuf, cnt);
 	if (IS_ERR(buf))
 		return PTR_ERR(buf);  
You can simply write:

	if (IS_ERR_OR_NULL(buf))
		return PTR_ERR_OR_ZERO(buf);
Yes I can. But honestly, the above is much harder to understand what is
happening than the code I had written.

I mean:

	if (!buf)
		return 0;

	if (IS_ERR(buf))
		return PTR_ERR(buf);

is pretty obvious of what is happening.

	if (IS_ERR_OR_NULL(buf))
		return PTR_ERR_OR_ZERO(buf);

Is quite a bit more obfuscated. I mean, I needed to look up what
PTR_ERR_OR_ZERO() did to be sure I knew what was returned.
quoted
+	list_for_each_entry(file, &tr->events, list) {
+
+		if (strcmp(system->name, file->event_call->class->system) != 0)
+			continue;
+
+		ret = p->parse(p, file, buff, command, next);
+
+		/* Removals and existing events do not error */
+		if (ret < 0 && ret != -EEXIST && !remove) {
+			pr_warn("Failed adding trigger %s on %s\n",
+				command, trace_event_name(file->event_call));
+		}  

Can I expect that this can recover the previous settings
via event trigger?
e.g. 

# echo "stacktrace" > events/sched/sched_wakeup/trigger
# echo "stacktrace" > events/sched/trigger
# echo "!stacktrace" > events/sched/trigger
# cat events/sched/sched_wakeup/trigger
stacktrace:unlimited

?
No. In fact, this is one of the features of the system trigger. Writing
into the system/trigger file is the same as writing into each of the
system's event's trigger files one at a time. In fact, I updated the
documentation in this patch to show that this file can be used to clear
tiggers too!

+       echo snapshot > events/sched/sched_waking/trigger
+       cat events/sched/sched_waking/trigger
+       snapshot:unlimited
+       echo '!snapshot' > events/sched/trigger
+       cat events/sched/sched_waking/trigger
+       # Available triggers:
+       # traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist

quoted
+	}
+	return 0;
+}
+
+static ssize_t
+event_system_trigger_write(struct file *filp, const char __user *ubuf,
+		    size_t cnt, loff_t *ppos)
+{
+	struct trace_subsystem_dir *dir = filp->private_data;
+	struct event_command *p;
+	char *command, *next;
+	char *buf __free(kfree) = get_user_buf(ubuf, cnt);
+	bool remove = false;
+	bool found = false;
+	ssize_t ret;
+
+	if (!buf)
+		return 0;
+
+	if (IS_ERR(buf))
+		return PTR_ERR(buf);  
Ditto.
And ditto again with my reply ;-)
quoted
+
+	/* system triggers are not allowed to have counters */
+	if (strchr(buf, ':'))
+		return -EINVAL;  
':' is not always used for counters (e.g. hist) and it seems odd
to check anything about parse here. Can we do this counter check
after parse a command?
quoted
+
+	/* If opened for read too, dir is in the seq_file descriptor */
+	if (filp->f_mode & FMODE_READ) {
+		struct seq_file *m = filp->private_data;
+		dir = m->private;
+	}
+
+	/* Skip added space at beginning of buf */
+	next = strim(buf);
+
+	command = strsep(&next, " \t");
+	if (next) {
+		next = skip_spaces(next);
+		if (!*next)
+			next = NULL;
+	}  
strim() removes both leading and trailing whitespace. So this check is
not required.
But next here is not the one that had strim() attached to it.

	command = strsep(&next, " \t");

Updates the content of next.

Thanks for the review,

-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help