Thread (26 messages) 26 messages, 4 authors, 2024-01-15

Re: [PATCH v11 2/5] ring-buffer: Introducing ring-buffer mapping functions

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2024-01-11 23:22:18
Also in: lkml

On Thu, 11 Jan 2024 11:34:58 -0500
Mathieu Desnoyers [off-list ref] wrote:

The LTTng kernel tracer has supported mmap'd buffers for nearly 15 years [1],
and has a lot of similarities with this patch series.

LTTng has the notion of "subbuffer id" to allow atomically exchanging a
"reader" extra subbuffer with the subbuffer to be read. It implements
"get subbuffer" / "put subbuffer" ioctls to allow the consumer (reader)
to move the currently read subbuffer position. [2]

It would not hurt to compare your approach to LTTng and highlight
similarities/differences, and the rationale for the differences.

Especially when it comes to designing kernel ABIs, it's good to make sure
that all bases are covered, because those choices will have lasting impacts.

Thanks,

Mathieu

[1] https://git.lttng.org/?p=lttng-modules.git;a=blob;f=src/lib/ringbuffer/ring_buffer_mmap.c
[2] https://git.lttng.org/?p=lttng-modules.git;a=blob;f=src/lib/ringbuffer/ring_buffer_vfs.c
Hi Mathieu,

Thanks for sharing!

As we discussed a little bit in the tracing meeting we do somethings
differently but very similar too ;-)

The similarities as that all the sub-buffers are mapped. You have a
reader-sub-buffer as well.

The main difference is that you use an ioctl() that returns where to find
the reader-sub-buffer, where our ioctl() is just "I'm done, get me a new
reader-sub-buffer". Ours will update the meta page.

Our meta page looks like this:
+struct trace_buffer_meta {
+	unsigned long	entries;
+	unsigned long	overrun;
+	unsigned long	read;
If start tracing: trace-cmd start -e sched_switch and do:

 ~ cat /sys/kernel/tracing/per_cpu/cpu0/stats 
entries: 14
overrun: 0
commit overrun: 0
bytes: 992
oldest event ts: 84844.825372
now ts: 84847.102075
dropped events: 0
read events: 0

You'll see similar to the above.

 entries = entries
 overrun = overrun
 read = read

The above "read" is total number of events read.

Pretty staight forward ;-)

+
+	unsigned long	subbufs_touched;
+	unsigned long	subbufs_lost;
+	unsigned long	subbufs_read;
Now I'm thinking we may not want this exported, as I'm not sure it's useful.

Vincent, talking with Mathieu, he was suggesting that we only export what
we really need, and I don't think we need the above. Especially since they
are not exposed in the stats file.

+
+	struct {
+		unsigned long	lost_events;
+		__u32		id;
+		__u32		read;
+	} reader;
The above is definitely needed, as all of it is used to read the
reader-page of the sub-buffer.

lost_events is the number of lost events that happened before this
sub-buffer was swapped out.

Hmm, Vincent, I just notice that you update the lost_events as:
+static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	struct trace_buffer_meta *meta = cpu_buffer->meta_page;
+
+	WRITE_ONCE(meta->entries, local_read(&cpu_buffer->entries));
+	WRITE_ONCE(meta->overrun, local_read(&cpu_buffer->overrun));
+	WRITE_ONCE(meta->read, cpu_buffer->read);
+
+	WRITE_ONCE(meta->subbufs_touched, local_read(&cpu_buffer->pages_touched));
+	WRITE_ONCE(meta->subbufs_lost, local_read(&cpu_buffer->pages_lost));
+	WRITE_ONCE(meta->subbufs_read, local_read(&cpu_buffer->pages_read));
+
+	WRITE_ONCE(meta->reader.read, cpu_buffer->reader_page->read);
+	WRITE_ONCE(meta->reader.id, cpu_buffer->reader_page->id);
+	WRITE_ONCE(meta->reader.lost_events, cpu_buffer->lost_events);
+}
The lost_events may need to be handled differently, as it doesn't always
get cleared. So it may be stale data.

+
+	__u32		subbuf_size;
+	__u32		nr_subbufs;
This gets is the information needed to read the mapped ring buffer.
+
+	__u32		meta_page_size;
+	__u32		meta_struct_len;
The ring buffer gets mapped after "meta_page_size" and this structure is
"meta_struct_len" which will change if we add new data to it.
+};
-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help