Thread (26 messages) 26 messages, 4 authors, 2023-07-03

Re: [PATCH v3 03/10] eventfs: adding eventfs dir add functions

From: Ajay Kaher <hidden>
Date: 2023-07-03 10:13:34
Also in: linux-kselftest, lkml

On 01-Jul-2023, at 7:24 PM, Steven Rostedt [off-list ref] wrote:

!! External Email

FYI, all subjects should start with a capital letter:

"eventfs: Implement eventfs dir creation functions"

On Thu,  1 Jun 2023 14:30:06 +0530
Ajay Kaher [off-list ref] wrote:
quoted
Adding eventfs_file structure which will hold properties of file or dir.

Adding following functions to add dir in eventfs:

eventfs_create_events_dir() directly creates events dir with-in
                       "within" is a proper word.
quoted
tracing folder.

eventfs_add_subsystem_dir() adds the information of subsystem_dir to
eventfs and dynamically creates subsystem_dir as and when requires.
 "as and when requires" does not make sense.
quoted
eventfs_add_dir() adds the information of dir (which is with-in
  "within"
quoted
subsystem_dir) to eventfs and dynamically creates these dir as
and when requires.
I'm guessing you want to say:

       eventfs_add_dir() adds the information of the dir, within a
       subsystem_dir, to eventfs and dynamically creates these
       directories when they are accessed.
quoted
Signed-off-by: Ajay Kaher <redacted>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <redacted>
Reported-by: kernel test robot <redacted>
Link: https://lore.kernel.org/oe-lkp/202305051619.9a469a9a-yujie.liu@intel.com (local)
---
fs/tracefs/Makefile      |   1 +
fs/tracefs/event_inode.c | 272 +++++++++++++++++++++++++++++++++++++++
include/linux/tracefs.h  |  29 +++++
kernel/trace/trace.h     |   1 +
4 files changed, 303 insertions(+)
create mode 100644 fs/tracefs/event_inode.c
diff --git a/fs/tracefs/Makefile b/fs/tracefs/Makefile
index 7c35a282b..73c56da8e 100644
--- a/fs/tracefs/Makefile
+++ b/fs/tracefs/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0-only
tracefs-objs := inode.o
+tracefs-objs += event_inode.o

obj-$(CONFIG_TRACING)        += tracefs.o
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
new file mode 100644
index 000000000..a48ce23c0
--- /dev/null
+++ b/fs/tracefs/event_inode.c
@@ -0,0 +1,272 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  event_inode.c - part of tracefs, a pseudo file system for activating tracing
+ *
+ *  Copyright (C) 2020-22 VMware Inc, author: Steven Rostedt (VMware) <rostedt@goodmis.org>
+ *  Copyright (C) 2020-22 VMware Inc, author: Ajay Kaher <akaher@vmware.com>
+ *
+ *  eventfs is used to show trace events with one set of dentries
+ *
+ *  eventfs stores meta-data of files/dirs and skip to create object of
+ *  inodes/dentries. As and when requires, eventfs will create the
+ *  inodes/dentries for only required files/directories. Also eventfs
+ *  would delete the inodes/dentries once no more requires but preserve
+ *  the meta data.
+ */
+#include <linux/fsnotify.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/security.h>
+#include <linux/tracefs.h>
+#include <linux/kref.h>
+#include <linux/delay.h>
+#include "internal.h"
+
+/**
+ * eventfs_dentry_to_rwsem - Return corresponding eventfs_rwsem
+ * @dentry: a pointer to dentry
+ *
+ * helper function to return crossponding eventfs_rwsem for given dentry
+ */
+static struct rw_semaphore *eventfs_dentry_to_rwsem(struct dentry *dentry)
+{
+     if (S_ISDIR(dentry->d_inode->i_mode))
+             return (struct rw_semaphore *)dentry->d_inode->i_private;
+     else
+             return (struct rw_semaphore *)dentry->d_parent->d_inode->i_private;
+}
+
+/**
+ * eventfs_down_read - acquire read lock function
+ * @eventfs_rwsem: a pointer to rw_semaphore
+ *
+ * helper function to perform read lock. Nested locking requires because
+ * lookup(), release() requires read lock, these could be called directly
+ * or from open(), remove() which already hold the read/write lock.
+ */
+static void eventfs_down_read(struct rw_semaphore *eventfs_rwsem)
+{
+     down_read_nested(eventfs_rwsem, SINGLE_DEPTH_NESTING);
+}
+
+/**
+ * eventfs_up_read - release read lock function
+ * @eventfs_rwsem: a pointer to rw_semaphore
+ *
+ * helper function to release eventfs_rwsem lock if locked
+ */
+static void eventfs_up_read(struct rw_semaphore *eventfs_rwsem)
+{
+     up_read(eventfs_rwsem);
+}
+
+/**
+ * eventfs_down_write - acquire write lock function
+ * @eventfs_rwsem: a pointer to rw_semaphore
+ *
+ * helper function to perform write lock on eventfs_rwsem
+ */
+static void eventfs_down_write(struct rw_semaphore *eventfs_rwsem)
+{
+     while (!down_write_trylock(eventfs_rwsem))
+             msleep(10);
What's this loop for? Something like that needs a very good explanation
in a comment. Loops like these are usually a sign of a workaround for a
bug in the design, or worse, simply hides an existing bug.
Yes correct, this logic is to solve deadlock:

Thread 1                             Thread 2
down_read_nested()                                 - read lock acquired
                                         down_write()     - waiting for write lock to acquire
down_read_nested()                                  - deadlock

Deadlock is because rwlock wouldn’t allow read lock to be acquired if write lock is waiting.
down_write_trylock() wouldn’t add the write lock in waiting queue, hence helps to prevent
deadlock scenario.

I was stuck with this Deadlock, tried few methods and finally borrowed from cifs, as it’s
upstreamed, tested and working in cifs, please refer:
https://elixir.bootlin.com/linux/v6.3.1/source/fs/cifs/file.c#L438

Looking further for your input. I will add explanation in v4.

quoted
+}
+
+/**
+ * eventfs_up_write - release write lock function
+ * @eventfs_rwsem: a pointer to rw_semaphore
+ *
+ * helper function to perform write lock on eventfs_rwsem
+ */
+static void eventfs_up_write(struct rw_semaphore *eventfs_rwsem)
+{
+     up_write(eventfs_rwsem);
+}
+
+static const struct file_operations eventfs_file_operations = {
+};
+
+static const struct inode_operations eventfs_root_dir_inode_operations = {
+};
+
+/**
+ * eventfs_prepare_ef - helper function to prepare eventfs_file
+ * @name: a pointer to a string containing the name of the file/directory
+ *        to create.
+ * @mode: the permission that the file should have.
+ * @fop: a pointer to a struct file_operations that should be used for
+ *        this file/directory.
+ * @iop: a pointer to a struct inode_operations that should be used for
+ *        this file/directory.
+ * @data: a pointer to something that the caller will want to get to later
+ *        on.  The inode.i_private pointer will point to this value on
+ *        the open() call.
+ *
+ * This function allocate the fill eventfs_file structure.
  "allocates and fills the" ?
quoted
+ */
+static struct eventfs_file *eventfs_prepare_ef(const char *name, umode_t mode,
+                                     const struct file_operations *fop,
+                                     const struct inode_operations *iop,
+                                     void *data)
+{
+     struct eventfs_file *ef;
+
+     ef = kzalloc(sizeof(*ef), GFP_KERNEL);
+     if (!ef)
+             return ERR_PTR(-ENOMEM);
+
+     ef->name = kstrdup(name, GFP_KERNEL);
+     if (!ef->name) {
+             kfree(ef);
+             return ERR_PTR(-ENOMEM);
+     }
+
+     if (S_ISDIR(mode)) {
+             ef->ei = kzalloc(sizeof(*ef->ei), GFP_KERNEL);
+             if (!ef->ei) {
+                     kfree(ef->name);
+                     kfree(ef);
+                     return ERR_PTR(-ENOMEM);
+             }
+             INIT_LIST_HEAD(&ef->ei->e_top_files);
+     } else {
+             ef->ei = NULL;
+     }
+
+     ef->iop = iop;
+     ef->fop = fop;
+     ef->mode = mode;
+     ef->data = data;
+     ef->dentry = NULL;
+     ef->d_parent = NULL;
+     ef->created = false;
No need for the initialization to NULL or even the false, as the
kzalloc() already did that.
quoted
+     return ef;
+}
+
+/**
+ * eventfs_create_events_dir - create the trace event structure
+ * @name: a pointer to a string containing the name of the directory to
+ *        create.
You don't need to add "a pointer" we can see it's a pointer. Just say:

* @name: The name of the directory to create

Adding more makes it confusing to read.
quoted
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is NULL, then the
+ *          directory will be created in the root of the tracefs filesystem.
+ * @eventfs_rwsem: a pointer to rw_semaphore
Same with all the descriptions.

quoted
+ *
+ * This function creates the top of the trace event directory.
+ */
+struct dentry *eventfs_create_events_dir(const char *name,
+                                      struct dentry *parent,
+                                      struct rw_semaphore *eventfs_rwsem)
OK, I'm going to have to really look at this. Passing in a lock to the
API is just broken. We need to find a way to solve this another way.
eventfs_rwsem is a member of struct trace_array, I guess we should
pass pointer to trace_array.

I'm about to board a plane to JFK shortly, I'm hoping to play with this
while flying back.
I have replied for major concerns. All other minor I will take care in v4.

Thanks a lot for giving time to eventfs patches.

- Ajay

-- Steve

quoted
+{
+     struct dentry *dentry = tracefs_start_creating(name, parent);
+     struct eventfs_inode *ei;
+     struct tracefs_inode *ti;
+     struct inode *inode;
+
+     if (IS_ERR(dentry))
+             return dentry;
+
+     ei = kzalloc(sizeof(*ei), GFP_KERNEL);
+     if (!ei)
+             return ERR_PTR(-ENOMEM);
+     inode = tracefs_get_inode(dentry->d_sb);
+     if (unlikely(!inode)) {
+             kfree(ei);
+             tracefs_failed_creating(dentry);
+             return ERR_PTR(-ENOMEM);
+     }
+
+     init_rwsem(eventfs_rwsem);
+     INIT_LIST_HEAD(&ei->e_top_files);
+
+     ti = get_tracefs(inode);
+     ti->flags |= TRACEFS_EVENT_INODE;
+     ti->private = ei;
+
+     inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
+     inode->i_op = &eventfs_root_dir_inode_operations;
+     inode->i_fop = &eventfs_file_operations;
+     inode->i_private = eventfs_rwsem;
+
+     /* directory inodes start off with i_nlink == 2 (for "." entry) */
+     inc_nlink(inode);
+     d_instantiate(dentry, inode);
+     inc_nlink(dentry->d_parent->d_inode);
+     fsnotify_mkdir(dentry->d_parent->d_inode, dentry);
+     return tracefs_end_creating(dentry);
+}
!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help