Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
From: Dmitry Kadashev <hidden>
Date: 2021-02-02 04:40:14
Also in:
io-uring
On Mon, Feb 1, 2021 at 10:00 PM Al Viro [off-list ref] wrote:
> On Mon, Feb 01, 2021 at 06:09:01PM +0700, Dmitry Kadashev wrote:
> > Hi Al, I think I need more guidance here. First of all, I've based that code on commit 7cdfa44227b0 ("vfs: Fix refcounting of filenames in fs_parser"), which does exactly the same refcount bump in fs_parser.c for filename_lookup(). I'm not saying it's a good excuse to introduce more code like that if that's a bad code though.
> It is a bad code. If you look at that function, you'll see that the entire mess around put_f is rather hard to follow and reason about. That's a function with no users, and I'm not sure we want to keep it long-term.
But the reason for the put_f mess is the fact that the function accepts either a string (which it resolves to a struct filename that it then owns) or a struct filename (that it does not own), not the meddling with the refcount. I'm not trying to argue that we should do the meddling though, I'm fine with the other approach.
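For readers following the put_f point, here is a minimal userspace sketch of that ownership pattern. It is only a model, not the fs_parser.c code: struct name_model, name_get(), name_put() and lookup_model() are hypothetical names, and strlen() stands in for the real lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct filename. */
struct name_model {
	int refcnt;
	const char *str;
};

static int live_models;	/* names allocated and not yet freed */

static struct name_model *name_get(const char *str)
{
	struct name_model *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->refcnt = 1;
	n->str = str;
	live_models++;
	return n;
}

static void name_put(struct name_model *n)
{
	assert(n->refcnt > 0);
	if (--n->refcnt == 0) {
		live_models--;
		free(n);
	}
}

/*
 * Takes either a string (resolved here, so owned here) or a name that
 * the caller still owns. put_f records which case applies.
 */
static size_t lookup_model(const char *str, struct name_model *borrowed)
{
	struct name_model *f;
	bool put_f;
	size_t len;

	if (str) {
		f = name_get(str);
		put_f = true;	/* we created it, so we drop it */
	} else {
		f = borrowed;
		put_f = false;	/* caller's reference, leave it alone */
	}
	len = strlen(f->str);	/* stands in for the actual lookup */
	if (put_f)
		name_put(f);
	return len;
}
```

In this model the flag exists only because the function has two kinds of input with different ownership; no refcount bump is involved.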
> > What I _am_ saying is we probably want to make the approaches consistent (at least eventually), which means we'd need the same "don't drop the name" variant of filename_lookup?
> "don't drop the name on success", similar to what filename_parentat() does.
OK, that makes things much simpler.
> > And given the fact filename_parentat (used from filename_create) drops the name on error it looks like we'd need another copy of it too?
> No need.
OK.
> > Do you think it's really worth it or maybe all of these functions will make things more confusing? (from the looks of it right now the convention is that the `struct filename` ownership is always transferred when it is passed as an arg)
> >
> > Also, do you have a good name for such functions that do not drop the name?
> >
> > And, just for my education, can you explain why the reference counting for struct filename exists if it's considered a bad practice to increase the reference counter (assuming the cleanup code is correct)?
>
> The last one is the easiest to answer - we want to keep the imported strings around for audit. It's not so much a proper refcounting as it is "we might want freeing delayed" implemented as refcount.
>
> As for do_mkdirat(), you probably want semantics similar to do_unlinkat(), i.e. have it consume the argument passed to it. The main complication comes from ESTALE retries; want -ESTALE from ->mkdir() itself to trigger "redo filename_parentat() with LOOKUP_REVAL, then try the rest one more time". For which you need to keep filename around.
>
> OK, so you want a variant of filename_create() that would _not_ consume the filename on success (i.e. act as filename_parentat() itself does). Which is trivial to implement - just rename filename_create() to __filename_create() and remove one of two putname() in there, leaving just the one in failure exits. Then filename_create() itself becomes simply
>
> static inline struct dentry *filename_create(int dfd, struct filename *name,
> 				struct path *path, unsigned int lookup_flags)
> {
> 	struct dentry *res = __filename_create(dfd, name, path, lookup_flags);
> 	if (!IS_ERR(res))
> 		putname(name);
> 	return res;
> }
>
> and in your do_mkdirat() replacement use
> 	dentry = __filename_create(dfd, filename, &path, lookup_flags);
> instead of
> 	dentry = user_path_create(dfd, pathname, &path, lookup_flags);
> and add
> 	putname(filename);
> in the very end. All it takes...
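The ownership rules described above can be checked in a small userspace model. This is a sketch under stated assumptions rather than kernel code: struct filename here is a stand-in holding only a refcount, create_keep_name() plays the role of the proposed __filename_create(), create_consume_name() the role of filename_create(), mkdirat_model() the role of the do_mkdirat() replacement, and the scripted result arrays are invented purely so the paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct filename: refcount only. */
struct filename {
	int refcnt;
};

static int live_names;	/* names allocated and not yet freed */

static struct filename *getname_model(void)
{
	struct filename *name = malloc(sizeof(*name));

	if (!name)
		abort();
	name->refcnt = 1;
	live_names++;
	return name;
}

static void putname_model(struct filename *name)
{
	assert(name->refcnt > 0);
	if (--name->refcnt == 0) {
		live_names--;
		free(name);
	}
}

/* Scripted results (0 or a negative errno) for the two attempts. */
static int create_results[2], mkdir_results[2];
static int create_calls, mkdir_calls;

static void script(int c0, int c1, int m0, int m1)
{
	create_results[0] = c0;
	create_results[1] = c1;
	mkdir_results[0] = m0;
	mkdir_results[1] = m1;
	create_calls = mkdir_calls = 0;
}

/* Models __filename_create(): drops the name only on failure. */
static int create_keep_name(struct filename *name)
{
	int err = create_results[create_calls++];

	if (err)
		putname_model(name);
	return err;
}

/* Models filename_create(): the name is consumed in every case. */
static int create_consume_name(struct filename *name)
{
	int err = create_keep_name(name);

	if (!err)
		putname_model(name);
	return err;
}

/* Models the do_mkdirat() replacement: consumes name, retries once on ESTALE. */
static int mkdirat_model(struct filename *name)
{
	bool reval = false;	/* stands in for LOOKUP_REVAL */
	int err;

retry:
	err = create_keep_name(name);
	if (err)
		return err;	/* failure exit already dropped the name */
	err = mkdir_results[mkdir_calls++];	/* stands in for ->mkdir() */
	if (err == -ESTALE && !reval) {
		reval = true;
		goto retry;	/* name is still valid for the second pass */
	}
	putname_model(name);	/* the single put at the very end */
	return err;
}
```

In this model every path drops the name exactly once: either in the failure exit of create_keep_name() or at the end of mkdirat_model(), so the retry never touches a freed name.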
Yeah, I just was not sure about naming or whether you'd prefer for other functions to be changed too.

You've answered pretty much all my questions and even more :) Thanks a lot Al! I'll post v2 soon (since the audit thing you've discovered does not affect this patch directly).

-- 
Dmitry Kadashev