Re: [RFC PATCH v5 1/2] vfs: add O_CREAT|O_DIRECTORY to open*(2)
From: NeilBrown <hidden>
Date: 2026-06-03 22:56:22
Also in:
linux-fsdevel, lkml
On Wed, 03 Jun 2026, Jori Koolstra wrote:
quoted
Op 02-06-2026 17:44 CEST schreef Christian Brauner [off-list ref]: Yes, I agree. This would change error codes but I don't think it matters: * O_WRONLY | O_DIRECTORY on non-directory -> ENOTDIR * O_WRONLY | O_DIRECTORY on directory -> EISDIR I don't think that really matters and we should be able to collapse this to ENOTDIR.I will pick this up in the next version of O_CREAT|O_DIRECTORY. I think that makes most sense. I have an outstanding patch for changing the EACCES/EPERM to EOPNOTSUPP; Jeff and Jan were skeptical, but I want to know your opinion as well. I feel the the scenario where userspace has no fall-through but does handle every single -E listed in the man-page quite unlikely, so I say lets change them and we'll hear from them if somehow someone relied on this weird way of error handling.
Please cc linux-api@vger.kernel.org on code and discussions that involve API changes. I have cc:ed them on this reply. Thanks, NeilBrown
quoted
quoted
quoted
There is another point, I maybe should have mentioned in the cover letter: I have not attempted to handle dangling symlinks for O_MKDIR. Not because I think they are a great idea (as Aleksa has mentioned, but I am not very familiar with the dragons it entails), but I wanted to discuss what behavior we want in this case. Do we say that we never do a mkdir after following a lookup last symlink? I don't think that state is even recorded right now.I think the state might be recorded in nd->depth. But you probably don't want to use that directly. Maybe forcing LOOKUP_FOLLOW to be cleared if O_CREAT|O_DIRECTORY is set would be good. But what would stop you opening an existing directory through a symlink.... Probably we need a clear statement of intended semantics which we can review, agree on, then implement. Have you looked at preparing a patch for man-pages to document the change in behaviour for openat etc?Ugh, dangling symlinks. Actually, scratch that: Ugh, symlinks. So O_CREAT without O_NOFOLLOW allows you to create the target of a dangling symlink iirc. I always forget that. I think this is a very subtle bug and maybe - with both eyes closed - a feature at times. We should straighten the behavior for O_DIRECTORY | O_CREAT and we agreed on that during LSFMM. It would be nice if we could get away with simply implying O_NOFOLLOW but I think you're right, Neil, that this prevents a valid O_CREAT | O_DIRECTORY on an existing directory which we can't do. Makes this kind of a pointless excercise. But this shouldn't be all that crazy to do right. Using the O_CREAT as an _example_ for what we'd need: fs: refuse O_CREAT through a dangling symlink open(O_CREAT) without O_EXCL follows a trailing symlink and, when the symlink target does not exist, creates it. Refuse to create through a dangling symlink instead. In lookup_open() a negative target reached with nd->depth > 0 was arrived at by following a trailing symlink; since the dentry is negative the symlink is dangling. Set create_error to -ELOOP in that case. Reusing the existing create_error path strips O_CREAT for both the generic and ->atomic_open create paths and only reports the error when the target is actually negative, so opening an existing target through a symlink, interior symlinks, and O_EXCL (which never follows the trailing link) are all unaffected. Hastily-Cobbled-Together-by: Christian Brauner (Amutable) [off-list ref]diff --git a/fs/namei.c b/fs/namei.c index c7fac83c9a85..d20bbcc7e8d3 100644 --- a/fs/namei.c +++ b/fs/namei.c@@ -4468,6 +4468,9 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, dentry, mode); else create_error = -EROFS; + /* refuse to create through a dangling (trailing) symlink */ + if (unlikely(nd->depth) && !create_error) + create_error = -ELOOP; } if (create_error) open_flag &= ~O_CREAT;It can't be that easy...This is what I suggested above, correct, in terms of behavior? In terms of the patch, I think this will work, but struct nameidata could really use some commentary for its fields. I spent the last two hours verifying that nd->depth really does what I thought it did, and I am still not 100% positive. AFAIS, nd->depth indeed tracks the current symlink depth, which outside of link_path_walk() reduces to the number of trailing links followed. But if Neil's rework of lookup_open() is merged we lose access here to nd. @Neil, have you thought about what would be a good way to resolve that?