Thread (23 messages) 23 messages, 3 authors, 2017-01-12

Re: [RFC][PATH 4/4] ovl: relax lock_rename when moving files between work and upper dir

From: Amir Goldstein <amir73il@gmail.com>
Date: 2017-01-12 05:42:23
Also in: linux-fsdevel

On Fri, Nov 11, 2016 at 7:44 PM, Miklos Szeredi [off-list ref] wrote:
On Fri, Nov 11, 2016 at 6:27 PM, Al Viro [off-list ref] wrote:
quoted
On Fri, Nov 11, 2016 at 06:17:07PM +0200, Amir Goldstein wrote:
quoted
I'm afraid so.
It seems to me that most of the time, lock_rename() is being used in
overlayfs for no better alternative to lock 2 directories (work+upper).

My suggestion is a small modification to the overlayfs locking scheme.
---Instead of this:
assert(lock_rename(workdir, upperdir) != NULL));
   copy_up(src, tmp);
   vfs_rename(tmp, dst);
unlock_rename(workdir, upperdir);

+++Use this:
assert(lock_rename(workdir, upperdir) != NULL));
mutex_unlock(s_vfs_rename_mutex);
   copy_up(src, tmp);
inode_unlock(upperdir);
inode_unlock(workdir);
assert(lock_rename(workdir, upperdir) != NULL));
  vfs_rename(tmp, dst);
unlock_rename(workdir, upperdir);

Miklos,

Do you see any problem with the proposed scheme?
Anything that can go wrong while releasing the workdir lock before vfs_rename()?
Huh???

->rename() definitely counts upon parents being locked; please, read the
damn Documentation/filesystems/locking, it's there for a reason.

The real question is why the fsck do you need to lock the workdir for the
duration of copying at all.  O_TMPFILE open + writes there doesn't need
lock_rename() *or* parents being locked.  You need the parent locked when
you link the sucker in place, but that's it.  IDGI...
There's that.  The other thing that lock_rename does is prevent
multiple copy ups on the same file.   Arguably it's an overkill, but a
replacement needs to be added.
Miklos,

FYI, I started to work on copy up using O_TMPFILE with i_mutex
held only for upper dir.
There's that.  The other thing that lock_rename does is prevent
multiple copy ups on the same file.   Arguably it's an overkill, but a
replacement needs to be added.
I think you suggested here to use a waitqueue for pending copyups?
Did you mean a waitqueue per overlay parent directory inode or
did you mean something else?
Fact is, nobody ever reported this blatant performance bottleneck.
Probably because copying up gigabyte files is not the usual use case
for union filesystems...
Still, it's a loophole for some serious DoS. An unprivileged user inside
container can touch a file and block all copy up on all other containers
on that host and all renames on that host fs as well!

Amir.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help