Re: [PATCH v4 18/21] fuse: Add support for pid namespaces
From: Seth Forshee <hidden>
Date: 2016-07-20 12:52:04
Also in:
dm-devel, linux-bcache, linux-fsdevel, linux-raid, lkml, selinux
On Tue, Jul 19, 2016 at 07:44:11PM -0700, Sheng Yang wrote:
> On Tue, Apr 26, 2016 at 12:36 PM, Seth Forshee [off-list ref] wrote:
> > When the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd are not being translated into that process' namespace. Translation is necessary for the pid to be useful to that process.
> >
> > Since no use case currently exists for changing namespaces all translations can be done relative to the pid namespace in use when fuse_conn_init() is called. For fuse this translates to mount time, and for cuse this is when /dev/cuse is opened. IO for this connection from another namespace will return errors.
> >
> > Requests from processes whose pid cannot be translated into the target namespace are not permitted, except for requests allocated via fuse_get_req_nofail_nopages. For no-fail requests in.h.pid will be 0 if the pid translation fails.
>
> Hi Seth,
>
> This patch caused a regression in our major container use case with FUSE in Ubuntu 16.04, as the patch was checked in as Ubuntu Sauce in the Ubuntu 4.4.0-6.21 kernel.
>
> The use case is:
>
> 1. Create a Docker container.
> 2. Inside the container, start the FUSE backend and mount the fs.
> 3. Following step 2, in the container, create a loopback device that maps a file in the mounted fuse fs to a block device, which will be available to the whole system.
>
> It worked well before this commit.
>
> The use case is broken because no matter which namespace losetup runs in, the real request from the loopback device seems to always come from the init ns, so it is in a different ns from the one running the fuse backend. The request then gets denied, because the ns running fuse won't be able to see things from a higher level (level 0, in fact) pid namespace.
>
> I think since the init pid ns has the ability to access any process in the system, it should be able to access the fuse fs mounted by a process in any pid namespace as well.
>
> What do you think?
It sounds like we need to remove the restriction on accessing the filesystem from a different pid namespace. I don't think this poses a security problem. However, there's no pid mapping that is usable by the userspace fuse process, so what do we put in the fuse request? Probably the only candidates are 0 and 0xffffffff.

So a question for the fuse developers - is one value or the other preferable for fuse_in_header.pid when the pid cannot be mapped, and is this going to cause problems for any fuse filesystems? I suspect that few filesystems actually look at the pid anyway, and already for a filesystem mounted in a pid namespace the values being given to userspace won't be correct for the namespace of the fuse process.

Seth
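
For readers following the thread, the visibility rule under discussion can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: the toy_* types and functions and the UNMAPPED_PID_* names are hypothetical and exist only for this illustration. It assumes the usual pid namespace rule that a task has a pid in its own namespace and in every ancestor namespace, but none in descendant or sibling namespaces, and it returns a caller-chosen fallback (0 or 0xffffffff, the two candidates mentioned above) when no mapping exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the two candidate "unmapped" values from the
 * thread; which one (if either) fuse should use is the open question. */
#define UNMAPPED_PID_ZERO 0u
#define UNMAPPED_PID_MAX  0xffffffffu

/* Toy pid namespace: only the parent link matters here. */
struct toy_pid_ns {
    const struct toy_pid_ns *parent;   /* NULL for the initial namespace */
};

/* Toy task: the namespace it lives in, plus one pid per nesting level,
 * indexed from the initial namespace (0) down to its own namespace. */
struct toy_task {
    const struct toy_pid_ns *ns;
    const uint32_t *pid_at_level;
};

/* Nesting depth of a namespace; the initial namespace is level 0. */
static unsigned int toy_ns_level(const struct toy_pid_ns *ns)
{
    unsigned int level = 0;

    while (ns->parent) {
        ns = ns->parent;
        level++;
    }
    return level;
}

/* True if 'ancestor' is 'ns' itself or one of its ancestors. */
static bool toy_ns_is_ancestor_or_self(const struct toy_pid_ns *ancestor,
                                       const struct toy_pid_ns *ns)
{
    for (; ns; ns = ns->parent)
        if (ns == ancestor)
            return true;
    return false;
}

/* Pid of 'task' as seen from 'target', or 'fallback' when the task is
 * not visible from that namespace (no mapping exists). */
uint32_t toy_pid_in_ns(const struct toy_task *task,
                       const struct toy_pid_ns *target,
                       uint32_t fallback)
{
    assert(task && target);

    if (!toy_ns_is_ancestor_or_self(target, task->ns))
        return fallback;
    return task->pid_at_level[toy_ns_level(target)];
}
```

In this model a loop device worker living in the initial namespace has no pid inside the container's namespace, so a fuse daemon in the container would only ever see the fallback value for its requests, which is the situation the question above is about.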