Re: Problem with providing implementation id in NFSv4.1
From: NeilBrown <hidden>
Date: 2022-07-08 07:02:08
On Thu, 07 Jul 2022, Trond Myklebust wrote:
On Thu, 2022-07-07 at 11:13 +1000, NeilBrown wrote:quoted
In NFSv4.1 when we EXCHANGE_ID to talk to a new server - possibly a PNFS Data Server that we haven't talked to before - we by default send an implementation id. This is created from several fields obtained from utsname(). utsname() depends on current->nsproxy, and will crash if that is NULL. When a process exits it calls, among other things, exit_task_namespaces(tsk); exit_task_work(tsk); exit_task_namespaces() will set ->nsproxy to NULL exit_task_work() will run delayed work items, including fput() on all files that were still open when the process exited. This will cause any pending writes to be flushed for NFS. So if a process writes to a file on a PNFS server, exits, and the MDS tells the client to send the data to a DS which it hasn't established a connection with before, then it will crash in encode_exchange_id(). That order of calls in do_exit() is deliberate so we cannot swap them - see Commit: 8aac62706ada ("move exit_task_namespaces() outside of exit_notify()") The options that I can see are: 1/ generate the implementation-id string at mount time and keep it around much like we do for cl_owner_id 2/ Check current->nsproxy in encode_exchange_id() and skip the implementation id if ->nsproxy is not available. Note that there is no risk for a race with testing ->nxproxy. Doesn't anyone have a strong opinion of which is best. I'm inclined to go with '2', but mostly because it is less coding.I'd be fine with ignoring the implementation id in that case. The protocol doesn't require it, so servers are expecte to be able to deal with that case.
Thanks for the quick reply. I agree with you. However it turns out that this isn't a problem in mainline. When you close a file the ->flush() happens earlier in do_exit() and the name spaces are still around. However if you have a file mapped and have dirty pages, then mainline doesn't force a flush on final unmap, or exit of the process that had it mapped. As I explained in https://lore.kernel.org/all/150304037195.30218.15740287358704674869.stgit@noble/ (local) I think this is wrong, so I have an fsync() call in nfs_file_release(). This *is* run after nsproxy has been cleared. So mainline doesn't need this fix, but I do. Thanks, NeilBrown