Thread: 42 messages, 10 authors, 2017-01-30

Re: [Lsf-pc] [LSF/MM TOPIC] I/O error handling and fsync()

From: Trond Myklebust <hidden>
Date: 2017-01-24 03:34:04
Also in: linux-fsdevel

On Tue, 2017-01-24 at 11:16 +1100, NeilBrown wrote:
> On Mon, Jan 23 2017, Trond Myklebust wrote:
> > On Mon, 2017-01-23 at 17:35 -0500, Jeff Layton wrote:
> > > On Mon, 2017-01-23 at 11:09 +0100, Kevin Wolf wrote:
> > > > However, if we look at the greater problem of hanging requests
> > > > that came up in the more recent emails of this thread, it is
> > > > only moved rather than solved. Chances are that already write()
> > > > would hang now instead of only fsync(), but we still have a hard
> > > > time dealing with this.
> > >
> > > Well, it _is_ better with O_DIRECT as you can usually at least
> > > break out of the I/O with SIGKILL.
> > >
> > > When I last looked at this, the problem with buffered I/O was
> > > that you often end up waiting on page bits to clear (usually
> > > PG_writeback or PG_dirty), in non-killable sleeps for the most
> > > part.
> > >
> > > Maybe the fix here is as simple as changing that?
> >
> > At the risk of kicking off another O_PONIES discussion: Add an
> > open(O_TIMEOUT) flag that would let the kernel know that the
> > application is prepared to handle timeouts from operations such as
> > read(), write() and fsync(), then add an ioctl() or syscall to
> > allow said application to set the timeout value.
>
> I was thinking on very similar lines, though I'd use 'fcntl()' if
> possible because it would be a per-"file description" option.
> This would be a function of the page cache, and a filesystem wouldn't
> need to know about it at all.  Once enabled, 'read', 'write', or
> 'fsync' would return EWOULDBLOCK rather than waiting indefinitely.
>
> It might be nice if 'select' could then be used on page-cache file
> descriptors, but I think that is much harder.  Supporting O_TIMEOUT
> would be a practical first step - if someone agreed to actually try
> to use it.

The reason why I'm thinking open() is because it has to be a contract
between a specific application and the kernel. If the application
doesn't open the file with the O_TIMEOUT flag, then it shouldn't see
nasty non-POSIX timeout errors, even if there is another process that
is using that flag on the same file.

The only place where that is difficult to manage is when the file is
mmap()ed (no file descriptor), so you'd presumably have to disallow
mixing mmap and O_TIMEOUT.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com