Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling

[PATCH v2 00/11] DAX fsynx/msync support · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 01/11] pmem: add wb_cache_pmem() to the PMEM API · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 02/11] mm: add pmd_mkclean() · Ross Zwisler <hidden> · 2015-11-14
Re: [PATCH v2 02/11] mm: add pmd_mkclean() · Dave Hansen <hidden> · 2015-11-14
Re: [PATCH v2 02/11] mm: add pmd_mkclean() · Ross Zwisler <hidden> · 2015-11-17
[PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-14
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dan Williams <hidden> · 2015-11-14
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Andreas Dilger <hidden> · 2015-11-14
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dan Williams <hidden> · 2015-11-14
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Jan Kara <jack@suse.cz> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Jan Kara <jack@suse.cz> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dan Williams <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dan Williams <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dave Chinner <david@fromorbit.com> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Dave Chinner <david@fromorbit.com> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-16
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Jan Kara <jack@suse.cz> · 2015-11-18
Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling · Ross Zwisler <hidden> · 2015-11-18
[PATCH v2 04/11] dax: support dirty DAX entries in radix tree · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 05/11] mm: add follow_pte_pmd() · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 06/11] mm: add pgoff_mkclean() · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 07/11] mm: add find_get_entries_tag() · Ross Zwisler <hidden> · 2015-11-14
Re: [PATCH v2 07/11] mm: add find_get_entries_tag() · Dave Chinner <david@fromorbit.com> · 2015-11-16
Re: [PATCH v2 07/11] mm: add find_get_entries_tag() · Ross Zwisler <hidden> · 2015-11-17
[PATCH v2 08/11] dax: add support for fsync/sync · Ross Zwisler <hidden> · 2015-11-14
Re: [PATCH v2 08/11] dax: add support for fsync/sync · Dave Chinner <david@fromorbit.com> · 2015-11-16
Re: [PATCH v2 08/11] dax: add support for fsync/sync · Ross Zwisler <hidden> · 2015-11-17
[PATCH v2 09/11] ext2: add support for DAX fsync/msync · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 10/11] ext4: add support for DAX fsync/msync · Ross Zwisler <hidden> · 2015-11-14
[PATCH v2 11/11] xfs: add support for DAX fsync/msync · Ross Zwisler <hidden> · 2015-11-14
Re: [PATCH v2 11/11] xfs: add support for DAX fsync/msync · Dave Chinner <david@fromorbit.com> · 2015-11-16
Re: [PATCH v2 11/11] xfs: add support for DAX fsync/msync · Ross Zwisler <hidden> · 2015-11-17
Re: [PATCH v2 11/11] xfs: add support for DAX fsync/msync · Dave Chinner <david@fromorbit.com> · 2015-11-20
Re: [PATCH v2 00/11] DAX fsynx/msync support · Jan Kara <jack@suse.cz> · 2015-11-16
Re: [PATCH v2 00/11] DAX fsynx/msync support · Dan Williams <hidden> · 2015-11-16
Re: [PATCH v2 00/11] DAX fsynx/msync support · Ross Zwisler <hidden> · 2015-11-16

From: Jan Kara <jack@suse.cz>
Date: 2015-11-16 13:37:14
Also in: linux-fsdevel, linux-mm, linux-xfs, lkml, nvdimm

On Fri 13-11-15 18:32:40, Dan Williams wrote:

On Fri, Nov 13, 2015 at 4:43 PM, Andreas Dilger [off-list ref] wrote:

quoted

On Nov 13, 2015, at 5:20 PM, Dan Williams [off-list ref] wrote:

quoted

On Fri, Nov 13, 2015 at 4:06 PM, Ross Zwisler
[off-list ref] wrote:

quoted

Currently the PMEM driver doesn't accept REQ_FLUSH or REQ_FUA bios.  These
are sent down via blkdev_issue_flush() in response to a fsync() or msync()
and are used by filesystems to order their metadata, among other things.

When we get an msync() or fsync() it is the responsibility of the DAX code
to flush all dirty pages to media.  The PMEM driver then just has to issue a
wmb_pmem() in response to the REQ_FLUSH to ensure that before we return all
the flushed data has been durably stored on the media.

Signed-off-by: Ross Zwisler <redacted>

Hmm, I'm not seeing why we need this patch.  If the actual flushing of
the cache is done by the core, why does the driver need to support
REQ_FLUSH?  Especially since it's just a couple of instructions.  REQ_FUA
only makes sense if individual writes can bypass the "drive" cache,
but no I/O submitted to the driver proper is ever cached; we always
flush it through to media.

If the upper level filesystem gets an error when submitting a flush
request, then it assumes the underlying hardware is broken and cannot
be as aggressive in IO submission, but instead has to wait for in-flight
IO to complete.

Upper level filesystems won't get errors when the driver does not
support flush.  Those requests are ended cleanly in
generic_make_request_checks().  Yes, the fs still needs to wait for
outstanding I/O to complete but in the case of pmem all I/O is
synchronous.  There's never anything to await when flushing at the
pmem driver level.

quoted

Since FUA/FLUSH is basically a no-op for pmem devices,
it doesn't make sense _not_ to support this functionality.

Seems to be a nop either way.  Given that DAX may lead to dirty data
pending to the device in the cpu cache that a REQ_FLUSH request will
not touch, it's better to leave it all to the mm core to handle.  I.e.
it doesn't make sense to call the driver just for two instructions
(sfence + pcommit) when the mm core is taking on the cache flushing.
Either handle it all in the mm or the driver, not a mixture.

So I think REQ_FLUSH requests *must* end up doing sfence + pcommit because
e.g. journal writes going through the block layer or writes done through
dax_do_io() must be on permanent storage once the REQ_FLUSH request finishes,
and the way the driver does IO doesn't guarantee this, does it?

								Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR
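
The ordering argument in the reply above can be modeled in user space. What
follows is a minimal sketch, not kernel code: every name in it (toy_pmem,
toy_bio, toy_wmb_pmem(), the TOY_REQ_* flags) is hypothetical, and it assumes
that a store accepted by the driver is not durable until a barrier standing in
for sfence + pcommit has run. Under that assumption, a driver that completes a
flush request without issuing the barrier leaves an earlier journal write
exposed to power loss.

```c
/* Toy user-space model of the discussion; NOT the pmem driver. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_REQ_WRITE (1u << 0)  /* hypothetical stand-in for a write bio */
#define TOY_REQ_FLUSH (1u << 1)  /* hypothetical stand-in for REQ_FLUSH */
#define TOY_MEDIA_SIZE 64

struct toy_pmem {
    unsigned char pending[TOY_MEDIA_SIZE]; /* stores accepted, not yet durable */
    unsigned char media[TOY_MEDIA_SIZE];   /* contents that survive power loss */
    bool handles_flush;                    /* does the driver act on a flush? */
};

struct toy_bio {
    unsigned int flags;
    size_t offset;
    const unsigned char *data;
    size_t len;
};

/* Stand-in for the barrier: everything accepted so far becomes durable. */
static void toy_wmb_pmem(struct toy_pmem *dev)
{
    memcpy(dev->media, dev->pending, TOY_MEDIA_SIZE);
}

static void toy_pmem_init(struct toy_pmem *dev, bool handles_flush)
{
    memset(dev, 0, sizeof(*dev));
    dev->handles_flush = handles_flush;
}

/* Returns 0 on success, -1 if a write falls outside the toy device. */
static int toy_pmem_submit(struct toy_pmem *dev, const struct toy_bio *bio)
{
    assert(dev != NULL && bio != NULL);

    /* The flush covers writes completed earlier, so it runs first. */
    if ((bio->flags & TOY_REQ_FLUSH) && dev->handles_flush)
        toy_wmb_pmem(dev);

    if (bio->flags & TOY_REQ_WRITE) {
        if (bio->offset > TOY_MEDIA_SIZE ||
            bio->len > TOY_MEDIA_SIZE - bio->offset)
            return -1;
        memcpy(dev->pending + bio->offset, bio->data, bio->len);
    }
    return 0;
}

/* Simulated power loss: anything that never reached media is gone. */
static void toy_power_loss(struct toy_pmem *dev)
{
    memcpy(dev->pending, dev->media, TOY_MEDIA_SIZE);
}
```

Submitting a write followed by an empty flush and then calling
toy_power_loss() keeps the data only when handles_flush is true. Whether the
real driver's I/O path leaves stores in such a non-durable state is exactly
the question the reply asks; the model only shows what follows if it does.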
