Thread (109 messages) 109 messages, 19 authors, 2011-01-14

Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush

From: Mike Snitzer <hidden>
Date: 2010-08-27 13:49:40
Also in: dm-devel, linux-fsdevel, linux-ide, linux-scsi, lkml

On Fri, Aug 27 2010 at  5:47am -0400,
Kiyoshi Ueda [off-list ref] wrote:
Hi Mike,

On 08/26/2010 12:28 AM +0900, Mike Snitzer wrote:
quoted
Kiyoshi Ueda [off-list ref] wrote:
quoted
Anyway, as you said, the flush error handling of dm-mpath is already
broken if data loss really happens on any storage used by dm-mpath.
Although it's a serious issue and quick fix is required, I think
you may leave the old behavior in your patch-set, since it's
a separate issue.
I'm not seeing where anything is broken with current mpath.  If a
multipathed LUN is WCE=1 then it should be fair to assume the cache is
mirrored or shared across ports.  Therefore retrying the SYNCHRONIZE
CACHE is needed.

Do we still have fear that SYNCHRONIZE CACHE can silently drop data?
Seems unlikely especially given what Tejun shared from SBC.
Do we have any proof to wipe that fear?

If retrying on flush failure is safe on all storages used with multipath
(e.g. SCSI, CCISS, DASD, etc), then current dm-mpath should be fine in
the real world.
But I'm afraid if there is a storage where something like below can happen:
    - a flush command is returned as error to mpath because a part of
      cache has physically broken at the time or so, then that part of
      data loses and the size of the cache is shrunk by the storage.
    - mpath retries the flush command using other path.
    - the flush command is returned as success to mpath.
    - mpath passes the result, success, to upper layer, but some of
      the data already lost.
That does seem like a valid concern.  But I'm not seeing why its unique
to SYNCHRONIZE CACHE.  Any IO that fails on the target side should be
passed up once the error gets to DM.

Mike
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help