Re: [PATCH v2 1/2] scsi: fix race between simultaneous decrements of ->host_failed
From: Dan Williams <hidden>
Date: 2016-06-01 03:21:17
Also in:
linux-scsi
On Tue, May 31, 2016 at 1:38 AM, Wei Fang [off-list ref] wrote:
sas_ata_strategy_handler() queues the ata error handler work items
on system_unbound_wq. This workqueue runs work items asynchronously,
so the ata error handler can run concurrently on different
CPUs. In this case, ->host_failed can be decremented simultaneously in
scsi_eh_finish_cmd() on different CPUs and end up with a wrong value.
This leads to a permanent mismatch between ->host_failed and
->host_busy, so the scsi error handler thread never runs again.
IO errors after that point are never handled.
Use atomic type for ->host_failed to fix this race.
This fixes the problem introduced in
commit 50824d6c5657 ("[SCSI] libsas: async ata-eh").
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Wei Fang <redacted>

Acked-by: Dan Williams <redacted>
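
For readers following the thread, the race described in the commit message can be illustrated with a minimal userspace sketch. It is not the patch itself: it uses C11 atomics and pthreads as stand-ins for the kernel's atomic type and the workqueue, and the names host_failed, eh_worker() and run_concurrent_decrements() are hypothetical. Several threads decrement one shared counter, as the concurrent ata error handlers do with ->host_failed, and the atomic read-modify-write ensures no decrement is lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_WORKERS 16

/* Hypothetical stand-in for ->host_failed (userspace, not kernel code). */
static atomic_int host_failed;

/* Hypothetical stand-in for the decrement done in scsi_eh_finish_cmd(). */
static void *eh_worker(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++)
        atomic_fetch_sub(&host_failed, 1); /* atomic read-modify-write */
    return NULL;
}

/*
 * Start nworkers threads that each decrement the counter per_worker
 * times. Returns the final counter value (0 means no decrement was
 * lost), or -1 if a thread could not be created.
 */
static int run_concurrent_decrements(int nworkers, int per_worker)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    int failed = 0;

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    atomic_store(&host_failed, nworkers * per_worker);

    for (int i = 0; i < nworkers; i++) {
        if (pthread_create(&tids[i], NULL, eh_worker, &per_worker) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    /* Join every thread that was started before reading the result. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return failed ? -1 : atomic_load(&host_failed);
}
```

Calling run_concurrent_decrements(4, 100000) from a main() should return 0. With a plain int and host_failed-- in the worker, the load and store of separate threads can interleave, so some decrements may be lost and the counter can stay above zero, which corresponds to the mismatch with ->host_busy described above.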