RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2011-05-18 21:30:18
Also in:
linux-scsi
On Wed, 2011-05-18 at 09:35 -0600, Moore, Eric wrote:
I worked the original defect a couple of months ago, and Kashyap is now getting around to posting my patches. The original defect has nothing to do with PPC64. The original problem was only on x86. It only became a problem on PPC64 when I tried to fix the original x86 issue by copying the writeq code from the Linux headers, which then broke PPC64. I doubt that broken patch was ever posted. Anyway, back to the original defect. The reason it became a problem on x86 is that the kernel headers had an implementation of writeq in the arch/x86 headers, which means our internal implementation of writeq was not being used. The writeq implementation in the kernel is totally wrong for arch/x86 because it does not have spin locks: if two processors are simultaneously doing two separate 32-bit PCI writes each, what the controller firmware receives is out of order. This change occurred between Red Hat RHEL5 and RHEL6. In RHEL5, writeq was not implemented in the arch/x86 headers, and our driver's internal implementation of writeq was used.
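To make the ordering problem described above concrete, here is a minimal userspace sketch, not the driver's actual code. It assumes a 64-bit register modelled as two 32-bit halves (`struct fake_reg64` and `locked_write64` are hypothetical names) and uses a pthread mutex as a stand-in for the spin lock a kernel driver would take. Holding the lock across both halves keeps another CPU's low/high pair from landing in between.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's 64-bit register, seen as two 32-bit halves. */
struct fake_reg64 {
    volatile uint32_t low;
    volatile uint32_t high;
};

/* Stand-in for the spin lock a driver would hold around the two 32-bit writes. */
static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Write a 64-bit value as two 32-bit stores. Without the lock, two writers
 * could interleave as low(A), low(B), high(A), high(B), so the receiver
 * would see halves from different requests paired together.
 */
static void locked_write64(struct fake_reg64 *reg, uint64_t value)
{
    pthread_mutex_lock(&reg_lock);
    reg->low = (uint32_t)value;          /* lower 32 bits first */
    reg->high = (uint32_t)(value >> 32); /* then upper 32 bits */
    pthread_mutex_unlock(&reg_lock);
}
```

In a real driver the two stores would be MMIO accessors and the lock would typically be taken with interrupts disabled; both details are left out of this model.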
You may also want to look at Milton's comments; it looks like the way you do init_completion followed immediately by wait_for_completion is racy. You should init the completion before you do the IO that will eventually trigger complete() to be called.

Cheers,
Ben.
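The ordering point can be sketched in userspace as follows. This is a model under stated assumptions, not kernel code: `struct completion` and its three helpers are re-implemented here with a pthread mutex and condition variable, and `fake_request`, `fake_io` and `submit_and_wait` are hypothetical names. The completion is initialised before the IO is started, so a `complete()` that fires early is recorded rather than wiped out by a later init.

```c
#include <pthread.h>

/* Userspace model of a completion: a counter guarded by a mutex and condvar. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned int done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = 0;
}

static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;                          /* remembered even if nobody waits yet */
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->cond, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical request whose "IO" is performed by another thread. */
struct fake_request {
    struct completion done;
    int status;
};

static void *fake_io(void *arg)
{
    struct fake_request *req = arg;

    req->status = 1;        /* pretend the IO succeeded */
    complete(&req->done);   /* may run before the submitter starts waiting */
    return NULL;
}

static int submit_and_wait(struct fake_request *req)
{
    pthread_t io;

    req->status = 0;
    init_completion(&req->done);                      /* 1. arm the completion */
    if (pthread_create(&io, NULL, fake_io, req) != 0) /* 2. start the IO */
        return -1;
    wait_for_completion(&req->done);                  /* 3. only now wait */
    pthread_join(io, NULL);
    return req->status;
}
```

If steps 1 and 2 were swapped, `fake_io` could call `complete()` first and the later `init_completion()` would reset `done` to zero, leaving the waiter blocked forever.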