RE: [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization
From: "Nikolova, Tatyana E" <tatyana.e.nikolova@intel.com>
Date: 2021-08-19 22:01:54
-----Original Message-----
From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Wednesday, August 18, 2021 11:50 AM
To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
Cc: dledford@redhat.com; leon@kernel.org; linux-rdma@vger.kernel.org
Subject: Re: [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization

> On Fri, Aug 13, 2021 at 05:25:49PM -0500, Tatyana Nikolova wrote:
> > > > 1. Software writing the valid bit in the WQE.
> > > > 2. Software reading shadow memory (hw_tail) value.
> > >
> > > You are missing an ordered atomic on this read it looks like
> >
> > Hi Jason,
> >
> > Why do you think we need atomic ops in this case? We aren't trying
> > to protect from multiple threads but CPU re-ordering of a write and
> > a read.
>
> Which is what the atomics will do. Barriers are only appropriate when
> you can't add atomic markers to the actual data that needs ordering.

Hi Jason,
We aren't sure what you mean by atomic markers. We ran a few experiments with atomics, but none of the primitives we tried (smp_mb__{before,after}_atomic(), smp_load_acquire() and smp_store_release()) translates to a full memory barrier on x86.
Could you give us an example?
Thank you,
Tatyana
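
For reference, the ordering under discussion is a store (the WQE valid bit) followed by a load from a different location (hw_tail). Release/acquire semantics do not order an earlier store against a later load, which is consistent with the observation above that those primitives do not produce a full barrier on x86. The sketch below shows two ways to express store-then-load ordering with C11 atomics. It is only one possible reading of "atomic markers", not necessarily what Jason has in mind. The struct, field, and function names are hypothetical and are not taken from the irdma provider, and the C11 memory model formally covers CPU threads only, so whether this is sufficient against a device writing hw_tail is an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue state; not the irdma layout. */
struct sketch_qp {
	_Atomic uint64_t wqe_hdr;  /* WQE word that carries the valid bit */
	_Atomic uint32_t hw_tail;  /* shadow memory value updated by hardware */
};

/*
 * Variant A: relaxed accesses ordered by an explicit full fence.
 * The seq_cst fence keeps the store from being reordered after the load.
 */
static uint32_t post_with_fence(struct sketch_qp *qp, uint64_t hdr)
{
	atomic_store_explicit(&qp->wqe_hdr, hdr, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&qp->hw_tail, memory_order_relaxed);
}

/*
 * Variant B: the ordering is attached to the data accesses themselves.
 * A seq_cst store followed by a seq_cst load stays in program order;
 * release/acquire alone would not give store-to-load ordering.
 */
static uint32_t post_with_seq_cst(struct sketch_qp *qp, uint64_t hdr)
{
	atomic_store_explicit(&qp->wqe_hdr, hdr, memory_order_seq_cst);
	return atomic_load_explicit(&qp->hw_tail, memory_order_seq_cst);
}
```

On x86, compilers typically implement a seq_cst store with a locked instruction or a store followed by mfence, which is where the full barrier would come from in Variant B.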