Re: [PATCH v3 2/5] ring: add a non-blocking implementation
From: Jerin Jacob Kollanukkaran <hidden>
Date: 2019-01-28 13:34:33
On Fri, 2019-01-25 at 17:21 +0000, Eads, Gage wrote:
-----Original Message-----
From: Ola Liljedahl [mailto:Ola.Liljedahl@arm.com]
Sent: Wednesday, January 23, 2019 4:16 AM
To: Eads, Gage <redacted>; dev@dpdk.org
Cc: olivier.matz@6wind.com; stephen@networkplumber.org; nd [off-list ref]; Richardson, Bruce [off-list ref]; arybchenko@solarflare.com; Ananyev, Konstantin [off-list ref]
Subject: Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation
You can tell this code was written when I thought x86-64 was the only viable target :). Yes, you are correct.

With regards to using __atomic intrinsics, I'm planning on taking a similar approach to the functions duplicated in rte_ring_generic.h and rte_ring_c11_mem.h: one version that uses rte_atomic functions (and thus stricter memory ordering) and one that uses __atomic intrinsics (and thus can benefit from more relaxed memory ordering).

What's the advantage of having two different implementations? What is the disadvantage? The existing ring buffer code originally had only the "legacy" implementation which was kept when the __atomic implementation was added. The reason claimed was that some older compilers for x86 do not support GCC __atomic builtins. But I thought there was consensus that new functionality could have only __atomic implementations.

When CONFIG_RTE_RING_USE_C11_MEM_MODEL was introduced, it was left disabled for thunderx[1] for performance reasons. Assuming that hasn't changed, the advantage to having two versions is to best support all of DPDK's platforms. The disadvantage is of course duplicated code and the additional maintenance burden. That said, if the thunderx maintainers are ok with it, I'm certainly
The ring code is such a fundamental building block for DPDK, there was a difference in performance, and there was already legacy code, so introducing C11_MEM_MODEL was justified IMO. For the nonblocking implementation, I am happy to test with three ARM64 microarchitectures and share the results for C11_MEM_MODEL vs. non-C11_MEM_MODEL performance. We may need to consider PPC here as well. So IMO we can decide the direction of the new code based on the overall performance results.
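
To make the comparison concrete, here is a minimal sketch of the two styles in question. It is not the DPDK ring code: the `demo_*` names and the single-slot structure are hypothetical, and the sequentially consistent fence is only a stand-in for the stricter barrier-based style, since the real rte_atomic/rte_smp barriers map to different instructions on each architecture. Only GCC `__atomic` builtins are used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-element mailbox, for illustration only (not DPDK code). */
struct demo_slot {
	uint32_t data;
	uint32_t ready;
};

/* Stricter style: a full fence between the payload write and the flag.
 * This is an assumed stand-in for the explicit-barrier approach. */
static inline void demo_publish_strict(struct demo_slot *s, uint32_t v)
{
	s->data = v;
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	__atomic_store_n(&s->ready, 1, __ATOMIC_RELAXED);
}

/* Relaxed style: a release store orders only the earlier payload write. */
static inline void demo_publish_release(struct demo_slot *s, uint32_t v)
{
	s->data = v;
	__atomic_store_n(&s->ready, 1, __ATOMIC_RELEASE);
}

/* Consumer: the acquire load pairs with either publish variant.
 * Returns 1 and fills *out if a value was published, 0 otherwise. */
static inline int demo_consume(struct demo_slot *s, uint32_t *out)
{
	if (!__atomic_load_n(&s->ready, __ATOMIC_ACQUIRE))
		return 0;
	*out = s->data;
	return 1;
}

/* Single-threaded functional check; it says nothing about performance. */
static inline void demo_self_check(void)
{
	struct demo_slot a = { 0, 0 }, b = { 0, 0 };
	uint32_t v = 0;

	assert(demo_consume(&a, &v) == 0);
	demo_publish_strict(&a, 7);
	demo_publish_release(&b, 9);
	assert(demo_consume(&a, &v) == 1 && v == 7);
	assert(demo_consume(&b, &v) == 1 && v == 9);
}
```

Both variants behave identically from a functional point of view; the difference is in the barriers the compiler emits on weakly ordered targets, which is why the choice has to be settled by measurements on each microarchitecture.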