[dpdk-dev] 回复: [RFC PATCH v3 1/5] eal: add new definitions for wait scheme
From: Feifei Wang <hidden>
Date: 2021-10-14 03:08:25
-----邮件原件----- 发件人: Ananyev, Konstantin [off-list ref] 发送时间: Wednesday, October 13, 2021 11:04 PM 收件人: Feifei Wang [off-list ref]; Ruifeng Wang [off-list ref] 抄送: dev@dpdk.org; nd [off-list ref]; nd [off-list ref] 主题: RE: [dpdk-dev] [RFC PATCH v3 1/5] eal: add new definitions for wait schemequoted
[snip]quoted
quoted
diff --git a/lib/eal/include/generic/rte_pause.hb/lib/eal/include/generic/rte_pause.h index 668ee4a184..4e32107eca 100644--- a/lib/eal/include/generic/rte_pause.h +++ b/lib/eal/include/generic/rte_pause.h@@ -111,6 +111,84 @@ rte_wait_until_equal_64(volatile uint64_t*addr,uint64_t expected,quoted
while (__atomic_load_n(addr, memorder) != expected) rte_pause(); } + +/* + * Wait until a 16-bit *addr breaks the condition, with a relaxed +memory + * ordering model meaning the loads around this API can be reordered. + * + * @param addr + * A pointer to the memory location. + * @param mask + * A mask of value bits in interest + * @param expected + * A 16-bit expected value to be in the memory location. + * @param cond + * A symbol representing the condition (==, !=). + * @param memorder + * Two different memory orders that can be specified: + * __ATOMIC_ACQUIRE and __ATOMIC_RELAXED. These map to + * C++11 memory orders with the same names, see the C++11 +standard or + * the GCC wiki on atomic synchronization for detailed definition. + */Hmm, so now we have 2 APIs doing similar thing: rte_wait_until_equal_n() and rte_wait_event_n(). Can we probably unite them somehow? At least make rte_wait_until_equal_n() to use rte_wait_event_n()underneath.quoted
quoted
You are right. We plan to change rte_wait_until_equal API after this new scheme is achieved. And then, we will merge wait_unil into wait_event definition in the next new patch series.quoted
quoted
+#define rte_wait_event_16(addr, mask, expected, cond, memorder)\quoted
+do {\quoted
quoted
quoted
+ assert(memorder == __ATOMIC_ACQUIRE || memorder == +__ATOMIC_RELAXED); \And why user is not allowed to use __ATOMIC_SEQ_CST here?Actually this is just a load operation, and acquire here is enough to make sure 'load addr value' can be before other operations.quoted
BTW, if we expect memorder to always be a constant, might be better BUILD_BUG_ON()?If I understand correctly, you means we can replace 'assert' by'build_bug_on':quoted
RTE_BUILD_BUG_ON(memorder != __ATOMIC_ACQUIRE && memorder !=__ATOMIC_RELAXED);Yes, that was my thought. In that case I think we should be able to catch wrong memorder at compilation stage.quoted
quoted
quoted
+ \ + while ((__atomic_load_n(addr, memorder) & mask) cond expected)\quoted
+ rte_pause(); \ +} while (0)Two thoughts with these macros: 1. It is a goof practise to put () around macro parameters in the macrobody.quoted
quoted
Will save from a lot of unexpected troubles. 2. I think these 3 macros can be united into one. Something like: #define rte_wait_event(addr, mask, expected, cond, memorder) do {\ typeof (*(addr)) val = __atomic_load_n((addr), (memorder)); \ if ((val & (typeof(val))(mask)) cond (typeof(val))(expected)) \ break; \ rte_pause(); \ } while (1);For this point, I think it is due to different size need to use different assembly instructions in arm architecture. For example, load 16 bits instruction is "ldxrh %w[tmp], [%x[addr]" load 32 bits instruction is " ldxr %w[tmp], [%x[addr]" load 64 bits instruction is " ldxr %x[tmp], [%x[addr] "Ok, but it could be then something like that for arm specific code: if (sizeof(val) == sizeof(uint16_t)) \ __LOAD_EXC_16(...); \ else if (sizeof(val) == sizeof(uint32_t)) \ __LOAD_EXC_32(...); \ else if (sizeof(val) == sizeof(uint64_t)) \ __LOAD_EXC_64(...); \ ...
I thinks we should use "addr" as judgement:
rte_wait_event(addr, mask, expected, cond, memorder)
if (sizeof(*addr)) == sizeof(uint16_t)
uint16_t value \
assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED); \
__LOAD_EXC_16(addr, value, memorder) \
if ((value & mask) cond expected) { \
__SEVL() \
do { \
__WFE() \
__LOAD_EXC_16(addr, value, memorder) \
} while ((value & mask) cond expected); \
}
if (sizeof(*addr)) == sizeof(uint32_t)
..........
if (sizeof(*addr)) == sizeof(uint64_t)
...........
quoted
And for consistency, we also use 3 APIs in generic path.Honestly, even one multi-line macro doesn't look nice. Having 3 identical ones looks even worse.