Thread (113 messages) 113 messages, 8 authors, 2021-11-04

[dpdk-dev] 回复: [RFC PATCH v3 1/5] eal: add new definitions for wait scheme

From: Feifei Wang <hidden>
Date: 2021-10-14 03:08:25

-----邮件原件-----
发件人: Ananyev, Konstantin [off-list ref]
发送时间: Wednesday, October 13, 2021 11:04 PM
收件人: Feifei Wang [off-list ref]; Ruifeng Wang
[off-list ref]
抄送: dev@dpdk.org; nd [off-list ref]; nd [off-list ref]
主题: RE: [dpdk-dev] [RFC PATCH v3 1/5] eal: add new definitions for wait
scheme
quoted
[snip]
quoted
quoted
diff --git a/lib/eal/include/generic/rte_pause.h
b/lib/eal/include/generic/rte_pause.h
index 668ee4a184..4e32107eca 100644
--- a/lib/eal/include/generic/rte_pause.h
+++ b/lib/eal/include/generic/rte_pause.h
@@ -111,6 +111,84 @@ rte_wait_until_equal_64(volatile uint64_t
*addr,
uint64_t expected,
quoted
 	while (__atomic_load_n(addr, memorder) != expected)
 		rte_pause();
 }
+
+/*
+ * Wait until a 16-bit *addr breaks the condition, with a relaxed
+memory
+ * ordering model meaning the loads around this API can be reordered.
+ *
+ * @param addr
+ *  A pointer to the memory location.
+ * @param mask
+ *  A mask of value bits in interest
+ * @param expected
+ *  A 16-bit expected value to be in the memory location.
+ * @param cond
+ *  A symbol representing the condition (==, !=).
+ * @param memorder
+ *  Two different memory orders that can be specified:
+ *  __ATOMIC_ACQUIRE and __ATOMIC_RELAXED. These map to
+ *  C++11 memory orders with the same names, see the C++11
+standard or
+ *  the GCC wiki on atomic synchronization for detailed definition.
+ */
Hmm, so now we have 2 APIs doing similar thing:
rte_wait_until_equal_n() and rte_wait_event_n().
Can we probably unite them somehow?
At least make rte_wait_until_equal_n() to use rte_wait_event_n()
underneath.
quoted
quoted
You are right. We plan to change rte_wait_until_equal API after this
new scheme is achieved.  And then, we will merge wait_unil into
wait_event definition in the next new patch series.
quoted
quoted
+#define rte_wait_event_16(addr, mask, expected, cond, memorder)
		       \
quoted
+do {
	       \
quoted
quoted
quoted
+	assert(memorder == __ATOMIC_ACQUIRE || memorder ==
+__ATOMIC_RELAXED);  \
And why user is not allowed to use __ATOMIC_SEQ_CST here?
Actually this is just a load operation, and acquire here is enough to
make sure 'load addr value' can be before other operations.
quoted
BTW, if we expect memorder to always be a constant, might be better
BUILD_BUG_ON()?
If I understand correctly, you means we can replace 'assert' by
'build_bug_on':
quoted
RTE_BUILD_BUG_ON(memorder != __ATOMIC_ACQUIRE && memorder
!=__ATOMIC_RELAXED);
Yes, that was my thought.
In that case I think we should be able to catch wrong memorder at compilation
stage.
quoted
quoted
quoted
+									       \
+	while ((__atomic_load_n(addr, memorder) & mask) cond expected)
	       \
quoted
+		rte_pause();						       \
+} while (0)
Two thoughts with these macros:
1. It is a goof practise to put () around macro parameters in the macro
body.
quoted
quoted
Will save from a lot of unexpected troubles.
2. I think these 3 macros can be united into one.
Something like:

#define rte_wait_event(addr, mask, expected, cond, memorder) do {\
        typeof (*(addr)) val = __atomic_load_n((addr), (memorder)); \
        if ((val & (typeof(val))(mask)) cond (typeof(val))(expected)) \
                break; \
        rte_pause(); \
} while (1);
For this point, I think it is due to different size need to use
different assembly instructions in arm architecture. For example, load
16 bits instruction is "ldxrh %w[tmp], [%x[addr]"
load 32 bits instruction is " ldxr %w[tmp], [%x[addr]"
load 64 bits instruction is " ldxr %x[tmp], [%x[addr] "
Ok, but it could be then something like that for arm specific code:
if (sizeof(val) == sizeof(uint16_t)) \
	__LOAD_EXC_16(...); \
else if (sizeof(val) == sizeof(uint32_t)) \
	__LOAD_EXC_32(...); \
else if (sizeof(val) == sizeof(uint64_t)) \
	__LOAD_EXC_64(...); \
...
I thinks we should use "addr" as judgement:

rte_wait_event(addr, mask, expected, cond, memorder)
if (sizeof(*addr)) == sizeof(uint16_t) 
	uint16_t value                                                         \
	assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);  \
	__LOAD_EXC_16(addr, value, memorder)				       \
	if ((value & mask) cond expected) {				       \
		__SEVL()						       \
		do {							       \
			__WFE()						       \
			__LOAD_EXC_16(addr, value, memorder)		       \
		} while ((value & mask) cond expected);			       \
	}	
if (sizeof(*addr)) == sizeof(uint32_t) 
	..........
if (sizeof(*addr)) == sizeof(uint64_t) 
	...........
quoted
And for consistency, we also use 3 APIs in generic path.
Honestly, even one multi-line macro doesn't look nice.
Having 3 identical ones looks even worse.
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help