Re: [PATCH 1/4] add vector PMD RX for FVL
From: Bruce Richardson <hidden>
Date: 2015-09-29 14:27:50
On Mon, Sep 28, 2015 at 01:05:24AM +0800, Zhe Tao wrote:
The vPMD RX function uses the multi-buffer and SSE instructions to
accelerate the RX speed, but now the pktype cannot be supported by the
vPMD RX, because it will decrease the performance heavily.

Signed-off-by: Zhe Tao <redacted>
---
 config/common_bsdapp | 2 +
 config/common_linuxapp | 2 +
 drivers/net/i40e/Makefile | 1 +
 drivers/net/i40e/base/i40e_type.h | 3 +
 drivers/net/i40e/i40e_rxtx.c | 28 ++-
 drivers/net/i40e/i40e_rxtx.h | 20 +-
 drivers/net/i40e/i40e_rxtx_vec.c | 484 ++++++++++++++++++++++++++++++++++++++
 7 files changed, 535 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/i40e/i40e_rxtx_vec.c
<snip>
+
+ /* vPMD receive routine, now only accept (nb_pkts == RTE_I40E_VPMD_RX_BURST)
+ * in one loop
+ *
+ * Notice:
+ * - nb_pkts < RTE_I40E_VPMD_RX_BURST, just return no packet
I don't think this comment matches the implementation below. I think you are allowed to request bursts as small as RTE_I40E_DESCS_PER_LOOP.
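To make the mismatch concrete, here is a minimal sketch of what the two nb_pkts adjustments further down in the patch do to a requested burst size. MIN and ALIGN_FLOOR are local stand-ins for RTE_MIN and RTE_ALIGN_FLOOR, and the values 32 and 4 used for RTE_I40E_MAX_RX_BURST and RTE_I40E_DESCS_PER_LOOP are assumptions for illustration, since the quoted hunk does not show their definitions. adjust_burst is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>

/* Assumed values, for illustration only; not taken from the quoted hunk. */
#define ASSUMED_MAX_RX_BURST   32
#define ASSUMED_DESCS_PER_LOOP 4

/* Local stand-ins for RTE_MIN and RTE_ALIGN_FLOOR.
 * ALIGN_FLOOR requires a power-of-two alignment. */
#define MIN(a, b)         ((a) < (b) ? (a) : (b))
#define ALIGN_FLOOR(v, a) ((v) & ~((a) - 1))

/* Hypothetical helper mirroring the two nb_pkts adjustments in the patch:
 * clamp to the maximum burst, then round down to a multiple of the
 * number of descriptors handled per loop iteration. */
static uint16_t
adjust_burst(uint16_t nb_pkts)
{
	nb_pkts = (uint16_t)MIN(nb_pkts, ASSUMED_MAX_RX_BURST);
	nb_pkts = (uint16_t)ALIGN_FLOOR(nb_pkts, ASSUMED_DESCS_PER_LOOP);
	return nb_pkts;
}
```

Under those assumed values, a request for 4 packets passes through unchanged, a request for 7 is rounded down to 4, and only a request smaller than 4 is rounded down to zero. That is why the "nb_pkts < RTE_I40E_VPMD_RX_BURST, just return no packet" note looks out of date.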
+ * - nb_pkts > RTE_I40E_VPMD_RX_BURST, only scan RTE_I40E_VPMD_RX_BURST
+ * numbers of DD bits
+
+ */
+static inline uint16_t
+_recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts, uint8_t *split_packet)
+{
+ volatile union i40e_rx_desc *rxdp;
+ struct i40e_rx_entry *sw_ring;
+ uint16_t nb_pkts_recd;
+ int pos;
+ uint64_t var;
+ __m128i shuf_msk;
+
+ __m128i crc_adjust = _mm_set_epi16(
+ 0, 0, 0, /* ignore non-length fields */
+ -rxq->crc_len, /* sub crc on data_len */
+ 0, /* ignore high-16bits of pkt_len */
+ -rxq->crc_len, /* sub crc on pkt_len */
+ 0, 0 /* ignore pkt_type field */
+ );
+ __m128i dd_check, eop_check;
+
+ /* nb_pkts shall be less equal than RTE_I40E_MAX_RX_BURST */
+ nb_pkts = RTE_MIN(nb_pkts, RTE_I40E_MAX_RX_BURST);
+
+ /* nb_pkts has to be floor-aligned to RTE_I40E_DESCS_PER_LOOP */
+ nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_I40E_DESCS_PER_LOOP);

/Bruce