Thread: 7 messages, 5 authors, 2026-02-24

Re: [PATCH net-next] net: ethtool: add COALESCE_RX_CQE_FRAMES/NSECS parameters

From: Tariq Toukan <hidden>
Date: 2026-02-24 12:01:42
Also in: linux-doc, linux-hyperv, lkml


On 22/02/2026 23:23, Haiyang Zhang wrote:
From: Haiyang Zhang <haiyangz@microsoft.com>

Add two parameters for drivers supporting Rx CQE Coalescing.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Time out value in nanoseconds after the first packet arrival in a
coalesced CQE to be sent.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
  Documentation/netlink/specs/ethtool.yaml       |  8 ++++++++
  Documentation/networking/ethtool-netlink.rst   | 10 ++++++++++
  include/linux/ethtool.h                        |  6 +++++-
  include/uapi/linux/ethtool_netlink_generated.h |  2 ++
  net/ethtool/coalesce.c                         | 14 +++++++++++++-
  5 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
index 0a2d2343f79a..951d98f6bb12 100644
--- a/Documentation/netlink/specs/ethtool.yaml
+++ b/Documentation/netlink/specs/ethtool.yaml
@@ -861,6 +861,12 @@ attribute-sets:
          name: tx-profile
          type: nest
          nested-attributes: profile
+      -
+        name: rx-cqe-frames
+        type: u32
+      -
+        name: rx-cqe-nsecs
+        type: u32
  
    -
      name: pause-stat
@@ -2244,6 +2250,8 @@ operations:
              - tx-aggr-time-usecs
              - rx-profile
              - tx-profile
+            - rx-cqe-frames
+            - rx-cqe-nsecs
        dump: *coalesce-get-op
      -
        name: coalesce-set
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index af56c304cef4..a3e78b69fd07 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -1072,6 +1072,8 @@ Kernel response contents:
    ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS``    u32     time (us), aggr, Tx
    ``ETHTOOL_A_COALESCE_RX_PROFILE``            nested  profile of DIM, Rx
    ``ETHTOOL_A_COALESCE_TX_PROFILE``            nested  profile of DIM, Tx
+  ``ETHTOOL_A_COALESCE_RX_CQE_FRAMES``         u32     max packets, Rx CQE
+  ``ETHTOOL_A_COALESCE_RX_CQE_NSECS``          u32     delay (ns), Rx CQE
    ===========================================  ======  =======================
  
  Attributes are only included in reply if their value is not zero or the
@@ -1105,6 +1107,12 @@ well with frequent small-sized URBs transmissions.
  to DIM parameters, see `Generic Network Dynamic Interrupt Moderation (Net DIM)
  <https://www.kernel.org/doc/Documentation/networking/net_dim.rst>`_.
  
+Rx CQE coalescing allows multiple received packets to be coalesced into a single
+Completion Queue Entry (CQE). ``ETHTOOL_A_COALESCE_RX_CQE_FRAMES`` describes the
+maximum number of frames that can be coalesced into a CQE.
+``ETHTOOL_A_COALESCE_RX_CQE_NSECS`` describes max time in nanoseconds after the
+first packet arrival in a coalesced CQE to be sent.
+
I am trying to understand how generic this feature/API is.
Can you please elaborate on the feature you want to configure here?

A single CQE to describe several packets?
What is the price? What per-packet information/hw offloads do you lose 
in the process?

For comparison, in mlx5 we have RX CQE compression, which can be applied 
on multiple near-identical completions that share/match several fields. 
Still, there is a per-packet mini-cqe with distinctive per-packet fields 
like csum.
  COALESCE_SET
  ============
  
@@ -1143,6 +1151,8 @@ Request contents:
    ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS``    u32     time (us), aggr, Tx
    ``ETHTOOL_A_COALESCE_RX_PROFILE``            nested  profile of DIM, Rx
    ``ETHTOOL_A_COALESCE_TX_PROFILE``            nested  profile of DIM, Tx
+  ``ETHTOOL_A_COALESCE_RX_CQE_FRAMES``         u32     max packets, Rx CQE
+  ``ETHTOOL_A_COALESCE_RX_CQE_NSECS``          u32     delay (ns), Rx CQE
    ===========================================  ======  =======================
  
  Request is rejected if it attributes declared as unsupported by driver (i.e.
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 798abec67a1b..25ccd2d5d4dc 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -332,6 +332,8 @@ struct kernel_ethtool_coalesce {
  	u32 tx_aggr_max_bytes;
  	u32 tx_aggr_max_frames;
  	u32 tx_aggr_time_usecs;
+	u32 rx_cqe_frames;
+	u32 rx_cqe_nsecs;
  };
  
  /**
@@ -380,7 +382,9 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
  #define ETHTOOL_COALESCE_TX_AGGR_TIME_USECS	BIT(26)
  #define ETHTOOL_COALESCE_RX_PROFILE		BIT(27)
  #define ETHTOOL_COALESCE_TX_PROFILE		BIT(28)
-#define ETHTOOL_COALESCE_ALL_PARAMS		GENMASK(28, 0)
+#define ETHTOOL_COALESCE_RX_CQE_FRAMES		BIT(29)
+#define ETHTOOL_COALESCE_RX_CQE_NSECS		BIT(30)
+#define ETHTOOL_COALESCE_ALL_PARAMS		GENMASK(30, 0)
  
  #define ETHTOOL_COALESCE_USECS						\
  	(ETHTOOL_COALESCE_RX_USECS | ETHTOOL_COALESCE_TX_USECS)
diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
index 556a0c834df5..efc6e4ade77b 100644
--- a/include/uapi/linux/ethtool_netlink_generated.h
+++ b/include/uapi/linux/ethtool_netlink_generated.h
@@ -371,6 +371,8 @@ enum {
  	ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS,
  	ETHTOOL_A_COALESCE_RX_PROFILE,
  	ETHTOOL_A_COALESCE_TX_PROFILE,
+	ETHTOOL_A_COALESCE_RX_CQE_FRAMES,
+	ETHTOOL_A_COALESCE_RX_CQE_NSECS,
  
  	__ETHTOOL_A_COALESCE_CNT,
  	ETHTOOL_A_COALESCE_MAX = (__ETHTOOL_A_COALESCE_CNT - 1)
diff --git a/net/ethtool/coalesce.c b/net/ethtool/coalesce.c
index 3e18ca1ccc5e..349bb02c517a 100644
--- a/net/ethtool/coalesce.c
+++ b/net/ethtool/coalesce.c
@@ -118,6 +118,8 @@ static int coalesce_reply_size(const struct ethnl_req_info *req_base,
  	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_MAX_BYTES */
  	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_MAX_FRAMES */
  	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_TIME_USECS */
+	       nla_total_size(sizeof(u32)) +	/* _RX_CQE_FRAMES */
+	       nla_total_size(sizeof(u32)) +	/* _RX_CQE_NSECS */
  	       total_modersz * 2;		/* _{R,T}X_PROFILE */
  }
  
@@ -269,7 +271,11 @@ static int coalesce_fill_reply(struct sk_buff *skb,
  	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES,
  			     kcoal->tx_aggr_max_frames, supported) ||
  	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS,
-			     kcoal->tx_aggr_time_usecs, supported))
+			     kcoal->tx_aggr_time_usecs, supported) ||
+	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_CQE_FRAMES,
+			     kcoal->rx_cqe_frames, supported) ||
+	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_CQE_NSECS,
+			     kcoal->rx_cqe_nsecs, supported))
  		return -EMSGSIZE;
  
  	if (!req_base->dev || !req_base->dev->irq_moder)
@@ -338,6 +344,8 @@ const struct nla_policy ethnl_coalesce_set_policy[] = {
  	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES] = { .type = NLA_U32 },
  	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES] = { .type = NLA_U32 },
  	[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS] = { .type = NLA_U32 },
+	[ETHTOOL_A_COALESCE_RX_CQE_FRAMES] = { .type = NLA_U32 },
+	[ETHTOOL_A_COALESCE_RX_CQE_NSECS] = { .type = NLA_U32 },
  	[ETHTOOL_A_COALESCE_RX_PROFILE] =
  		NLA_POLICY_NESTED(coalesce_profile_policy),
  	[ETHTOOL_A_COALESCE_TX_PROFILE] =
@@ -570,6 +578,10 @@ __ethnl_set_coalesce(struct ethnl_req_info *req_info, struct genl_info *info,
  			 tb[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES], &mod);
  	ethnl_update_u32(&kernel_coalesce.tx_aggr_time_usecs,
  			 tb[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS], &mod);
+	ethnl_update_u32(&kernel_coalesce.rx_cqe_frames,
+			 tb[ETHTOOL_A_COALESCE_RX_CQE_FRAMES], &mod);
+	ethnl_update_u32(&kernel_coalesce.rx_cqe_nsecs,
+			 tb[ETHTOOL_A_COALESCE_RX_CQE_NSECS], &mod);
  
  	if (dev->irq_moder && dev->irq_moder->profile_flags & DIM_PROFILE_RX) {
  		ret = ethnl_update_profile(dev, &dev->irq_moder->rx_profile,
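
For readers following the thread, the semantics given in the commit message
(a CQE is completed once it holds rx-cqe-frames frames, or once rx-cqe-nsecs
have elapsed since its first frame arrived) can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
kernel or driver code: the struct and function names below are hypothetical,
the timeout is checked lazily on the next arrival rather than by a timer, and
treating a frame limit of 0 or 1 as "no coalescing" is an assumption the patch
does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the two new kernel_ethtool_coalesce fields. */
struct cqe_coalesce_cfg {
	uint32_t rx_cqe_frames; /* max frames coalesced into one CQE */
	uint32_t rx_cqe_nsecs;  /* timeout (ns) after the first frame arrives */
};

/* Hypothetical per-queue state for the CQE currently being filled. */
struct cqe_state {
	uint32_t frames;   /* frames accumulated in the open CQE */
	uint64_t first_ns; /* arrival time of the first frame in the open CQE */
};

/* True if a CQE is open and its timeout has expired at now_ns. */
static bool cqe_timed_out(const struct cqe_coalesce_cfg *cfg,
			  const struct cqe_state *st, uint64_t now_ns)
{
	assert(cfg && st);
	return st->frames > 0 && now_ns - st->first_ns >= cfg->rx_cqe_nsecs;
}

/*
 * Account for one frame arriving at now_ns.
 * Returns how many CQEs this arrival completes (0, 1 or 2):
 * one if the previously open CQE had already timed out, and
 * one more if the new frame fills a CQE up to the frame limit.
 */
static int cqe_rx_frame(const struct cqe_coalesce_cfg *cfg,
			struct cqe_state *st, uint64_t now_ns)
{
	int completed = 0;

	/* Flush a stale CQE before starting a new one. */
	if (cqe_timed_out(cfg, st, now_ns)) {
		st->frames = 0;
		completed++;
	}

	/* The first frame of a CQE starts the timeout window. */
	if (st->frames == 0)
		st->first_ns = now_ns;
	st->frames++;

	/* Frame limit reached: complete the CQE immediately. */
	if (st->frames >= cfg->rx_cqe_frames) {
		st->frames = 0;
		completed++;
	}

	return completed;
}
```

With a limit of 4 frames and a 1000 ns timeout, frames arriving at 0, 100, 200
and 300 ns complete one CQE on the fourth arrival; a lone frame at 400 ns is
then flushed by the timeout once the next frame shows up at 1500 ns. The model
says nothing about what per-packet metadata survives coalescing, which is the
question raised above.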
  