Thread (28 messages) 28 messages, 2 authors, 13d ago

Re: [PATCH v19 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements

From: Ratheesh Kannoth <hidden>
Date: 2026-06-08 02:27:01
Also in: lkml

On 2026-06-05 at 12:02:38, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
Improve MCAM visibility and field debugging for CN20K NPC.

- Extend "mcam_layout" to show enabled (+) or disabled state per entry
  so status can be verified without parsing the full "mcam_entry" dump.
- Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
  since the prior read by comparing hardware counters to a per-entry
  software baseline and advancing that baseline after each read (hardware
  counters are not cleared).
- Add "mismatch" debugfs entry: lists MCAM entries that are enabled
  but not explicitly allocated, helping diagnose allocation/field issues.

Signed-off-by: Ratheesh Kannoth <redacted>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260605063245.3553861-1-rkannoth%40marvell.com says

t
quoted
+			stats = rvu_read64(rvu, blkaddr,
+					   NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
+			if (!stats)
+				continue;
+			if (stats == dstats[bank][idx])
+				continue;
+
+			if (stats < dstats[bank][idx])
+				dstats[bank][idx] = 0;
Can dstats[bank][idx] overflow this static array?
No. This array is allocated with Max values.
MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS
The header comment asserts that MAX_NUM_BANKS, MAX_NUM_SUB_BANKS and
MAX_SUBBANK_DEPTH are hard limits "preventing any possibility of
out-of-bounds access", but npc_priv_init() does not enforce that
contract on bank_depth itself. It only checks num_banks, num_subbanks,
and subbank_depth = bank_depth / num_subbanks.
bank_depth maximum across SoC is 8192. We will add a check for num_subbnks as well
as part of hardening series to net-next once this patch series is merged.
Will add a check on bank_depth in npc_priv_init() in hardening patch series to
net-next after this series is merged.
Is the dstats baseline correct on the first read, and after an MCAM slot
is freed and reallocated?
The commit message describes dstats as printing "hit deltas since the
prior read by comparing hardware counters to a per-entry software
baseline and advancing that baseline after each read". The baseline
dstats[bank][idx] is keyed only by physical hardware slot.
On the first read after module load, dstats[bank][idx] is 0, so:
delta = stats - dstats[bank][idx];
prints the cumulative counter rather than a delta.
When an MCAM slot is freed and reallocated to a different rule or PF
(and the hardware counter is preserved across moves, as
npc_defrag_move_vdx_to_free does via rvu_write64(... STAT_EXT ...,
stats)), the retained baseline mixes counts that belonged to the
previous owner into the delta attributed to the new PF.
Should the baseline be cleared inside npc_cn20k_enable_mcam_entry() on
each enable/disable transition, in addition to whatever happens on read?
Is reading dstats meant to be destructive?
Each read advances dstats[bank][idx] = stats, so two concurrent readers
(for example a monitoring daemon and a human admin) race for events:
whichever reads first advances the baseline, and the other sees zero
deltas for entries it would otherwise have observed. Most observability
files under debugfs/procfs/sysfs are passive snapshots so multiple tools
can scrape them concurrently. Would a snapshot file plus userspace delta
computation, or a separate write-to-clear sibling, fit better here?
"dstats" is for live debugging in field, where user wants to know which NPC
mcam entry is being hit. This debugfs entry helps to identify the mcam entry.
If mcam entry is allocated/deallocated/allocated again, first time 'cat'ing debugfs
entry, may show wrong delta. But next time, it will show the correct value as s/w
baseline array would have updated to the hw stats value correctly.
quoted
+
+			pf = 0xFFFF;
+			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+			if (map)
+				pf = xa_to_value(map);
+
+			delta = stats - dstats[bank][idx];
+
+			snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
+				 mcam_idx, pf, delta);
+			seq_puts(s, buff);
+
+			dstats[bank][idx] = stats;
+		}
+	}
+
+	mutex_unlock(&stats_lock);
+	return 0;
+}
+
+/*  "%u\t%#04x\t%llu\n" needs less than 64 characters to print */
+#define TOTAL_SZ (MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH * 64)
+DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_dstats, TOTAL_SZ);
Is the up-front 1 MiB allocation per open() necessary here?
TOTAL_SZ evaluates to MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
MAX_SUBBANK_DEPTH * 64 = 1,048,576 bytes, and single_open_size() asks
seq_file to preallocate that buffer regardless of how much the file
actually emits. dstats only emits lines for entries with changed
counters, and mismatch is typically a few lines.
The size also does not scale to the actual hardware geometry; it uses
the MAX constants instead of npc_priv->num_banks * npc_priv->bank_depth.
Could the standard seq_file iterator pattern (start/next/stop/show via
DEFINE_SHOW_ATTRIBUTE) be used instead, so the buffer grows on demand.
BUffer is pre allocated to the maximum required.
quoted
 	}
@@ -4513,10 +4527,15 @@ static int npc_priv_init(struct rvu *rvu)
 		return -EINVAL;
 	}

-	npc_priv.num_subbanks = num_subbanks;
-
 	subbank_depth =	bank_depth / num_subbanks;
+	if (subbank_depth > MAX_SUBBANK_DEPTH) {
+		dev_err(rvu->dev,
+			"Invalid subbank depth %u\n",
+			subbank_depth);
+		return -EINVAL;
+	}

+	npc_priv.num_subbanks = num_subbanks;
 	npc_priv.bank_depth = bank_depth;
 	npc_priv.subbank_depth = subbank_depth;
Tying back to the dstats overflow question above: bank_depth itself is
not bounded here. With hardware reporting bank_depth = 8193 and
num_subbanks = 32, integer division gives subbank_depth = 256, which
passes this check, but bank_depth still exceeds MAX_NUM_SUB_BANKS *
MAX_SUBBANK_DEPTH = 8192. Should there be an explicit check on
bank_depth, or a check that bank_depth == subbank_depth * num_subbanks,
to keep the en_map bitmap and dstats[][] within their declared sizes?
bank_depth is less than MAX_BANK_DEPTH in all Soc. WIll add a check in
next patch series (hardening series to check against bank_depth <= MAX_BANK_DEPTH)
int npc_priv_init() (During probe())
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help