Thread (28 messages) 28 messages, 2 authors, 10d ago

Re: [PATCH v19 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.

From: Ratheesh Kannoth <hidden>
Date: 2026-06-08 02:33:02
Also in: lkml

On 2026-06-05 at 12:02:45, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
Replace the file-scope static npc_priv with a kcalloc'd struct filled
from hardware bank/subbank geometry at init (num_banks is no longer a
const compile-time constant; drop init_done and use a non-NULL
npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
through the CN20K NPC code paths, extend teardown to kfree the root
struct on failure and in npc_cn20k_deinit, and adjust MCAM section
setup to use the discovered subbank count.

Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
and use the allocated backing store consistently when computing deltas
(including the counter rollover compare).

Signed-off-by: Ratheesh Kannoth <redacted>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260605063245.3553861-1-rkannoth%40marvell.com says
:
quoted
@@ -397,6 +398,10 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
 	debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
 			    npc_priv, &npc_vidx2idx_map_fops);

+	dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
+	if (!dstats)
+		return -ENOMEM;
+
 	debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
 			    &npc_mcam_dstats_fops);
What happens when this devm_kzalloc() fails?
The only caller is rvu_dbg_npc_init() in
drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c, which invokes
npc_cn20k_debugfs_init() as a void call and discards the return value.
By the time the new -ENOMEM path is taken, mcam_layout, mcam_default and
vidx2idx have already been registered via debugfs_create_file(), and
dstats, mismatch, idx2vidx and defrag are then skipped. Three of the
four skipped files (mismatch, idx2vidx, defrag) have no relationship to
dstats and seem to be dropped only as a side effect of where the
allocation was placed.
Would it be better to either move the dstats allocation up to
npc_priv_init() (or to the very start of npc_cn20k_debugfs_init() before
any debugfs_create_file()), or alternatively gate just the dstats
debugfs_create_file() on the allocation success while still creating the
unrelated files? In either case, should the caller also be updated to
honor the return value?
if "dtstats" debugfs is missing, it is still fine as this debugfs entry is used
only for live debugging.
quoted
 #include "cn20k/reg.h"
 #include "rvu_npc_fs.h"

-static struct npc_priv_t npc_priv = {
-	.num_banks = MAX_NUM_BANKS,
-};
+static struct npc_priv_t *npc_priv;
With the previous code, init_done was set to true only at the very end
of npc_cn20k_init(), after npc_setup_mcam_section() succeeded, so
'init_done == true' implied both the struct fields were populated and
the hardware had been programmed.
The new gate is 'npc_priv != NULL', but npc_priv is assigned by
kcalloc() near the start of npc_priv_init(), well before bank_depth, kw,
sb[], the xarrays, the mutex and the list head are populated. Does
'npc_priv != NULL' still carry the same 'fully initialized' meaning that
init_done used to?
npc_priv is initialized in npc_priv_init() during probe(). if any of the initialization
fails, npc_priv is freed and set to NULL, and probe() returns with error.
quoted
@@ -4251,8 +4249,8 @@ int rvu_mbox_handler_npc_get_pfl_info(struct rvu *rvu, struct msg_req *req,
 		return -EOPNOTSUPP;
 	}

-	rsp->kw_type = npc_priv.kw;
-	rsp->x4_slots = npc_priv.bank_depth;
+	rsp->kw_type = npc_priv->kw;
+	rsp->x4_slots = npc_priv->bank_depth;
 	return 0;
 }
Should this handler also gain an 'if (!npc_priv)' guard?
The other public entry points were updated to check for NULL:
 npc_cn20k_vidx2idx()
 npc_cn20k_idx2vidx()
 npc_cn20k_dft_rules_idx_get()
 npc_cn20k_dft_rules_alloc()
 npc_cn20k_dft_rules_free()
but rvu_mbox_handler_npc_get_pfl_info() now dereferences npc_priv->kw
and npc_priv->bank_depth without a NULL check. Is that intentional, or
is the asymmetry just an oversight?
If npc_priv initialization fails, probe itself will fail. So these handlers are invoked much later by PF, which
will be active only after AF is fully up.
quoted
-	xa_destroy(&npc_priv.xa_sb_used);
[ ... ]
quoted
+	if (!npc_priv)
+		return;

-	for (i = 0; i < npc_priv.pf_cnt; i++)
-		xa_destroy(&npc_priv.xa_pf2idx_map[i]);
+	xa_destroy(&npc_priv->xa_sb_used);
[ ... ]
quoted
+	npc_defrag_list_clear();
+	kfree(npc_priv);
+	npc_priv = NULL;
 }
Previously the storage was a file-scope static, so teardown only flipped
init_done. Now the storage itself is freed and the pointer cleared
without any synchronization (no synchronize_rcu(), no flush of the mbox
workqueue, no lock).
If a racing mbox handler such as rvu_mbox_handler_npc_get_pfl_info() (or
any of the npc_cn20k_dft_rules_* helpers) has already passed its
'if (!npc_priv) return;' check but has not yet read a field, can it now
dereference freed memory?
The probe ordering may make this hard to reach today, but does the new
free-and-NULL pattern introduce a use-after-free window that the old
init_done flag did not have?
rvu_remove() disables mbox first, so this case wont happen.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help