Thread (28 messages), 2 authors, 14d ago

Re: [PATCH v19 net-next 1/9] octeontx2-af: Enforce single RVU AF probe

From: Ratheesh Kannoth <hidden>
Date: 2026-06-08 02:17:51
Also in: lkml

On 2026-06-05 at 12:02:37, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
There is only one admin-function PCI device per system.
Reject any additional AF probe with -EBUSY so the driver model matches
hardware and automated reviewers can rely on a single bound instance.

Signed-off-by: Ratheesh Kannoth <redacted>
https://sashiko.dev/#/patchset/20260605063245.3553861-1-rkannoth%40marvell.com says
 static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct device *dev = &pdev->dev;
 	struct rvu *rvu;
 	int    err;

+	if (atomic_cmpxchg(&device_bound, 0, 1) != 0) {
+		dev_warn(dev, "Only one af device is supported.\n");
+		return -EBUSY;
+	}
+
Since this atomic_cmpxchg() sets device_bound to 1, is there a path that sets
it back to 0?
Looking at rvu_probe(), if a failure occurs later in the probe process
(such as -ENOMEM or resource allocation failures), it jumps to the cleanup
labels but does not clear the device_bound flag. Does this permanently
prevent driver rebinding or retry on probe deferral?
Additionally, when the driver is successfully unbound via rvu_remove(),
the flag doesn't appear to be cleared there either, which would cause
subsequent manual binds to fail with -EBUSY.
Is there a missing atomic_set(&device_bound, 0) in the error paths and in
rvu_remove()?

You are right. The current implementation lacks transactional rollback
for device_bound in the rvu_probe() error paths, as well as the
corresponding reset in rvu_remove().

The atomic_cmpxchg() here is a proactive sanity check that enforces the
hardware model, since firmware instantiates only a single Admin Function
(AF) PCI device. Sashiko raised many race issues against previous versions
of the patch, on the assumption that multiple AF devices can be probed.

However, error-path hardening and proper resource cleanup for the AF
driver are currently incomplete.

To prevent scope creep in this fundamental enablement series, we
plan to address the comprehensive error-path rollback—including
proper atomic_set(&device_bound, 0) invocations on probe failure
and driver detachment—in a dedicated, subsequent hardening
patchset targeted for net-next.
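
For illustration only, the rollback discussed above could look roughly like
the userspace model below. It is a sketch, not the planned patch: C11
<stdatomic.h> stands in for the kernel's atomic_t API, and model_probe(),
model_remove() and the fail_later parameter are hypothetical names that do
not exist in the driver.

```c
/* Userspace model only; the real driver would use atomic_cmpxchg()/atomic_set(). */
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = no AF bound, 1 = AF slot claimed. Static storage starts at zero. */
static atomic_int device_bound;

/* Hypothetical stand-in for rvu_probe(); fail_later simulates a late -ENOMEM. */
int model_probe(bool fail_later)
{
	int expected = 0;
	int err;

	/* Claim the single AF slot; a second caller gets -EBUSY. */
	if (!atomic_compare_exchange_strong(&device_bound, &expected, 1))
		return -EBUSY;

	if (fail_later) {
		err = -ENOMEM;
		goto err_release;
	}

	return 0;

err_release:
	/* Roll back the claim so a retry or deferred probe can succeed. */
	atomic_store(&device_bound, 0);
	return err;
}

/* Hypothetical stand-in for rvu_remove(): release the slot on unbind. */
void model_remove(void)
{
	atomic_store(&device_bound, 0);
}

/* Walks the cases raised in the review: failed probe, bind, rebind. */
void model_selfcheck(void)
{
	assert(model_probe(true) == -ENOMEM);  /* failure releases the flag */
	assert(model_probe(false) == 0);       /* so a retry can bind */
	assert(model_probe(false) == -EBUSY);  /* second instance rejected */
	model_remove();
	assert(model_probe(false) == 0);       /* manual rebind works */
	model_remove();
}
```

The flag is cleared only by the path that set it: a probe that loses the
compare-and-exchange returns -EBUSY without touching device_bound, so it
cannot release a slot owned by the bound instance.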