Re: [PATCH v4 1/6] pds_core: add support for quiet devcmd failures

From: Simon Horman <horms@kernel.org>
Date: 2026-06-15 15:35:02

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
pds_core: add support for quiet devcmd failures

This commit introduces support for suppressing error messages during device
commands by adding a boolean parameter to wait and lock functions. It changes
pdsc_devcmd_wait and pdsc_devcmd_locked into internal __pdsc_devcmd_wait and
__pdsc_devcmd_locked variants that handle the new messaging flag. This
functionality is intended for negotiating the PDS_CORE_IDENTITY_VERSION_2.
diff --git a/drivers/net/ethernet/amd/pds_core/dev.c b/drivers/net/ethernet/amd/pds_core/dev.c
index bded6b33289ce..dd9989cfe6b3f 100644
--- a/drivers/net/ethernet/amd/pds_core/dev.c
+++ b/drivers/net/ethernet/amd/pds_core/dev.c
[ ... ]
-int pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd,
-		       union pds_core_dev_comp *comp, int max_seconds)
+static int __pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd,
+				union pds_core_dev_comp *comp, int max_seconds,
+				const bool do_msg)
 {
 	int err;
 
@@ -197,7 +199,7 @@ int pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd,
 
 	memcpy_toio(&pdsc->cmd_regs->cmd, cmd, sizeof(*cmd));
 	pdsc_devcmd_dbell(pdsc);
-	err = pdsc_devcmd_wait(pdsc, cmd->opcode, max_seconds);
+	err = __pdsc_devcmd_wait(pdsc, cmd->opcode, max_seconds, do_msg);
 
 	if ((err == -ENXIO || err == -ETIMEDOUT) && pdsc->wq)
 		queue_work(pdsc->wq, &pdsc->health_work);
[Severity: Critical]
This isn't a bug introduced by this patch, but it occurs in the function
modified here.

Can this queue_work() call trigger a use-after-free during driver removal?

Looking at pdsc_remove() in drivers/net/ethernet/amd/pds_core/main.c,
destroy_workqueue() is called but the pdsc->wq pointer is not cleared:

pdsc_remove() {
    ...
    if (pdsc->wq)
        destroy_workqueue(pdsc->wq);
    
    mutex_lock(&pdsc->config_lock);
    set_bit(PDSC_S_STOPPING_DRIVER, &pdsc->state);
    
    pdsc_stop(pdsc);
    pdsc_teardown(pdsc, PDSC_TEARDOWN_REMOVING);
    ...
}

Subsequently, pdsc_teardown() issues a RESET command to the firmware.
If the firmware is unresponsive and the command times out, we reach this
error path in __pdsc_devcmd_locked(). Since the pdsc->wq pointer was never
set to NULL, the check here succeeds, and it attempts to queue work to a
destroyed workqueue.

Should pdsc->wq be explicitly set to NULL immediately after being destroyed
in pdsc_remove() to prevent this?
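
As a rough illustration of the ordering problem described above, here is a
minimal user-space sketch. Everything prefixed with mock_ is a hypothetical
stand-in invented for this example, not the kernel workqueue API or the actual
pds_core code; only the shape of the error-path condition is taken from the
hunk above. It assumes the teardown RESET times out, and it does not model
concurrent users of pdsc->wq.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; records misuse instead of crashing. */
struct mock_wq {
	bool destroyed;
	int queued;               /* work queued while the queue was alive */
	int queued_after_destroy; /* work queued after destruction (the bug) */
};

/* Hypothetical stand-in for struct pdsc; only the wq pointer matters here. */
struct mock_pdsc {
	struct mock_wq *wq;
};

static void mock_destroy_workqueue(struct mock_wq *wq)
{
	assert(wq != NULL);
	wq->destroyed = true;
}

static void mock_queue_work(struct mock_wq *wq)
{
	if (wq->destroyed)
		wq->queued_after_destroy++;
	else
		wq->queued++;
}

/* Same condition as the error path in __pdsc_devcmd_locked(). */
static void mock_devcmd_error_path(struct mock_pdsc *pdsc, int err)
{
	if ((err == -ENXIO || err == -ETIMEDOUT) && pdsc->wq)
		mock_queue_work(pdsc->wq);
}

/*
 * Models the remove ordering: destroy the workqueue first, then run a
 * teardown whose RESET command is assumed to time out.
 * clear_ptr selects whether the pointer is set to NULL after destruction.
 */
static void mock_remove(struct mock_pdsc *pdsc, bool clear_ptr)
{
	if (pdsc->wq) {
		mock_destroy_workqueue(pdsc->wq);
		if (clear_ptr)
			pdsc->wq = NULL;
	}

	mock_devcmd_error_path(pdsc, -ETIMEDOUT);
}
```

With clear_ptr false, the error path still sees a non-NULL pointer and queues
onto the destroyed queue; with clear_ptr true, the existing pdsc->wq check
skips the queue_work() call.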
 	else
 		memcpy_fromio(comp, &pdsc->cmd_regs->comp, sizeof(*comp));