Re: [PATCH net-next V4 4/6] devlink: Apply eswitch mode boot defaults

From: Mark Bloch <mbloch@nvidia.com>
Date: 2026-07-03 18:27:35
Also in: linux-doc, linux-rdma


On 02/07/2026 10:52, Jiri Pirko wrote:

Wed, Jul 01, 2026 at 07:42:57PM +0200, mbloch@nvidia.com wrote:

quoted


On 01/07/2026 17:09, Jiri Pirko wrote:

quoted

Wed, Jul 01, 2026 at 02:57:21PM +0200, mbloch@nvidia.com wrote:

quoted


On 01/07/2026 12:48, Jiri Pirko wrote:

quoted

Mon, Jun 29, 2026 at 08:20:59PM +0200, mbloch@nvidia.com wrote:

quoted

Apply parsed devlink_eswitch_mode= defaults after devlink registration
and after successful reload.

devl_register() may still be called before the device is ready for an

How so? I would assume that driver calls devl_register only after
everything is up and running and ready. If not, isn't it a bug?

You would think so :)

Some drivers, mlx5 included, call devl_register() while holding the
devlink instance lock and then finish setting up state before releasing
the lock.

In v3 I tried to enforce exactly that model, move devl_register() to
be the last thing the driver does. Jakub pushed back on making that a
general rule. So in v4 I changed the approach. devl_register() only
schedules the work, and the actual eswitch mode change can run only
after the driver releases the devlink lock.

Wouldn't it make sense to use a completion instead of loop-reschedule of
delayed work?

Just to make sure I understand the suggestion, this would mean that the
work waits until the devlink lock holder drops the lock, and devl_unlock()
would signal it, something like:

void devl_unlock(struct devlink *devlink)
{
	ool complete_apply = devlink->default_esw_mode_apply_pending;

	mutex_unlock(&devlink->lock);

	if (complete_apply)
		complete(&devlink->default_esw_mode_apply_ready);
}

That would avoid the retry loop, but it also means the queued work 
sleeps until the driver drops devl_lock. It does keep one worker
blocked per pending instance and adds this default-esw-mode signalling to
the generic devl_unlock() path.

The delayed retry was meant to avoid a sleeping worker and keep the
instances independent. If one devlink instance is still locked, we just
try it again later while other instances can progress.

If you prefer the completion approach I can switch to it, but I don't see
it as simpler overall.

Yeah, I don't have preference. I was just wondering. Feel free to leave
it as is.

Maybe, instead of "complete", you can schedule with "0" delay in
devl_unlock? Well, it does not really need to be delayed work, right?
The only single schedule may be done from devl_unlock. That would help
to eliminate the rescheduling. Am I missing something?

Yeah, that can work.

The only part I don't really like is adding default-esw-mode specific
logic to devl_unlock(). But if you are fine with that, I can switch to
this approach.

There is still a small race between mutex_unlock() and queue_work(), where
someone else can take devl_lock() first. So the worker may still wait on
the lock, but the window should be small and we get rid of the delayed
retry loop.

Mark

quoted

Mark

quoted

Mark

quoted

eswitch mode change, so keep a per-devlink delayed work item and pending
flag for the registration path. Registration queues the work, and the
worker tries to take the devlink instance lock.

If the lock is busy, the worker requeues itself with a delay.

For successful reloads that performed DRIVER_REINIT, devlink_reload()
already holds the devlink instance lock and the driver has completed
reload_up(). Clear pending work and apply the default directly from the
reload path instead of queueing work.

If a user sets eswitch mode through netlink before the pending
registration work runs, clear the pending flag so the queued default does
not override that user request. Cancel pending default apply work when
freeing the devlink instance.

These AI generated code descriptive messages are generally not very
useful :(

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help