Re: [PATCH v2 01/15] mailbox: Deprecate NULL mbox messages; Introduce mbox_ring_doorbell()
From: Bjorn Andersson <andersson@kernel.org>
Date: 2026-02-09 15:40:20
Also in:
arm-scmi, imx, linux-acpi, linux-arm-msm, linux-remoteproc, linux-tegra, lkml
On Sat, Feb 07, 2026 at 08:01:23PM -0800, Douglas Anderson wrote:
The way the mailbox core behaves when you pass a NULL `mssg` parameter to mbox_send_message() is a little questionable. Specifically, the mailbox core stores the currently active message directly in its `active_req` field. In at least two places it decides that if this field is `NULL` then there is no active request. That means if `mssg` is ever NULL it will cause the mailbox core to think is no active request. The two places where it does this are: 1. When a client calls mbox_send_message(), if `active_req` is NULL then it will call the mailbox controller to send the new message even if the mailbox controller hasn't yet called mbox_chan_txdone() on the previous (NULL) message. 2. The mailbox core will never call the client's `tx_done()` callback with a NULL message because `tx_tick()` returns early whenever the message is NULL. Though the above doesn't look like it was a conscious design choice, it does have the benefit of providing a simple way to assert an edge-triggered interrupt to the remote processor on the other side of the mailbox. Specifically: 1. Like a normal edge-triggered interrupt, if multiple edges arrive before the interrupt is Acked they are coalesced. 2. Like a normal edge-triggered interrupt, as long as the receiver (the remote processor in this case) "Ack"s the interrupt _before_ checking for work and the sender (the mailbox client in this case) posts the interrupt _after_ adding new work then we can always be certain that new work will be noticed. This assumes that the mailbox client and remote processor have some out-of-band way to communicate work and the mailbox is just being used as an interrupt. Doing a `git grep -A1 mbox_send_message | grep NULL` shows 14 hits in mainline today, but it's not 100% clear if all of those users are relying on the benefits/quirks of the existing behavior. Since the current NULL `mssg` behavior is a bit questionable but has some benefits, let's: 1. Deprecate the NULL behavior and print a warning. 2. Add a new mbox_ring_doorbell() function that is very similar to the existing NULL `mssg` case but a tad bit cleaner. The design of the new mbox_ring_doorbell() will be to maximize compatibility with the old NULL `mssg` behavior. Specifically: * We'll still pass NULL to the mailbox controller to indicate a doorbell. * Doorbells will not be queued and won't have txdone. * We'll call immediately into the mailbox controller when a doorbell is posted. With the above, any mailbox clients that don't mix doorbells and normal messages are intended to see no change in behavior when switching to the new API. Using the new API, which officiall documents that mbox_client_txdone() shouldn't be called for doorbells, does allow us to remove those calls. There are two differences in behavior between the old sending a NULL message and the new mbox_ring_doorbell(): 1. If the mailbox controller returned an error when trying to send a NULL message, the old NULL message could have ended up being queued up in the core's FIFO. Now we will just return the error. 2. If a client rings a doorbell while a non-doorbell message is in progress, previously NULL messages would have been "queued" in that case and now doorbells will be immediately posted. I'm hoping that nobody was relying on either of the two differences. In general holding NULL messages in the mailbox core's queue has odd behavior and is hard to reason about. Hopefully it's reasonable to assume nobody was doing this. As mentioned above, it should be noted that it's now documented that "txdone" shouldn't be called (by both mailbox drivers and clients) for doorbells. That being said, in most cases it won't hurt since the mailbox core will ignore the bogus "txdone". The only case where it's critical for a mailbox controller not to call "txdone" for a doorbell is when a mailbox channel mixes normal messages and doorbells and cares about the txdone callback. Specifically, when you ring a doorbell and immediately send a normal message, if the controller calls "txdone" for the doorbell it could look as if the normal message finished before it should have. This issue also would have happened with the old NULL `mssg`, though. Signed-off-by: Douglas Anderson <dianders@chromium.org>
I like how this cleans up the logic hacks, but perhaps even more so how it takes a step towards cleaning up the mailbox API when it comes to expectations between client and provider implementations. Reviewed-by: Bjorn Andersson <andersson@kernel.org> Regards, Bjorn