Re: [PATCH] Revert "mailbox/pcc: support mailbox management of the shared buffer"
From: Adam Young <hidden>
Date: 2025-09-30 22:12:52
Also in:
lkml
ACK to reverting. I will submit the changes again, and in separate patches. I will try to address our comments in line, so we have continuity of discussion. We can point to these messages from future reviews. On 9/30/25 05:37, Sudeep Holla wrote:
On Mon, Sep 29, 2025 at 01:11:23PM -0400, Adam Young wrote:quoted
I posted a patch that addresses a few of these issues. Here is a top level description of the isse The correct way to use the mailbox API would be to allocate a buffer for the message,write the message to that buffer, and pass it in to mbox_send_message. The abstraction is designed to then provide sequential access to the shared resource in order to send the messages in order. The existing PCC Mailbox implementation violated this abstraction. It requires each individual driver re-implement all of the sequential ordering to access the shared buffer.Please, let us keep the avoiding duplication as a separate topic atleast for the discussion. We can take care of it even before merging if you prefer that way but we need to explore what other drivers can use it. Otherwise it is not yet duplication right ?quoted
Why? Because they are all type 2 drivers, and the shared buffer is 64bits in length: 32bits for signature, 16 bits for command, 16 bits for status. It would be execessive to kmalloc a buffer of this size.Sure, if there is only and first driver needing large buffers, it is still not duplication yet. I agree it can be moved to PCC, but lets start with you client driver code first and then take it from there.quoted
This shows the shortcoming of the mailbox API. The mailbox API assumes that there is a large enough buffer passed in to only provide a void * pointer to the message. Since the value is small enough to fit into a single register, it the mailbox abstraction could provide an implementation that stored a union of a void * and word. With that change, all of the type 2 implementations could have their logic streamlined and moved into the PCC mailbox.No, it is left to the client driver interpretation as it clearly varies even within PCC type 1-5. Again, let us start with client driver code and see how to standardise later. I agree with PCC being standard, there is scope for avoiding duplication, but we will get to know that only if you first present it with the client driver code and we can then see how and what to make generic.quoted
However, I am providing an implementation for a type3/type4 based driver, and I do need the whole managmenet of the message buffer. IN addition, I know of at least one other subsystem (MPAM) that will benefit from a type3 implementation.Don't even go there. It is much bigger beast with all sorts of things to consider. Now that you have mentioned that, I am interested more to look at MPAM driver usage as well before merging anything as generic as I know MPAM is not so trivial. You pulled that topic into this, sorry 😉.
This actually got me to laugh. Nervously. I wonder what else is coming.
quoted
On 9/26/25 11:33, Sudeep Holla wrote:quoted
This reverts commit 5378bdf6a611a32500fccf13d14156f219bb0c85. Commit 5378bdf6a611 ("mailbox/pcc: support mailbox management of the shared buffer") attempted to introduce generic helpers for managing the PCC shared memory, but it largely duplicates functionality already provided by the mailbox core and leaves gaps: 1. TX preparation: The mailbox framework already supports this via ->tx_prepare callback for mailbox clients. The patch adds pcc_write_to_buffer() and expects clients to toggle pchan->chan.manage_writes, but no drivers set manage_writes, so pcc_write_to_buffer() has no users.tx prepare is insufficient, as it does not provide access to the type3 flags. IN addition, it forces the user to manage the buffer memory directly. WHile this is a necessary workaround for type 2 non extended memory regions, it does not make sense for a managed resource like the mailbox.Sorry if I am slow in understanding but I still struggle why tx_prepare won't work for you. Please don't jump to solve 2 problems at the same time as it just adds more confusion. Let us see if and how to make tx_prepare work for your driver. And then we can look at standardising it as a helper function that can be use in all the PCC mailbox client drivers if we can do that. You are just adding parallel and optional APIs just to get your driver working here. I am not against standardising to avoid duplication which is your concern(very valid) but doen't need to be solved by adding another API when the existing APIs already provides mechanism to do that. If you need information about the PCC type3/4, we can explore that as well.
I will submit a more detailed explaination when I resubmit that functionality. The short of it is that the Type3 Register information is in the PCCT, and that is not available outside mailbox/pcc.c For Type 2, there is an accessor function that exposes if the buffer is safe to write. For Type 3, you need the command completion fields and registers,...and there are two of them, one for setting and one for reading. Without this field, you cannot safely write to the shared buffer in tx_prepare. I am attempting to write it in such a way that it works for any extended memory driver. What I followed here is the PCC spec, not anything specific to the network driver I submitted to call it. So, yes, the alternative is to create a new accessor function that returns the cmd completion bit, but that would need to be called from inside of a spin lock.
quoted
You are correct that the manage_writes flag can be removed, but if (and only if) we limit the logic to type 3 or type 4 drivers. I have made that change in a follow on patch:OK, but I would like to start fresh reverting this patch.
I can see the value in that, and support the decision.
quoted
quoted
2. RX handling: Data reception is already delivered through mbox_chan_received_data() and client ->rx_callback. The patch adds an optional pchan->chan.rx_alloc, which again has no users and duplicates the existing path.The change needs to go in before there are users. The patch series that introduced this change requires this or a comparable callback mechanism.Not always necessary. Yes if it is agreed to get the user merged. But I am now questioning why you need it when you do have rx_callback.
RX callback is optional. For large buffers, we want to let the driver specify how to allocate the buffers. RX Callback will tell the driver that there is data, but would extend the pattern of requiring direct IO-memory access instead of using the message parameter.
quoted
However, the reviewers have shown that there is a race condition if the callback is provided to the PCC mailbox Channel, and thus I have provided a patch which moves this callback up to the Mailbox API.Sorry if I have missed it. Can you please point me to the race condition in question. I am interested to know more details.
The review in question was on Re: [PATCH net-next v28 1/1] mctp pcc: Implement MCTP over PCC Transport Jeremy's comment was: "Also: you're setting the client rx_callback *after*having set up the PCC channel. Won't this race with RX on the inbox?" And he is right. If you bring up a driver when the platform has messages ready to send, the alloc function needs to be available as soon as the mailbox is active. If not, there will be a race between message delivery and the assignment of the alloc function. That is why I am proposing a change to the mailbox API. I realize that this greatly increases the scope of the discussion. However, without providing the driver some way to specify how to allocate large buffers, message deliver becomes much more complicated. Essentially, the mailbox needs to hard code a message allocation scheme, and that means that a mechanism like PCC cannot handle different allocation schemes. Since the driver I am writing is a network driver, the right type of buffer is of type struct sk_buff. I would not want to make all PCC type 3 drivers use struct sk_buff, obviously. I wanted to limit the change to the PCC mailbox, but it does not appear to be possible without the race condition. The change to the mailbox api was submitted in a change titled [PATCH net-next v29 1/3] mailbox: add callback function for rx buffer allocation
quoted
This change, which is obviosuly not required when returning a single byte, is essential when dealing with larger buffers, such as those used by network drivers.I assume it can't be beyond the shmem area anyways. That can be read from the rx_callback. Again I haven't understood your reasoning as why the allocation and copy can't be part of rx_callback.
That is correct. It is only the shared memory region. Yes, it can be read from rx_callback. But the protocol is more complex than just reading the buffer, and I was trying to write it in a reusable fashion and inaccordance with the intention of the mailbox API.
quoted
quoted
3. Completion handling: While adding last_tx_done is directionally useful, the implementation only covers Type 3/4 and fails to handle the absence of a command_complete register, so it is incomplete for other types.Applying it to type 2 and earlier would require a huge life of rewriting code that is both multi architecture (CPPC) and on esoteric hardware (XGene) and thus very hard to test.True but you have changed the generic code which could break Type1/2 PCC. I am not sure if it is tested yet.
I did basic testing: CPPC still works correctly on the systems that this code runs on.
quoted
While those drivers should make better use of the mailbox mechanism, stopping the type 3 drivers from using this approach stops an effort to provide a common implementation base. That should happen in future patches, as part of reqorking the type 2 drivers.No you need to take care to apply your changes only for Type3/4 so that Type1/2 is unaffected. You can expect to break and someone else to fix the breakage later.
Agreed. The code was written to only affect the path for Type3/4 interfaces. The managed_writes flag was originally for just that reason: make the code explicitly opt in. However, the write (outgoing) message flow is only changed for type 3, and thus managed_writes is not needed. I would suggest a standard that the mssg is NULL for cases where the driver is not going to actually send the data via mbox_send_message and instead is going to write the buffer directly. The write_response path was only taken if the rx_alloc callback is set, and cannot happen for a pre-existing driver that does not set that callback.
quoted
Command Complete is part of the PCC specification for type 3 drivers.Agreed, that's not the argument. The check is done unconditionally. I will send the patch once we agree to revert this change and start fresh. And each feature handled separately instead of mixing 3 different things in one patch.
The check happens in pcc_write_to_buffer and only if (!pchan->chan.manage_writes) However, we can and should make that check be that the channel is a type3/4 instead. I did that in another patch, which I will replicate post revert.
quoted
quoted
Given the duplication and incomplete coverage, revert this change. Any new requirements should be addressed in focused follow-ups rather than bundling multiple behavioral changes together.I am willing to break up the previous work into multiple steps, provided the above arguments you provided are not going to prevent them from getting merged. Type 3/4 drivers can and should make use of the Mailbox abstraction. Doing so can lay the ground work for making the type 2 drivers share a common implementation of the shared buffer management.Sure. Lets revert this patch and start discussing your individual requirements in individual patches and check why tx_prepare and rx_callback can't work for you. Please share the client driver code changes you tried when checking tx_prepare and rx_callback as well so that we can see why it can't work.
Thanks for your attention and feedback. Much appreciated.