Thread (51 messages) 51 messages, 7 authors, 2016-07-19

[PATCH V2 05/10] firmware: tegra: add BPMP support

From: Sivaram Nair <hidden>
Date: 2016-07-08 20:25:57
Also in: linux-devicetree, linux-tegra, lkml

On Thu, Jul 07, 2016 at 07:18:34PM +0900, Alexandre Courbot wrote:
On Thu, Jul 7, 2016 at 5:17 PM, Joseph Lo [off-list ref] wrote:
quoted
On 07/06/2016 07:39 PM, Alexandre Courbot wrote:
quoted
Sorry, I will probably need to do several passes on this one to
understand everything, but here is what I can say after a first look:

On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo [off-list ref] wrote:
quoted
The Tegra BPMP (Boot and Power Management Processor) is designed for the
booting process handling, offloading the power management tasks and
some system control services from the CPU. It can be clock, DVFS,
thermal/EDP, power gating operation and system suspend/resume handling.
So the CPU and the drivers of these modules can base on the service that
the BPMP firmware driver provided to signal the event for the specific PM
action to BPMP and receive the status update from BPMP.

Comparing to the ARM SCPI, the service provided by BPMP is message-based
communication but not method-based. The BPMP firmware driver provides the
send/receive service for the users, when the user concerns the response
time. If the user needs to get the event or update from the firmware, it
can request the MRQ service as well. The user needs to take care of the
message format, which we call BPMP ABI.

The BPMP ABI defines the message format for different modules or usages.
For example, the clock operation needs an MRQ service code called
MRQ_CLK with specific message format which includes different sub
commands for various clock operations. This is the message format that
BPMP can recognize.

So the user needs two things to initiate IPC between BPMP. Get the
service from the bpmp_ops structure and maintain the message format as
the BPMP ABI defined.

Based-on-the-work-by:
Sivaram Nair [off-list ref]

Signed-off-by: Joseph Lo <redacted>
---
Changes in V2:
- None
---
  drivers/firmware/tegra/Kconfig  |   12 +
  drivers/firmware/tegra/Makefile |    1 +
  drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
  include/soc/tegra/bpmp.h        |   29 +
  include/soc/tegra/bpmp_abi.h    | 1601
+++++++++++++++++++++++++++++++++++++++
  5 files changed, 2356 insertions(+)
  create mode 100644 drivers/firmware/tegra/bpmp.c
  create mode 100644 include/soc/tegra/bpmp.h
  create mode 100644 include/soc/tegra/bpmp_abi.h
diff --git a/drivers/firmware/tegra/Kconfig
b/drivers/firmware/tegra/Kconfig
index 1fa3e4e136a5..ff2730d5c468 100644
--- a/drivers/firmware/tegra/Kconfig
+++ b/drivers/firmware/tegra/Kconfig
@@ -10,4 +10,16 @@ config TEGRA_IVC
           keeps the content is synchronization between host CPU and
remote
           processors.

+config TEGRA_BPMP
+       bool "Tegra BPMP driver"
+       depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
+       help
+         BPMP (Boot and Power Management Processor) is designed to
off-loading

s/off-loading/off-load
quoted
+         the PM functions which include clock/DVFS/thermal/power from
the CPU.
+         It needs HSP as the HW synchronization and notification module
and
+         IVC module as the message communication protocol.
+
+         This driver manages the IPC interface between host CPU and the
+         firmware running on BPMP.
+
  endmenu
diff --git a/drivers/firmware/tegra/Makefile
b/drivers/firmware/tegra/Makefile
index 92e2153e8173..e34a2f79e1ad 100644
--- a/drivers/firmware/tegra/Makefile
+++ b/drivers/firmware/tegra/Makefile
@@ -1 +1,2 @@
+obj-$(CONFIG_TEGRA_BPMP)       += bpmp.o
  obj-$(CONFIG_TEGRA_IVC)                += ivc.o
diff --git a/drivers/firmware/tegra/bpmp.c
b/drivers/firmware/tegra/bpmp.c
new file mode 100644
index 000000000000..24fda626610e
--- /dev/null
+++ b/drivers/firmware/tegra/bpmp.c
@@ -0,0 +1,713 @@
+/*
+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for
+ * more details.
+ */
+
+#include <linux/mailbox_client.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/semaphore.h>
+
+#include <soc/tegra/bpmp.h>
+#include <soc/tegra/bpmp_abi.h>
+#include <soc/tegra/ivc.h>
+
+#define BPMP_MSG_SZ            128
+#define BPMP_MSG_DATA_SZ       120
+
+#define __MRQ_ATTRS            0xff000000
+#define __MRQ_INDEX(id)                ((id) & ~__MRQ_ATTRS)
+
+#define DO_ACK                 BIT(0)
+#define RING_DOORBELL          BIT(1)
+
+struct tegra_bpmp_soc_data {
+       u32 ch_index;           /* channel index */
+       u32 thread_ch_index;    /* thread channel index */
+       u32 cpu_rx_ch_index;    /* CPU Rx channel index */
+       u32 nr_ch;              /* number of total channels */
+       u32 nr_thread_ch;       /* number of thread channels */
+       u32 ch_timeout;         /* channel timeout */
+       u32 thread_ch_timeout;  /* thread channel timeout */
+};

With just these comments it is not clear what everything in this
structure does. Maybe a file-level comment explaining how BPMP
basically works and what the different channels are allocated to would
help understanding the code.

We have two kinds of TX channels (channel & thread channel above) for the
BPMP clients (clock, thermal, reset, power mgmt control, etc.) to use.

The channel means an atomic channel that could be used when the client needs
the response immediately. e.g. setting clock rate, re-parent the clock
source. Each CPUs have it's own atomic for the usage. The client can acquire
one of them, and the ch_index means the first channel they are able to use
in the channel array.

The response of thread channel can be postponed later. And the client allows
getting the response after BPMP finished the service and response to them by
IRQ. The thread_ch_index means the same the first  channel that the clients
are available to use.

And the CPU RX channel is designed for the client to register some specific
services (We call MRQ in the bpmp_abi.) listen to some update from the BPMP
firmware.

Because we might have different numbers of these channels, using this
structure as the bpmp_soc_data to get different configuration according to
different SoC.
Thanks, that clarifies things. This explanation deserves to in the C
file as well IMHO.

So IIUC the first 13 channels (6 bound to a specific CPU core and 7
threaded, allocated dynamically) are all used to initiate a
communication to the BPMP, while the cpu_rx channel is used as a sort
of IRQ (hence the name MRQ). Is this correct? This would be valuable
to state too. Maybe cpu_rx_ch_index can even be renamed to something
like mrq_ch_index to stress that fact.

A few additional comments follow below as I did a second pass on the code.
quoted
quoted
quoted
+
+struct channel_info {
+       u32 tch_free;
+       u32 tch_to_complete;
+       struct semaphore tch_sem;
+};
+
+struct mb_data {
+       s32 code;
+       s32 flags;
+       u8 data[BPMP_MSG_DATA_SZ];
+} __packed;
+
+struct channel_data {
+       struct mb_data *ib;
+       struct mb_data *ob;
+};
+
+struct mrq {
+       struct list_head list;
+       u32 mrq_code;
+       bpmp_mrq_handler handler;
+       void *data;
+};
+
+struct tegra_bpmp {
+       struct device *dev;
+       const struct tegra_bpmp_soc_data *soc_data;
+       void __iomem *tx_base;
+       void __iomem *rx_base;
+       struct mbox_client cl;
+       struct mbox_chan *chan;
+       struct ivc *ivc_channels;
+       struct channel_data *ch_area;
+       struct channel_info ch_info;
+       struct completion *ch_completion;
+       struct list_head mrq_list;
+       struct tegra_bpmp_ops *ops;
+       spinlock_t lock;
+       bool init_done;
+};
+
+static struct tegra_bpmp *bpmp;

static? Ok, we only need one... for now. How about a private member in
your ivc structure that allows you to retrieve the bpmp and going
dynamic? This will require an extra argument in many functions, but is
cleaner design IMHO.

IVC is designed as a generic IPC protocol among different modules (We have
not introduced some other usages of the IVC right now.). Maybe don't churn
some other stuff into IVC is better.
Anything is fine if you can get rid of that static.
quoted
quoted
quoted
+
+static int bpmp_get_thread_ch(int idx)
+{
+       return bpmp->soc_data->thread_ch_index + idx;
+}
+
+static int bpmp_get_thread_ch_index(int ch)
+{
+       if (ch < bpmp->soc_data->thread_ch_index ||
+           ch >= bpmp->soc_data->cpu_rx_ch_index)

Shouldn't that be ch >= bpmp->soc_data->cpu_rx_ch_index +
bpmp->soc_data->nr_thread_ch?

Either rx_ch_index indicates the upper bound of the threaded channels,
and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or
it can be anywhere else and you should use the correct member.

According the to the table below, we have 14 channels.
atomic ch: 0 ~ 5, 6 chanls
thread ch: 6 ~ 17, 7 chanls
CPU RX ch: 13 ~ 14, 2 chanls
Or, did you mean

thread ch: 6 -> 12 
cpu rx ch: 13 (1 channel)
quoted
+static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
+       .ch_index = 0,
+       .thread_ch_index = 6,
+       .cpu_rx_ch_index = 13,
+       .nr_ch = 14,
+       .nr_thread_ch = 7,
+       .ch_timeout = 60 * USEC_PER_SEC,
+       .thread_ch_timeout = 600 * USEC_PER_SEC,
+};

We use the index to check channel violation and nr_thread_ch for other usage
to avoid redundant channel number calculation elsewhere.
Sorry, my comment had a mistake. I meant that

          ch >= bpmp->soc_data->cpu_rx_ch_index

Should maybe be

          ch >= bpmp->soc_data->cpu_rx_ch_index + bpmp->soc_data->nr_thread_ch
Or did you mean
	ch >= bpmp->soc_data->thread_ch_index + bpmp->soc_data->nr_thread_ch ?
According to the description you gave of these fields, there is no
guarantee that cpu_rx_ch_index will always be the first channel after
the threaded channels.
I second Alex's concerns. It would better not to depend on the
adjacency of the channels. Also I think this data should come from the
device tree.
quoted
quoted
quoted
+               return -1;
+       return ch - bpmp->soc_data->thread_ch_index;
+}
+
+static int bpmp_get_ob_channel(void)
+{
+       return smp_processor_id() + bpmp->soc_data->ch_index;
+}
+
+static struct completion *bpmp_get_completion_obj(int ch)
+{
+       int i = bpmp_get_thread_ch_index(ch);
+
+       return i < 0 ? NULL : bpmp->ch_completion + i;
+}
+
+static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int
ib_sz)
+{
+       return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
+              ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
+              (!ob_sz || ob_data) && (!ib_sz || ib_data);
+}
+
+static bool bpmp_master_acked(int ch)
+{
+       struct ivc *ivc_chan;
+       void *frame;
+       bool ready;
+
+       ivc_chan = bpmp->ivc_channels + ch;
+       frame = tegra_ivc_read_get_next_frame(ivc_chan);
+       ready = !IS_ERR_OR_NULL(frame);
+       bpmp->ch_area[ch].ib = ready ? frame : NULL;
+
+       return ready;
+}
+
+static int bpmp_wait_ack(int ch)
Shouldn't this be bpmp_wait_master_ack ? Looking at the two next
functions makes me think it should (or bpmp_wait_master_free should be
renamed to bpmp_wait_free).
quoted
quoted
quoted
+{
+       ktime_t t;
+
+       t = ns_to_ktime(local_clock());
+
+       do {
+               if (bpmp_master_acked(ch))
+                       return 0;
+       } while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
+                bpmp->soc_data->ch_timeout);
+
+       return -ETIMEDOUT;
+}
+
+static bool bpmp_master_free(int ch)
+{
+       struct ivc *ivc_chan;
+       void *frame;
+       bool ready;
+
+       ivc_chan = bpmp->ivc_channels + ch;
+       frame = tegra_ivc_write_get_next_frame(ivc_chan);
+       ready = !IS_ERR_OR_NULL(frame);
+       bpmp->ch_area[ch].ob = ready ? frame : NULL;
+
+       return ready;
+}
+
+static int bpmp_wait_master_free(int ch)
+{
+       ktime_t t;
+
+       t = ns_to_ktime(local_clock());
+
+       do {
+               if (bpmp_master_free(ch))
+                       return 0;
+       } while (ktime_us_delta(ns_to_ktime(local_clock()), t)
+                < bpmp->soc_data->ch_timeout);
+
+       return -ETIMEDOUT;
+}
+
+static int __read_ch(int ch, void *data, int sz)
+{
+       struct ivc *ivc_chan;
+       struct mb_data *p;
+
+       ivc_chan = bpmp->ivc_channels + ch;
+       p = bpmp->ch_area[ch].ib;
+       if (data)
+               memcpy_fromio(data, p->data, sz);
+
+       return tegra_ivc_read_advance(ivc_chan);
+}
+
+static int bpmp_read_ch(int ch, void *data, int sz)
bpmp_read_threaded_ch maybe? we have bpmp_write_threaded_ch below, as
this function is clearly dealing with threaded channels only.
quoted
quoted
quoted
+{
+       unsigned long flags;
+       int i, ret;
+
+       i = bpmp_get_thread_ch_index(ch);

i is not a very good name for this variable.
Also note that bpmp_get_thread_ch_index() can return -1, this case is
not handled.
Okay, will fix this.

quoted
quoted
+
+       spin_lock_irqsave(&bpmp->lock, flags);
+       ret = __read_ch(ch, data, sz);
+       bpmp->ch_info.tch_free |= (1 << i);
+       spin_unlock_irqrestore(&bpmp->lock, flags);
+
+       up(&bpmp->ch_info.tch_sem);
+
+       return ret;
+}
+
+static int __write_ch(int ch, int mrq_code, int flags, void *data, int
sz)
+{
+       struct ivc *ivc_chan;
+       struct mb_data *p;
+
+       ivc_chan = bpmp->ivc_channels + ch;
+       p = bpmp->ch_area[ch].ob;
+
+       p->code = mrq_code;
+       p->flags = flags;
+       if (data)
+               memcpy_toio(p->data, data, sz);
+
+       return tegra_ivc_write_advance(ivc_chan);
+}
+
+static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int
sz)
+{
+       unsigned long flags;
+       int ret, i;
+
+       ret = down_timeout(&bpmp->ch_info.tch_sem,
+
usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
+       if (ret)
+               return ret;
+
+       spin_lock_irqsave(&bpmp->lock, flags);
+
+       i = __ffs(bpmp->ch_info.tch_free);
+       *ch = bpmp_get_thread_ch(i);
+       ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
+       if (!ret) {
Style nit: I prefer to make the error case the exception, and normal
runtime the norm. This is where a goto statement can actually make
your code easier to follow. Have an err: label before the spin_unlock,
and jump to it if ret != 0. Then you can have the next three lines at
the lower indentation level, and not looking like as if they were an
error themselves.

Or if you really don't like the goto, check for ret != 0 and do the
spin_unlock and return in that block.
quoted
quoted
quoted
+               bpmp->ch_info.tch_free &= ~(1 << i);
+               __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data,
sz);
+               bpmp->ch_info.tch_to_complete |= (1 << *ch);
+       }
+
+       spin_unlock_irqrestore(&bpmp->lock, flags);
+
+       return ret;
+}
+
+static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data,
int sz)
+{
+       int ret;
+
+       ret = bpmp_wait_master_free(ch);
+       if (ret)
+               return ret;
+
+       return __write_ch(ch, mrq_code, flags, data, sz);
+}
+
+static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int
ob_sz,
+                                   void *ib_data, int ib_sz)
+{
+       int ch, ret;
+
+       if (WARN_ON(!irqs_disabled()))
+               return -EPERM;
+
+       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
+               return -EINVAL;
+
+       if (!bpmp->init_done)
+               return -ENODEV;
+
+       ch = bpmp_get_ob_channel();
+       ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
+       if (ret)
+               return ret;
+
+       ret = mbox_send_message(bpmp->chan, NULL);
+       if (ret < 0)
+               return ret;
+       mbox_client_txdone(bpmp->chan, 0);
+
+       ret = bpmp_wait_ack(ch);
+       if (ret)
+               return ret;
+
+       return __read_ch(ch, ib_data, ib_sz);
+}
+
+static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
+                            void *ib_data, int ib_sz)
+{
+       struct completion *comp_obj;
+       unsigned long timeout;
+       int ch, ret;
+
+       if (WARN_ON(irqs_disabled()))
+               return -EPERM;
+
+       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
+               return -EINVAL;
+
+       if (!bpmp->init_done)
+               return -ENODEV;
+
+       ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
+       if (ret)
+               return ret;
+
+       ret = mbox_send_message(bpmp->chan, NULL);
+       if (ret < 0)
+               return ret;
+       mbox_client_txdone(bpmp->chan, 0);
+
+       comp_obj = bpmp_get_completion_obj(ch);
+       timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
+       if (!wait_for_completion_timeout(comp_obj, timeout))
+               return -ETIMEDOUT;
+
+       return bpmp_read_ch(ch, ib_data, ib_sz);
+}
+
+static struct mrq *bpmp_find_mrq(u32 mrq_code)
+{
+       struct mrq *mrq;
+
+       list_for_each_entry(mrq, &bpmp->mrq_list, list) {
+               if (mrq_code == mrq->mrq_code)
+                       return mrq;
+       }
+
+       return NULL;
+}
+
+static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
+{
+       int flags = bpmp->ch_area[ch].ib->flags;
+       struct ivc *ivc_chan;
+       struct mb_data *frame;
+       int ret;
+
+       if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
+               return;
+
+       ivc_chan = bpmp->ivc_channels + ch;
+       ret = tegra_ivc_read_advance(ivc_chan);
+       WARN_ON(ret);
+
+       if (!(flags & DO_ACK))
+               return;
+
+       frame = tegra_ivc_write_get_next_frame(ivc_chan);
+       if (IS_ERR_OR_NULL(frame)) {
+               WARN_ON(1);
+               return;
+       }
+
+       frame->code = code;
+       if (data != NULL)
+               memcpy_toio(frame->data, data, sz);
+       ret = tegra_ivc_write_advance(ivc_chan);
+       WARN_ON(ret);
+
+       if (flags & RING_DOORBELL) {
+               ret = mbox_send_message(bpmp->chan, NULL);
+               if (ret < 0) {
+                       WARN_ON(1);
+                       return;
+               }
+               mbox_client_txdone(bpmp->chan, 0);
+       }
+}
+
+static void bpmp_mail_return(int ch, int ret_code, int val)
+{
+       bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
+}
+
+static void bpmp_handle_mrq(int mrq_code, int ch)
+{
+       struct mrq *mrq;
+
+       spin_lock(&bpmp->lock);
+
+       mrq = bpmp_find_mrq(mrq_code);
+       if (!mrq) {
+               spin_unlock(&bpmp->lock);
+               bpmp_mail_return(ch, -EINVAL, 0);
+               return;
+       }
+
+       mrq->handler(mrq_code, mrq->data, ch);
+
+       spin_unlock(&bpmp->lock);
+}
+
+static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void
*data)
+{
+       struct mrq *mrq;
+       unsigned long flags;
+
+       if (!handler)
+               return -EINVAL;
+
+       mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
+       if (!mrq)
+               return -ENOMEM;
+
+       spin_lock_irqsave(&bpmp->lock, flags);
+
+       mrq->mrq_code = __MRQ_INDEX(mrq_code);
+       mrq->handler = handler;
+       mrq->data = data;
+       list_add(&mrq->list, &bpmp->mrq_list);
+
+       spin_unlock_irqrestore(&bpmp->lock, flags);
+
+       return 0;
+}
+
+static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
+{
+       int challenge;
+       int reply;
+
+       challenge = *(int *)bpmp->ch_area[ch].ib->data;
+       reply = challenge << (smp_processor_id() + 1);
+       bpmp_mail_return(ch, 0, reply);
+}
+
+static int bpmp_mailman_init(void)
+{
+       return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
+}
+
+static int bpmp_ping(void)
+{
+       unsigned long flags;
+       ktime_t t;
+       int challenge = 1;

Mmmm, shouldn't use a mrq_ping_request instead of an parameter which
size may vary depending on the architecture? On a 64-bit big endian
architecture, your messages would be corrupted.

Clarify one thig first. The mrq_ping_request and mrq_handle_ping above are
used for the ping form BPMP to CPU. Like I said above, it's among CPU RX
channel to get some information from BPMP firmware.
Ok, so mrq_handle_ping *should* use these data structures at the very least.
quoted
Here is the ping request from CPU to BPMP to make sure we can IPC with BPMP
during the probe stage.

About the endian issue, I think we don't consider that in the message format
right now. So I think we only support little endian for the IPC messages
right now.
Any code in the kernel should function correctly regardless of
endianness. And the problem is not so much with endianness as it is
with the operand size - is the BPMP expecting a 64-bit challenge here?
Considering that the equivalent MRQ uses a 32-bit integer, I'd bet
not. So please use u32/u64 as needed as well as cpu_to_leXX (and
leXX_to_cpu for the opposite) to make your code solid.
I second this.
I understand that you don't want to use the MRQ structures because we
are not handling a MRQ here, but if they are relevant I think this
would still be safer that constructing messages from scalar data. That
or we should introduce a proper structure for these messages, but here
using the MRQ structure looks acceptable to me. Maybe they should not
be named MRQ at all, but that's not for us to decide.
We should be using the mrq request structures from the ABI header.
quoted
quoted
quoted
+       int reply = 0;

And this should probably be a mrq_ping_response. These remarks may
also apply to bpmp_mrq_handle_ping().
That is for receiving the ping request from BPMP.
quoted
quoted
+       int ret;
+
+       t = ktime_get();
+       local_irq_save(flags);
+       ret = bpmp_send_receive_atomic(MRQ_PING, &challenge,
sizeof(challenge),
+                                      &reply, sizeof(reply));
+       local_irq_restore(flags);
+       t = ktime_sub(ktime_get(), t);
+
+       if (!ret)
+               dev_info(bpmp->dev,
+                        "ping ok: challenge: %d, reply: %d, time:
%lld\n",
+                        challenge, reply, ktime_to_us(t));
+
+       return ret;
+}
+
+static int bpmp_get_fwtag(void)
+{
+       unsigned long flags;
+       void *vaddr;
+       dma_addr_t paddr;
+       u32 addr;

Here also we should use a mrq_query_tag_request.
The is one-way request from CPU to BPMP. So we don't request an MRQ for
that.
It is not clear to me what you mean by 'one-way request' here. We are
sending a request to get the tag and we are getting the tag back via the
same message's response. Anyway, we should be using the 'struct
mrq_query_tag_request' here to be consistent.
quoted
quoted
quoted
+       int ret;
+
+       vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
+                                  GFP_KERNEL);

dma_addr_t may be 64 bit here, and you may get an address higher than
the 32 bits allowed by mrq_query_tag_request! I guess you want to add
GFP_DMA32 as flag to your call to dma_alloc_coherent.
BPMP should able to handle the address above 32 bits, but I am not sure does
it configure to support that?
Either way, since this specific MRQ takes in only 32 bit address, I
think we should follow Alex's recommendation to use the GFP_DMA32 flag.
If the message you pass only contains a 32-bit address, then I'm
afraid the protocol is the limiting factor here until it is updated.

Can't wait for the day when we will have to manage several versions of
this protocol! >_<
If we need to pass a larger-than-32 bit address for this MRQ (or for
anything that takes in a 32-bit address now), the agreed upon process is
to define a new MRQ (i.e one with a different integer id) that takes in
new address type (and deprecating the 32-bit MRQ version). 
quoted
Will fix this.

quoted
quoted
+       if (!vaddr)
+               return -ENOMEM;
+       addr = paddr;
+
+       local_irq_save(flags);
+       ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr,
sizeof(addr),
+                                      NULL, 0);
+       local_irq_restore(flags);
+
+       if (!ret)
+               dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
+
+       dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
+
+       return ret;
+}
+
+static void bpmp_signal_thread(int ch)
+{
+       int flags = bpmp->ch_area[ch].ob->flags;
+       struct completion *comp_obj;
+
+       if (!(flags & RING_DOORBELL))
+               return;
+
+       comp_obj = bpmp_get_completion_obj(ch);
+       if (!comp_obj) {
+               WARN_ON(1);
+               return;
+       }
+
+       complete(comp_obj);
+}
+
+static void bpmp_handle_rx(struct mbox_client *cl, void *data)
+{
+       int i, rx_ch;
+
+       rx_ch = bpmp->soc_data->cpu_rx_ch_index;
+
+       if (bpmp_master_acked(rx_ch))
+               bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
+
+       spin_lock(&bpmp->lock);
+
+       for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
+                       bpmp->ch_info.tch_to_complete; i++) {
for_each_set_bit(bpmp->ch_info.tch_to_complete, &i,
bpmp->soc_data->nr_thread_ch) ?

This will reduce the number of iterations and you won't have to do the
bpmp->ch_info.tch_to_complete & (1 << ch) check below.
quoted
quoted
quoted
+               int ch = bpmp_get_thread_ch(i);
+
+               if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
+                   bpmp_master_acked(ch)) {
+                       bpmp->ch_info.tch_to_complete &= ~(1 << ch);
+                       bpmp_signal_thread(ch);
+               }
+       }
+
+       spin_unlock(&bpmp->lock);
+}
+
+static void bpmp_ivc_notify(struct ivc *ivc)
+{
+       int ret;
+
+       ret = mbox_send_message(bpmp->chan, NULL);
+       if (ret < 0)
+               return;
+
+       mbox_send_message(bpmp->chan, NULL);

Why the second call to mbox_send_message? May to useful to add a
comment explaining it.
Ah!! It should be mbox_client_txdone(). Good catch.
That makes more sense. :) But did this code work even with that typo?
It should have --- mbox_client_txdone() essentilly does nothing now.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help