[PATCH v14 3/5] tee: add OP-TEE driver

From: arnd@arndb.de (Arnd Bergmann)
Date: 2017-01-20 16:59:02
Also in: linux-devicetree, lkml

On Thursday, January 19, 2017 3:56:23 PM CET Jens Wiklander wrote:

On Wed, Jan 18, 2017 at 05:28:17PM +0100, Arnd Bergmann wrote:

quoted

On Wednesday, January 18, 2017 1:58:14 PM CET Jens Wiklander wrote:

quoted

+static void optee_cq_complete_one(struct optee_call_queue *cq)
+{
+	struct optee_call_waiter *w;
+
+	list_for_each_entry(w, &cq->waiters, list_node) {
+		if (!w->completed) {
+			complete(&w->c);
+			w->completed = true;
+			break;
+		}
+	}
+}
+
+static void optee_cq_wait_final(struct optee_call_queue *cq,
+				struct optee_call_waiter *w)
+{
+	mutex_lock(&cq->mutex);
+
+	/* Get out of the list */
+	list_del(&w->list_node);
+
+	optee_cq_complete_one(cq);
+	/*
+	 * If we're completed we've got a completion that some other task
+	 * could have used instead.
+	 */
+	if (w->completed)
+		optee_cq_complete_one(cq);
+
+	mutex_unlock(&cq->mutex);
+}

This deserves some more comments: the function name suggests that you are
waiting for a specific optee_call_waiter, but then it calls
optee_cq_complete_one(), which unconditionally completes the first
incomplete completion and it never waits.

OK, I'm updating these functions with more comments. The purpose of
these functions is to deal with resource shortage in secure world.
There's a limit on how many threads that execute concurrently in secure
world. optee_cq_wait_for_completion() waits for any task returning from
a call to secure world, optee_cq_complete_one() doesn't care who's
completed as long as tasks aren't stuck when there's available resources
in secure world.

Ok, that is indeed not clear from the code but your explanation makes
perfect sense, thanks!

quoted

+static int __init optee_driver_init(void)
+{
+	struct device_node *node;
+
+	/*
+	 * Preferred path is /firmware/optee, but it's the matching that
+	 * matters.
+	 */
+	for_each_matching_node(node, optee_match)
+		of_platform_device_create(node, NULL, NULL);
+
+	return platform_driver_register(&optee_driver);
+}
+module_init(optee_driver_init);
+
+static void __exit optee_driver_exit(void)
+{
+	platform_driver_unregister(&optee_driver);
+}
+module_exit(optee_driver_exit);

What is the platform driver good for if the same module has to create the
platform devices itself?

The platform device(s) are created here because the optee node is below
"/firmware" instead of the root where it would have had the platform
device created automatically.

I think it's useful to be able to unload the module, the early reviews
of this patch set was much focused around that. Regardless I'll need
some device as parent for the devices created during optee_probe() and
using a platform device for that seems natural.

I'd rather keep the platform driver. Perhaps some variant of the pattern
in qcom_scm_init() (drivers/firmware/qcom_scm.c) is useful, except that
I need to find out what to do about the life cycle of the objects
created with of_platform_populate().

My point was that I don't think we need devices here at all. It's different
when you talk to external hardware that has register resource etc that
can be best abstracted as a real device, but for other firmware features
we don't normally add one.

Module unloading can also be done without the device.

quoted

I'd just skip it and do

	for_each_matching_node(node, optee_match)
		optee_probe(node);

I also suspect that module unloading is broken here if you don't clean
up the platform devices in the end, so you should already remove the
exit function to prevent unloading.

Does the platform devices really need cleaning? I mean
of_platform_default_populate_init() creates a bunch of platform devices
which are just left there even if unused. Here we're doing the same
thing except that we're doing it for a specific node in the DT.

I think it will work if you don't clean them up, but it feels wrong
to have a loadable module that creates devices when loaded but doesn't
remove them when unloaded.

This could be done differently by having the device creation done in
one driver and the the user of that device in another driver, but I
think just killing off the device achieves the same in a simpler way.

quoted

+/*
+ * Get revision of Trusted OS.
+ *
+ * Used by non-secure world to figure out which version of the Trusted OS
+ * is installed. Note that the returned revision is the revision of the
+ * Trusted OS, not of the API.
+ *
+ * Returns revision in 2 32-bit words in the same way as
+ * OPTEE_MSG_CALLS_REVISION described above.
+ */
+#define OPTEE_MSG_OS_OPTEE_REVISION_MAJOR	1
+#define OPTEE_MSG_OS_OPTEE_REVISION_MINOR	0
+#define OPTEE_MSG_FUNCID_GET_OS_REVISION	0x0001

Just for my understanding, what is the significance of these numbers,
i.e. which code (user space, kernel driver, trusted OS) provides
the uuid and which one provides the version? The code comments almost
make sense to me, but I don't see why specific versions are listed
in this header.

You're right, OPTEE_MSG_OS_OPTEE_REVISION_* should be removed. The
actual version the secure OS is of a mostly informational nature. The
same goes the OS UUID, but I suppose the actual UUID used by the
upstream version of OP-TEE OS could be interesting to know.

...

quoted

What is the expected behavior when one side reports a version that
is unknown? Can one side claim to be backwards compatible with
a previous version, or does each new version need support on
all three sides?

The UUID and version of the message protocol are important to match
correctly as otherwise it could mean that there's something unexpected
in secure world that following the message protocol would be undefined
behaviour. All changes to the message protocol should be backwards
compatible in the sense that the driver and secure world need to
negotiate eventual extensions while probing. That's what we're doing in
optee_msg_exchange_capabilities().

Ok, then maybe the "compatible" identifier in DT should be sufficient
to ensure that the capability exchange works, and the rest be based
on that?

We tend to avoid version checks for APIs in the kernel because they
never work in practice, but the capability check should be fine.

quoted

diff --git a/drivers/tee/optee/rpc.c b/drivers/tee/optee/rpc.c
new file mode 100644
index 000000000000..0b9c1a2accd0
--- /dev/null
+++ b/drivers/tee/optee/rpc.c
+static void handle_rpc_func_cmd_wq(struct optee *optee,
+				   struct optee_msg_arg *arg)
+{
+	struct optee_msg_param *params;
+
+	if (arg->num_params != 1)
+		goto bad;
+
+	params = OPTEE_MSG_GET_PARAMS(arg);
+	if ((params->attr & OPTEE_MSG_ATTR_TYPE_MASK) !=
+			OPTEE_MSG_ATTR_TYPE_VALUE_INPUT)
+		goto bad;
+
+	switch (params->u.value.a) {
+	case OPTEE_MSG_RPC_WAIT_QUEUE_SLEEP:
+		wq_sleep(&optee->wait_queue, params->u.value.b);
+		break;
+	case OPTEE_MSG_RPC_WAIT_QUEUE_WAKEUP:
+		wq_wakeup(&optee->wait_queue, params->u.value.b);
+		break;
+	default:
+		goto bad;
+	}
+
+	arg->ret = TEEC_SUCCESS;
+	return;
+bad:
+	arg->ret = TEEC_ERROR_BAD_PARAMETERS;
+}
+

I'm trying to understand what this is good for. What I can see is that
you have a user space process calling into the kernel asking the tee
to do some command, and then the tee can ask the kernel to wait for
something to happen, or notify it that something has happened.

If we wait here, the user process gets suspended until this has
actually happened.

Am I reading this correctly? If yes, what is the intended use case?
Is there some process that is meant to always wait here? What
if we ever need to wait for more than one thing at a time (think
select or poll?)

I'm updating the comments for OPTEE_MSG_RPC_CMD_WAIT_QUEUE with:
"If secure world need to wait for a secure world mutex it issues a sleep
request instead of spinning in secure world. Conversely is a wakeup
request issued when a secure world mutex with a thread waiting thread is
unlocked."

The way we're waking up a sleeping thread is a bit limiting in some
circumstances.

One case is where we need to do it from a secure interrupt handler.
Because there's no way of doing this kind of RPC to normal world from a
secure interrupt handler. In that case it wouldn't be mutex the thread
is waiting for though.

Another case is where there's several guest running in the system and
more than one guests has access to secure world. If guest2 waits for
mutex which guest1 is releasing, how can guest2 be notified? Normal RPC
is impossible here also.

The only way around this limitation I've come up with so far is by doing
the wakeup via a software generated interrupt destined to the correct
guest. However this problem is beyond the scope of this patch set.

Ok, I see.

	Arnd

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help