Thread (38 messages) 38 messages, 5 authors, 2020-07-28

Re: [PATCH v4 06/20] coresight: add try_get_module() in coresight_grab_device()

From: Mike Leach <hidden>
Date: 2020-07-28 10:10:31

Hi Tingwei,

On Tue, 28 Jul 2020 at 02:07, Tingwei Zhang [off-list ref] wrote:
Hi Mike,

On Tue, Jul 28, 2020 at 01:04:21AM +0800, Mike Leach wrote:
quoted
Hi Tingwei,

On Sun, 26 Jul 2020 at 02:32, Tingwei Zhang [off-list ref]
wrote:
quoted
Hi Mike,
On Sat, Jul 25, 2020 at 09:51:31PM +0800, Mike Leach wrote:
quoted
Hi Tingwei,

On Sat, 25 Jul 2020 at 11:05, Tingwei Zhang [off-list ref]
wrote:
quoted
Hi Mike,

 Thu, Jul 23, 2020 at 06:55:47PM +0800, Mike Leach wrote:
quoted
Hi Tingwei,

On Thu, 23 Jul 2020 at 05:28, Tingwei Zhang
[off-list ref]
quoted
quoted
quoted
quoted
wrote:
quoted
When coresight device is in an active session, driver module of
that device should not be removed. Use try_get_module() in
coresight_grab_device() to prevent module to be unloaded.

Signed-off-by: Tingwei Zhang <redacted>
---
 drivers/hwtracing/coresight/coresight.c | 27
+++++++++++++++++++++----
quoted
quoted
quoted
 1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight.c
b/drivers/hwtracing/coresight/coresight.c
quoted
index b7151c5f81b1..17bc76ea86ae 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -640,7 +640,7 @@ struct coresight_device
*coresight_get_sink_by_id(u32 id)
quoted
  * don't appear on the trace path, they should be handled along
with
quoted
quoted
the
quoted
  * the master device.
  */
-static void coresight_grab_device(struct coresight_device
*csdev)
quoted
quoted
quoted
quoted
quoted
+static int coresight_grab_device(struct coresight_device
*csdev)
quoted
quoted
quoted
quoted
quoted
 {
        int i;
@@ -648,10 +648,25 @@ static void coresight_grab_device(struct
coresight_device *csdev)
quoted
                struct coresight_device *child;

                child  = csdev->pdata->conns[i].child_dev;
-               if (child && child->type ==
CORESIGHT_DEV_TYPE_HELPER)
quoted
quoted
quoted
+               if (child && child->type ==
CORESIGHT_DEV_TYPE_HELPER) {
quoted
quoted
quoted
+                       if
(!try_module_get(child->dev.parent->driver->owner))
quoted
+                               goto err;
                        pm_runtime_get_sync(child->dev.parent);
+               }
        }
+       if (!try_module_get(csdev->dev.parent->driver->owner))
+               goto err;
        pm_runtime_get_sync(csdev->dev.parent);
+       return 0;
+err:
+       for (i--; i >= 0; i--) {
+               struct coresight_device *child;
+
+               child  = csdev->pdata->conns[i].child_dev;
+               if (child && child->type ==
CORESIGHT_DEV_TYPE_HELPER)
quoted
quoted
quoted
+                       pm_runtime_put(child->dev.parent);
+       }
+       return -ENODEV;
 }

 /*
@@ -663,12 +678,15 @@ static void coresight_drop_device(struct
coresight_device *csdev)
quoted
        int i;

        pm_runtime_put(csdev->dev.parent);
+       module_put(csdev->dev.parent->driver->owner);
        for (i = 0; i < csdev->pdata->nr_outport; i++) {
                struct coresight_device *child;

                child  = csdev->pdata->conns[i].child_dev;
-               if (child && child->type ==
CORESIGHT_DEV_TYPE_HELPER)
quoted
quoted
quoted
+               if (child && child->type ==
CORESIGHT_DEV_TYPE_HELPER) {
quoted
quoted
quoted
                        pm_runtime_put(child->dev.parent);
+
module_put(child->dev.parent->driver->owner);
quoted
quoted
quoted
+               }
        }
 }
@@ -721,7 +739,8 @@ static int _coresight_build_path(struct
coresight_device *csdev,
quoted
        if (!node)
                return -ENOMEM;

-       coresight_grab_device(csdev);
+       if (coresight_grab_device(csdev))
+               return -ENODEV;
        node->csdev = csdev;
        list_add(&node->link, path);

--
The Qualcomm Innovation Center, Inc. is a member of the Code
Aurora
quoted
quoted
quoted
quoted
Forum,
quoted
a Linux Foundation Collaborative Project
The CTI devices are associated with other coresight components
using
quoted
quoted
quoted
quoted
csdev->ect_dev; These are not handled in the main linear trace
path as
quoted
quoted
quoted
quoted
these have a star topology interlinking all components.
However, when a component has an associated ect_dev then is
enabled at
quoted
quoted
quoted
quoted
the same time as the other component is enabled.

So a module_get/put will be needed for this association to prevent
the
quoted
quoted
quoted
quoted
CTI being removed while a trace session is active.
I suggest in coresight.c : coresight_control_assoc_ectdev()
In module unload process, devices of that module will be removed. In
this case, cti_remove() is called on all cti device when unloading
cti module. All cti connections is cleaned at that time.
csdev->ect_dev is set to NULL. coresight_control_assoc_ectdev()
will return since since ect_csdev is NULL.

I think we are safe without holding module reference since cti
driver has done a pretty good job on clean up.
The issue here is not about clean up. We need to keep all the
programmed coresight modules loaded and operational while a coresight
trace session is in progress. The CTI can be used to control the
generation of trace, trigger trace related events etc.
If the module is removed while a session is in progress then this
programming is lost and the trace data collected may not be what is
expected.
Got your point now. In my opinion, CTI is kinda different to other
coresight components since it's optional.

For other components, they have to been there when path is built or
the enablement fails. Use can either get a successfully return which
means it's good and all devices/modules are there until path is
released, or a failed return which means trace session can't be
setup.
The module get/put do more than ensure that the system can be enabled.
They ensure that no components in the coresight path can be unloaded
until all paths in use are released. This ensures that clients do not
get the trace system pulled out from underneath - corrupting trace
data, and possibly affecting the client program.
I agree.
quoted
quoted
In CTI case, there's no garrantee that CTI related to the trace
session is there even the enablement is successful.
And there is no guarantee that it is not programmed to perform a key
task in controlling the generation of trace.

The entire programmed coresight system must be protected when in use.
That includes CTI devices. Otherwise the integrity of the captured
trace could be compromised, and indeed there may be no trace captured
if this is relying on CTI triggers.
That's correct.  If CTI is not there, the trace may be wrong or no
trace as all.
quoted
quoted
For example,
One cti is required for ETM tracing session. If cti module is
unloaded before enable that trace session. The enablement will
still be successfully done since there's no ect_dev connected
at all.

My point is holding cti module reference in enable path won't
fix all the problem. Altenatively, user should be aware that
unload cti module leads to all cti functioniol failure.
With current design, the behavior is consistant on cti. User
can unload cti module whenever he wants but it will break
the function.
Precisely - and this is what should never be allowed - why can the CTI
be used to break the trace system but not other components?
Sorry - I may not have been clear here - I do not want to make having
a CTI mandatory for enabling the trace session - this is an optional
component.
We can't prevent user to unload CTI module when there's no active
trace session.
Agreed - the same applies to all the other components.
If we don't allow CTI to be not available, we need to
check whether CTI is there in build_path routine or coresight enable
routine. We need return failure if CTI is not there when user enables
the trace session. Currently, coresight framework check associate CTI
device, but if CTI deivce is not probed due to probe failure or
module unload, coresight framework assume there's no assocaite CTI
device and just continue the enablement.  It won't return failure.

If we have plan to change CTI check to be mandatory, make the coresight
enablement fail if CTI is not there like other trace components.  I'm
perfect fine to add module_get/module_put for CTI.
If the CTI is present at the time the trace path is created, then it
must be enabled, and a failure to enable the CTI fails the entire path
enable - this is the operation that is implemented by
coresight_control_assoc_ectdev() at this time.
If a CTI is not present at the time the trace path is created, then as
you have said previously, the ect_dev pointer will be NULL and the
check and enable of CTI will be skipped by
coresight_control_assoc_ectdev() and have no effect on the trace path.

For the module version, if CTI is present and enabled then the module
must be locked so it cannot be removed while the trace is active -
this is the only change that is required. So adding get / put module
in coresight_control_assoc_ectdev() will lock the module in for the
duration that trace is active.

I've tested the code below to ensure that the cti module is correctly
locked while trace is active, but can be removed as required when
trace is inactive.

Regards

Mike

---------------------------------------------------------------------------------
static int
coresight_control_assoc_ectdev(struct coresight_device *csdev, bool enable)
{
    int ect_ret = 0;
    struct coresight_device *ect_csdev = csdev->ect_dev;
    struct module *mod;

    if (!ect_csdev)
        return 0;

    if ((!ect_ops(ect_csdev)->enable) || (!ect_ops(ect_csdev)->disable))
        return 0;

    mod = ect_csdev->dev.parent->driver->owner;

    if (enable) {
        if (try_module_get(mod)) {
            ect_ret = ect_ops(ect_csdev)->enable(ect_csdev);
            if (ect_ret)
                module_put(mod);
        } else
            ect_ret = -ENODEV;
    } else {
        module_put(mod);
        ect_ret = ect_ops(ect_csdev)->disable(ect_csdev);
    }

    /* output warning if ECT enable is preventing trace operation */
    if (ect_ret)
        dev_info(&csdev->dev, "Associated ECT device (%s) %s failed\n",
             dev_name(&ect_csdev->dev),
             enable ? "enable" : "disable");
    return ect_ret;
}
-----------------------------------------------------------------------------------------
Tests:-
-----------------------
root@linaro-developer:/home/linaro/scripts# ./trace-sysfs-start.bash
Tracing via sysfs
enabling etf
wait before enable etm
enabling etm 0.
enabling etm 1.
enabling etm 2.
enabling etm 3.
waiting for trace
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-etm4x.ko
rmmod: ERROR: Module coresight_etm4x is in use
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-cti.ko
rmmod: ERROR: Module coresight_cti is in use
root@linaro-developer:/home/linaro/scripts# ./trace-sysfs-end.bash
ending trace
disabling etm 0.
disabling etm 1.
disabling etm 2.
disabling etm 3.
disabling etf
16+0 records in
16+0 records out
8192 bytes (8.2 kB) copied, 0.000834116 s, 9.8 MB/s
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-cti.ko
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-etm4x.ko
root@linaro-developer:/home/linaro/scripts# insmod ../cs-mods/coresight-etm4x.ko
root@linaro-developer:/home/linaro/scripts# ./trace-sysfs-start.bash
Tracing via sysfs
enabling etf
wait before enable etm
enabling etm 0.
enabling etm 1.
enabling etm 2.
enabling etm 3.
waiting for trace
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-etm4x.ko
rmmod: ERROR: Module coresight_etm4x is in use
root@linaro-developer:/home/linaro/scripts# rmmod ../cs-mods/coresight-cti.ko
rmmod: ERROR: Module coresight_cti is not currently loaded
root@linaro-developer:/home/linaro/scripts# ./trace-sysfs-end.bash
ending trace
disabling etm 0.
disabling etm 1.
disabling etm 2.
disabling etm 3.
disabling etf
16+0 records in
16+0 records out
8192 bytes (8.2 kB) copied, 0.000819117 s, 10.0 MB/s

}


quoted
Moving forwards we will see users who simply use pre-programmed
settings from client programs such as perf. It may be that some users
do not have the credentials to add and remove modules - this could be
an admin function where larger systems with multiple sessions are in
use.


Regards

Mike

quoted
Thanks, Tingwei
quoted
Regards

Mike
quoted
Let me know if you still think we need hold the module reference.

Thanks, Tingwei
quoted
Regards

Mike


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help