Thread: 33 messages, 4 authors, 2018-10-19

Re: [PATCH v4 01/18] of: overlay: add tests to validate kfrees from overlay removal

From: Alan Tull <atull@kernel.org>
Date: 2018-10-18 20:25:35
Also in: linux-devicetree, linux-fpga, lkml

On Wed, Oct 17, 2018 at 4:30 PM Alan Tull [off-list ref] wrote:
On Mon, Oct 15, 2018 at 9:39 PM [off-list ref] wrote:

Hi Frank,
From: Frank Rowand <redacted>

Add checks:
  - attempted kfree due to refcount reaching zero before overlay
    is removed
  - properties linked to an overlay node when the node is removed
  - node refcount > one during node removal in a changeset destroy,
    if the node was created by the changeset

After applying this patch, several validation warnings will be
reported from the devicetree unittest during boot due to
pre-existing devicetree bugs. The warnings will be similar to:

  OF: ERROR: of_node_release() overlay node /testcase-data/overlay-node/test-bus/test-unittest11/test-unittest111 contains unexpected properties
  OF: ERROR: memory leak - destroy cset entry: attach overlay node /testcase-data-2/substation@100/hvac-medium-2 expected refcount 1 instead of 2.  of_node_get() / of_node_put() are unbalanced for this node.

Signed-off-by: Frank Rowand <redacted>
---
Changes since v3:
  - Add expected value of refcount for destroy cset entry error.  Also
    explain the cause of the error.

 drivers/of/dynamic.c | 29 +++++++++++++++++++++++++++++
 drivers/of/overlay.c |  1 +
 include/linux/of.h   | 15 ++++++++++-----
 3 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index f4f8ed9b5454..24c97b7a050f 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -330,6 +330,25 @@ void of_node_release(struct kobject *kobj)
        if (!of_node_check_flag(node, OF_DYNAMIC))
                return;

+       if (of_node_check_flag(node, OF_OVERLAY)) {
+
+               if (!of_node_check_flag(node, OF_OVERLAY_FREE_CSET)) {
+                       /* premature refcount of zero, do not free memory */
+                       pr_err("ERROR: memory leak %s() overlay node %pOF before free overlay changeset\n",
+                              __func__, node);
+                       return;
+               }
+
+               /*
+                * If node->properties non-empty then properties were added
+                * to this node either by different overlay that has not
+                * yet been removed, or by a non-overlay mechanism.
+                */
+               if (node->properties)
+                       pr_err("ERROR: %s() overlay node %pOF contains unexpected properties\n",
+                              __func__, node);
+       }
+
        property_list_free(node->properties);
        property_list_free(node->deadprops);
@@ -434,6 +453,16 @@ struct device_node *__of_node_dup(const struct device_node *np,

 static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
 {
+       if (ce->action == OF_RECONFIG_ATTACH_NODE &&
+           of_node_check_flag(ce->np, OF_OVERLAY)) {
+               if (kref_read(&ce->np->kobj.kref) > 1) {
+                       pr_err("ERROR: memory leak - destroy cset entry: attach overlay node %pOF expected refcount 1 instead of %d.  of_node_get() / of_node_put() are unbalanced for this node.\n",
+                              ce->np, kref_read(&ce->np->kobj.kref));

Still testing as much as I have time to do.

I'm hitting this error message once when removing an overlay that adds
several child nodes.  The only node I get the message for was a node
that added a fixed-clock (the other nodes didn't trigger the error).
Then even if I edited all the rest of the overlay DTS and removed all
other child nodes and all references to the clock from other nodes, I
still got the error.

Removing dtbo: 1-socfpga_arria10_socdk_sdmmc_ghrd_ovl_ext_cfg.dtb
[   72.032270] OF: ERROR: memory leak - destroy cset entry: attach overlay node /soc/base_fpga_region/clk_0 expected refcount 1 instead of 2.  of_node_get() / of_node_put() are unbalanced for this node.

Update: with some helpful offline debug patches from Frank, I was able
to find the source of the of_node_get()/of_node_put() imbalance.  The
fixed-rate clock driver calls of_clk_add_provider() when probed but
never calls of_clk_del_provider().

This patchset will quite likely uncover other of_node_get()/of_node_put()
imbalances around the kernel.
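
To illustrate the kind of imbalance described above, here is a minimal
user-space sketch, not kernel code.  It assumes that registering a clock
provider takes a reference on the node and that only unregistering drops
it.  struct mock_node and every mock_*() helper are hypothetical
stand-ins invented for this illustration; only the refcount is modeled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct device_node: name and refcount only. */
struct mock_node {
	const char *name;
	int refcount;
};

/* Models of_node_get(): take one reference. */
static void mock_node_get(struct mock_node *np)
{
	np->refcount++;
}

/* Models of_node_put(): drop one reference. */
static void mock_node_put(struct mock_node *np)
{
	assert(np->refcount > 0);
	np->refcount--;
}

/* Assumption: registering a provider holds a reference on the node. */
static void mock_add_provider(struct mock_node *np)
{
	mock_node_get(np);
}

/* Assumption: unregistering the provider drops that reference. */
static void mock_del_provider(struct mock_node *np)
{
	mock_node_put(np);
}

/*
 * Mirrors the idea of the check in the patch: when the changeset entry
 * is destroyed, only the changeset's own reference should remain.
 * Returns 1 if a leak is reported, 0 otherwise.
 */
static int mock_cset_entry_leaks(const struct mock_node *np)
{
	if (np->refcount > 1) {
		printf("ERROR: memory leak - destroy cset entry: attach overlay node %s expected refcount 1 instead of %d.\n",
		       np->name, np->refcount);
		return 1;
	}
	return 0;
}

/*
 * Apply the overlay, probe the driver, optionally run a remove path,
 * then check the node as the changeset entry is destroyed.
 */
static int mock_run(int driver_has_remove)
{
	/* The changeset holds the initial reference. */
	struct mock_node clk = { "/soc/base_fpga_region/clk_0", 1 };

	mock_add_provider(&clk);		/* probe */
	if (driver_has_remove)
		mock_del_provider(&clk);	/* remove */

	return mock_cset_entry_leaks(&clk);
}
```

With these helpers, mock_run(0) models a driver that probes but never
unregisters its provider, and trips the check; mock_run(1) models a
balanced driver and does not.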

Alan

Here's the very stripped-down overlay:

/dts-v1/;
/plugin/;
/ {
        fragment@0 {
                target-path = "/soc/base_fpga_region";
                #address-cells = <1>;
                #size-cells = <1>;

                __overlay__ {
                        external-fpga-config;

                        #address-cells = <1>;
                        #size-cells = <1>;

                        clk_0: clk_0 {
                                compatible = "fixed-clock";
                                #clock-cells = <0>;
                                clock-frequency = <100000000>;  /* 100.00 MHz */
                                clock-output-names = "clk_0-clk";
                        };
                };
        };
};