Thread (19 messages) 19 messages, 3 authors, 2012-11-28

Re: [PATCH v6 2/6] PM / Runtime: introduce pm_runtime_set_memalloc_noio()

From: Ming Lei <hidden>
Date: 2012-11-28 13:51:17
Also in: linux-mm, lkml, netdev

On Wed, Nov 28, 2012 at 6:06 PM, Rafael J. Wysocki [off-list ref] wrote:
Well, it may be unfrequent, but does it mean it has to do things that may
be avoided (ie. walking the children of every node in the path in some cases)?
I agree so without introducing extra cost, :-)
I don't really think that the counters would cost us that much anyway.
On ARM v7, sizeof(struct device) becomes 376 from 368 after introducing
'unsigned int            noio_cnt;' to 'struct dev_pm_info', and total memory
increases about 3752bytes in a small configuration(about 494 device instance).
The actual memory increase should be more than the data because 'struct device'
is generally embedded into other concrete device structure.
quoted
Also looks the current implementation of pm_runtime_set_memalloc_noio()
is simple and clean enough with the flag, IMO.
I know you always know better. :-)
We still need to consider cost and the function calling frequency, :-)
quoted
quoted
I would use the flag only to store the information that
pm_runtime_set_memalloc_noio(dev, true) has been run for this device directly
and I'd use a counter for everything else.

That is, have power.memalloc_count that would be incremented when (1)
pm_runtime_set_memalloc_noio(dev, true) is called for that device and (2) when
power.memalloc_count for one of its children changes from 0 to 1 (and
analogously for decrementation).  Then, check the counter in rpm_callback().
Sorry, could you explain in a bit detail why we need the counter? Looks only
checking the flag in rpm_callback() is enough, doesn't it?
Why would I want to use power.memalloc_count in addition to the
power.memalloc_noio flag?

Consider this:

pm_runtime_set_memalloc_noio(dev):
        return if power.memalloc_noio is set
        set power.memalloc_noio
  loop:
        increment power.memalloc_count
        if power.memalloc_count is 1 now switch to parent and go to loop
I am wondering if the above should be changed to below because the child
count of memalloc_noio device need to be recorded.

pm_runtime_set_memalloc_noio(dev):
         return if power.memalloc_noio is set
         set power.memalloc_noio
loop:
         increment power.memalloc_count
         switch to parent and go to loop

So pm_runtime_set_memalloc_noio(dev) will become worse than
the improved pm_runtime_set_memalloc_noio(dev, true), which
can return immediately if one dev or parent's flag is true.
pm_runtime_clear_memalloc_noio(dev):
        return if power.memalloc_noio is unset
        unset power.memalloc_noio
  loop:
        decrement power.memalloc_count
        if power.memalloc_count is 0 now switch to parent and go to loop
The above will perform well than pm_runtime_set_memalloc_noio(dev, false),
because the above avoids to walk children of device.

So one becomes worse and another becomes better, :-)

Also the children count of one device is generally very small, less than
10 for most devices, see the data obtained in one common x86 pc(thinkpad
t410) from below link:

        http://kernel.ubuntu.com/~ming/up/t410-dev-child-cnt.log

- about 8 devices whose child count is more than 10, top three are 18, 17 ,12,
and all the three are root devices.

- about 117 devices whose child count is between 1 and 9

- other 501 devices whose child count is zero
From above data, walking device children should have not much effect on
performance of pm_runtime_set_memalloc_noio(), which is also called in
very infrequent path.
Looks kind of simpler, doesn't it?
Looks simpler, but more code lines than single
pm_runtime_set_memalloc_noio(), :-)
And why rpm_callback() should check power.memalloc_count instead of the count?
Because power.memalloc_noio will only be set for devices that
pm_runtime_set_memalloc_noio(dev) was called for directly (not necessarily for
the parents).

And that works even if someone calls any of them twice in a row for the same
device (presumably by mistake) and doesn't have to make any assumptions
about devices it is called for.
IMO, we can ignore the mistake usage because the function is called only
in network/block core code currently, not by individual driver.
quoted
quoted
Besides, don't you need to check children for the arg device itself?
It isn't needed since the children of network/block device can't be
involved of the deadlock in runtime PM path.

Also, the function is only called by network device or block device
subsystem, both the two kind of device are class device and should
have no children.
OK, so not walking the arg device's children is an optimization related to
some assumptions regarding who's supposed to use this routine.  That should
be clearly documented.
I think the patch already documents it in the comment of
pm_runtime_set_memalloc_noio().

Thanks,
--
Ming Lei

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help