Thread (3 messages) 3 messages, 2 authors, 2018-02-26
STALE3021d

[PATCH] PCIe bridge deferred probe breaks suspend resume

From: rafael@kernel.org (Rafael J. Wysocki)
Date: 2018-02-25 09:29:47
Also in: linux-pci, linux-pm

Possibly related (same subject, not in this thread)

On Sat, Feb 24, 2018 at 2:59 PM, Lukas Wunner [off-list ref] wrote:
On Sat, Feb 24, 2018 at 10:47:26AM +0100, Rafael J. Wysocki wrote:
quoted
On Sat, Feb 24, 2018 at 10:35 AM, Rafael J. Wysocki [off-list ref] wrote:
quoted
On Sat, Feb 24, 2018 at 1:46 AM, Feng Kan [off-list ref] wrote:
quoted
This is not a patch, but rather a question regarding the deferred
probe's effect on PCIe PM ordering. This happens on our system
which defer the probing of root bridge due to  the IOMMU not being
ready. Because of the deferred action, the bridge is moved to the
end of the dpm_list which results in incorrect suspend and resume
sequence.

In the cases I have seen, the bridge is always reordered because of
startup sequence. They are always place after the endpoint. If that
is the case the following code should be able to prevent such cases.
However, is there some cases here that would violate such situation?
The code in dd.c assumes that the device being probed has no children
(or consumers, for that matter) and so it is safe to move it to the
end of the list.

If the device has children (or consumers), moving it to the end of the
list by itself doesn't work, which is the case for you.

You can try to replace the device_pm_move_last(dev) in
deferred_probe_work_func(struct() with device_reorder_to_tail(), but
that has to be called under device_links_read_lock/unloc () and
device_pm_lock/unloc() (in the right order).
Alternatively, you can replace your !dev_is_pci(dev) check with a
check for the presence of children or consumers and only move the
device to the end of the PM list if there are not any.  That would
kind of make sense, but it may break other assumptions in the deferred
probing mechanism which I don't recall ATM.

Or avoid deferred probing of the host bridge driver entirely.  I guess
you can use it with limited functionality until the IOMMU driver is
ready and switch over to the fully functional operation mode when that
happens, but that would need to hook into the IOMMU code somehow.
Or model the root bridge's dependency on the iommu using a device link:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html
Apparently, there are children registered under the bridge device
before the driver for it is probed.

That is kind of unusual and I'm not really sure why it happens at all,
because (in theory) the bridge driver should be responsible for
assigning resources to the devices on the bus, among other things.
Depending on what the reason is, your suggestion may or may not work.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help