Thread (26 messages) 26 messages, 6 authors, 2023-08-03

Re: Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"

From: 陈华才 <hidden>
Date: 2023-06-02 04:06:34
Also in: linux-pci, lkml

+Cc Jianmin Lv

Hi, Jianmin,

You are the most familiar person in this field, could you please give some suggestions? Thank you.

Huacai

-----原始邮件-----
发件人: "Vladimir Oltean" [off-list ref]
发送时间:2023-06-02 06:15:32 (星期五)
收件人: "Bjorn Helgaas" [off-list ref]
抄送: linux-pci@vger.kernel.org, netdev@vger.kernel.org, "Bjorn Helgaas" [off-list ref], "Rob Herring" [off-list ref], "Claudiu Manoil" [off-list ref], "Michael Walle" [off-list ref], linux-kernel@vger.kernel.org, "Liu Peibao" [off-list ref], "Binbin Zhou" [off-list ref], "Huacai Chen" [off-list ref]
主题: Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"

On Thu, Jun 01, 2023 at 12:51:39PM -0500, Bjorn Helgaas wrote:
quoted
quoted
quoted
Doing it in Linux would minimize dependences on the bootloader, so
that seems desirable to me. That means Linux needs to enumerate
Function 0 so it is visible to a driver or possibly a quirk.
Uhm... no, that wouldn't be enough. Only a straight revert would satisfy
the workaround that we currently have for NXP ENETC in Linux.
I guess you mean a revert of 6fffbc7ae137?
Yes.
quoted
This whole conversation is about whether we can rework 6fffbc7ae137 to
work both for Loongson and for you, so nothing is decided yet.
After reading
https://lore.kernel.org/linux-pci/20221117020935.32086-1-liupeibao@loongson.cn/ (local)
and
https://lore.kernel.org/linux-pci/20221103090040.836-1-liupeibao@loongson.cn/ (local)
and seeing the GMAC OF node at arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi,
I believe that a solution that would work for both Loongson and NXP would be to:

- patch loongson_dwmac_probe() to check for of_device_is_available()
- revert commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled
  status")

I'm not sure what else of what was concretely proposed would work.
Anything else is just wishful thinking that the PCI core can start
enforcing a central policy, after letting device drivers get to choose
how (and whether) to treat the "status" OF property for years on end.

As an added benefit, the disabled GMAC would become visible in lspci for
the Loongson SoC.
quoted
The point is, I assume you agree that it's preferable if we don't have
to depend on a bootloader to clear the memory.
I am confused by the message you are transmitting here.

With my user hat on, yes, maintaining the effect of commit 3222b5b613db
from Linux is preferable.

Although Rob will probably not be happy about the way in which that will
be achieved. And you haven't proposed ways in which that would remain
possible, short of a revert of commit 6fffbc7ae137.
quoted
After 6fffbc7ae137, the probe function is not called if the device is
disabled in DT because there's no pci_dev for it at all.
Correct, but commit 3222b5b613db pre-dates it by 2 years, and thus, it
is broken by Rob's change.
quoted
quoted
My problem is that I don't really understand what was the functional
need for commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled
status") in the first place, considering that any device driver can
already fail to probe based on the same condition at its own will.
In general, PCI drivers shouldn't rely on DT.  If the bus driver (PCI
in this case) calls a driver's probe function, the driver can assume
the device exists.
Well, the device exists...
quoted
But enetc is not a general-purpose driver, and if DT is the only way
to discover this property, I guess you're stuck doing that.
So what Loongson tried to do - break enumeration of the on-chip GMAC
PCIe device at the level of the PCIe controller, if the GMAC's pinmuxing
doesn't make it available for networking - is encouraged?

Do you consider that their patch would have been better in the original
form, if instead of the "skip-scan" property, they would have built some
smarts into drivers/pci/controller/pci-loongson.c which would intentionally
break config space access to gmac@3,0, without requiring OF to specify this?

Are you saying that this "present but unusable due to pinmuxing" is an
incorrect use of status = "disabled"? What would it constitute correct
use of, then?

The analogous situation for ENETC would be to patch the "pci-host-ecam-generic"
driver to read the SERDES and pinmuxing configuration of the SoC, and to
mask/unmask the config access to function 0 based on that. I mean - I could...
but is it really a good idea? The principle of separation of concerns
tells me no. The fact that the pinmuxing of the device makes it unavailable
pertains to the IP-specific logic, it doesn't change whether it's enumerable
or accessible on its bus.
quoted
quoted
quoted
Is DT the only way to learn the NXP SERDES configuration?  I think it
would be much better if there were a way to programmatically learn it,
because then you wouldn't have to worry about syncing the DT with the
platform configuration, and it would decouple this from the Loongson
situation.
Syncing the DT with the platform configuration will always be necessary,
because for networking we will also need extra information which is
completely non-discoverable, like a phy-handle or such, and that depends
on the wiring and static pinmuxing of the SoC. So it is practically
reasonable to expect that what is usable has status = "okay", and what
isn't has status = "disabled". Not to mention, there are already device
trees in circulation which are written that way, and those need to
continue to work.
Just because we need DT for non-discoverable info A doesn't mean we
should depend on it for B if B *is* discoverable.
But the argument was: we already have device trees with a certain
convention, and that is to expect having status = "disabled" for
unusable ports. I don't believe that changing that is realistically in
scope for fixing this. And if we have device trees with status =
"disabled" in circulation which we (I) don't want to break, then we're
back to square 1 regarding the probing of disabled devices.
quoted
This question of disabling a device via DT but still needing to do
things to the device is ... kind of a sticky wicket.
It boils down to whether accessing a disabled device is permitted or
not. I opened the devicetree specification and it didn't say anything
conclusive. Though it's certainly above my pay grade to say anything
with certainty in this area. Apart from "okay" and "disabled", "status"
takes other documented values too, like "reserved", "fail" and
"fail-sss". Linux treats everything that's not "okay" the same.
Krzysztof Kozlowski came with the suggestion for Loongson to replace
"skip-scan" with "status", during the review of their v1 patch.

In any case, that question will only recur one level lower - in U-Boot,
where we make an effort to keep device trees in sync in Linux. Why would
U-Boot need to do things to a disabled device? :)
quoted
Maybe this should be a different DT property (not "status").  Then PCI
enumeration could work normally and 6fffbc7ae137 wouldn't be in the
way.
I'm not quite sure where you're going with this. More concretely?

本邮件及其附件含有龙芯中科的商业秘密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制或散发)本邮件及其附件中的信息。如果您错收本邮件,请您立即电话或邮件通知发件人并删除本邮件。 
This email and its attachments contain confidential information from Loongson Technology , which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it. 
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help