[PATCH V7 00/11] Support for generic ACPI based PCI host controller
From: Gabriele Paoloni <hidden>
Date: 2016-05-20 08:41:27
Also in:
linux-acpi, linux-pci, lkml
Hi Ard

-----Original Message-----
From: Ard Biesheuvel [mailto:ard.biesheuvel at linaro.org]
Sent: 20 May 2016 09:29
To: Jon Masters
Cc: Tomasz Nowicki; Gabriele Paoloni; helgaas at kernel.org;
 arnd at arndb.de; will.deacon at arm.com; catalin.marinas at arm.com;
 rafael at kernel.org; hanjun.guo at linaro.org;
 Lorenzo.Pieralisi at arm.com; okaya at codeaurora.org;
 jchandra at broadcom.com; linaro-acpi at lists.linaro.org;
 linux-pci at vger.kernel.org; dhdang at apm.com; Liviu.Dudau at arm.com;
 ddaney at caviumnetworks.com; jeremy.linton at arm.com;
 linux-kernel at vger.kernel.org; linux-acpi at vger.kernel.org;
 robert.richter at caviumnetworks.com; Suravee.Suthikulpanit at amd.com;
 msalter at redhat.com; Wangyijing; mw at semihalf.com;
 andrea.gallo at linaro.org; linux-arm-kernel at lists.infradead.org
Subject: Re: [PATCH V7 00/11] Support for generic ACPI based PCI host
 controller

> On 20 May 2016 at 10:01, Jon Masters [off-list ref] wrote:
>> Hi Ard,
>>
>> On 05/20/2016 03:37 AM, Ard Biesheuvel wrote:
>>> On 20 May 2016 at 06:41, Jon Masters [off-list ref] wrote:
>>>> Hi Tomasz, all,
>>>>
>>>> On 05/11/2016 07:08 AM, Tomasz Nowicki wrote:
>>>>> On 11.05.2016 12:41, Gabriele Paoloni wrote:
>>>>>>> v6 -> v7
>>>>>>> - drop quirks handling
>>>>>>
>>>>>> Maybe I missed something in the v6 discussion thread; when was it
>>>>>> decided to drop quirk handling?
>>>>>
>>>>> I had such requests in previous series.
>>>>
>>>> A quick note on quirk handling. This, I believe, applies post-merge
>>>> of the base infrastructure, which I realize will likely not have
>>>> quirks.
>>>>
>>>> We've some "gen1" ARMv8 server platforms where we end up doing
>>>> quirks (for things like forcing 32-bit config space accessors and
>>>> the like) due to people repurposing existing embedded PCIe IP blocks
>>>> or using them for the first time (especially in servers), and those
>>>> being involved in the design not necessarily seeing this problem
>>>> ahead of time, or not realizing that it would be an issue for
>>>> servers. In the early days of ARM server designs 3-4 years ago, many
>>>> of us had never really played with ECAM or realized how modern
>>>> topologies are built.
>>>>
>>>> Anyway. We missed this one in our SBSA requirements. They say (words
>>>> to the effect of) "thou shalt do PCIe the way it is done on servers"
>>>> but they aren't prescriptive, and they don't tell people how that
>>>> actually is in reality. That is being fixed. A lot of things are
>>>> happening behind the scenes - especially with third party IP block
>>>> providers (all of whom myself and others are speaking with directly
>>>> about this) - to ensure that the next wave of designs won't repeat
>>>> these mistakes. We don't have a time machine, but we can contain
>>>> this from becoming an ongoing mess for upstream, and we will do so.
>>>> It won't be a zoo.
>>>>
>>>> Various proposals have arisen for how to handle quirks in the longer
>>>> term, including elaborate frameworks and tables to describe them
>>>> generically. I would like to caution against such approaches,
>>>> especially in the case that they deviate from practice on x86, or
>>>> prior to being standardized fully with other Operating System
>>>> vendors. I don't expect there to be too many more than the existing
>>>> initial set of quirks we have seen posted. A number of "future"
>>>> server SoCs have already been fixed prior to silicon, and new design
>>>> starts are being warned not to make this a problem for us to have to
>>>> clean up later.
>>>>
>>>> So, I would like to suggest that the eventual framework mirror the
>>>> existing approach on x86 systems (matching DMI, etc.) and not be
>>>> made into some kind of generic, utopia. This is a case where we want
>>>> there to be pain involved (and upstream patches required) when
>>>> people screw up, so that they have a level of pain in response to
>>>> ever making this mistake in the future. If we try to create too
>>>> grand a generic scheme and make it too easy to handle this kind of
>>>> situation beyond the small number of existing offenders, we
>>>> undermine efforts to force vendors to ensure that their IP blocks
>>>> are compliant going forward.
>>>
>>> I understand that there is a desire from the RedHat side to mimic x86
>>> as closely as possible, but I never saw any technical justification
>>> for that.
>>
>> Understood. My own motivation is always to make the experience as
>> familiar as possible, both for end users, as well as for ODMs and the
>> entire ecosystem. There are very many ODMs currently working on v8
>> server designs and they're already expecting this to be "just like
>> x86". Intentionally. But as to the specifics of using DMI...
>>
>>> DMI contains strings that are visible to userland, and you
>>> effectively lock those down to certain values just so that the kernel
>>> can distinguish a broken PCIe root complex from a working one. Linux
>>> on x86 had no choice, since the overwhelming majority of existing
>>> hardware misrepresented itself as generic, and DMI was the only thing
>>> available to actually distinguish these broken implementations from
>>> one another. This does not mean we should allow and/or encourage this
>>> first gen hardware to misrepresent non-compliant hardware as
>>> compliant as well.
>>
>> That's a very reasonable argument. I don't disagree that it would be
>> nice to have nicer ways to distinguish the non-compliant IP than
>> treating the whole platform with an ASCII matching sledgehammer.
>>
>>> Since you are talking to all the people involved, how about you
>>> convince them to put something in the ACPI tables that allows the
>>> kernel to distinguish those non-standard PCIe implementations from
>>> hardware that is really generic?
>>
>> I'm open to this *BUT* it has to be something that will be adopted
>> beyond Linux. I have reached out to some non-Linux folks about this.
>> If there's buy-in, and if there's agreement to go standardize it
>> through the ASWG, then we should do so. What we should not do is treat
>> ARM as special in a way that the others aren't involved with. I'll
>> admit DMI ended up part of the SBBR in part because I wrote that piece
>> in with the assumption that exactly the same matches as on x86 would
>> happen.
>
> Is the PCIe root complex so special that you cannot simply describe an
> implementation that is not PNP0A08 compatible as something else, under
> its own unique HID? If everybody is onboard with using ACPI, how is
> this any different from describing other parts of the platform
> topology? Even if the SBSA mandates generic PCI, they already deviated
> from that when they built the hardware, so pretending that it is a
> PNP0A08 with quirks really does not buy us anything.

From my understanding we want to avoid this, as it would allow each
vendor to come up with its own code, and it would be much more effort
for the PCI maintainer to rework the PCI framework to accommodate x86
and "all" ARM64 host controllers. I guess this approach is too risky
and we want to avoid it.

Through standardization we can more easily maintain the code and scale
it to multiple SoCs.

So this is my understanding; maybe Jon, Tomasz or Lorenzo can give a
bit more explanation.

Thanks
Gab

>>> This way, we can sidestep the quirks debate entirely, since it will
>>> simply be a different device as far as the kernel is concerned. This
>>> is no worse than a quirk from a practical point of view, since an
>>> older OS will be equally unable to run on newer hardware, but it is
>>> arguably more true to the standards compliance you tend to preach
>>> about, especially since this small pool of third party IP could
>>> potentially be identified directly rather than based on some
>>> divination of the SoC we may or may not be running on. I am also
>>> convinced that adding support for an additional HID() to the ACPI
>>> ECAM driver with some special config space handling wired in is an
>>> easier sell upstream than making the same ugly mess x86 has had to
>>> make because they did not have any choice to begin with.
>>
>> Again, open to it. I just don't want to do something that's Linux
>> specific. So it'll take time. It would be awesome if an interim quirk
>> solution existed that got platforms that are shipping in production
>> (e.g. HP Moonshot) actually booting upstream kernels this year. We
>> /really/ want F25 to be able to run on these without needing to carry
>> an out-of-tree quirk patch or just not support them. That's much
>> worse.
>
> My whole point is that we don't need quirks in the first place if
> non-compliant devices are not being misrepresented as compliant ones.
> This is fine for other platform devices, i.e., SATA, network, so
> again, why is PCIe so special that we *must* use a generic ID + quirks
> rather than a specific ID?
>
>>> If we do need a quirks handling mechanism, I still don't see how the
>>> x86 situation extrapolates to ARM. ACPI offers plenty of ways for a
>>> SoC vendor to identify the make and particular revision, and quirks
>>> could be keyed off of that.
>>
>> Fair enough. Is there any traction for an interim solution for these
>> initial platforms do you think? If we wait to add a new table it's
>> probably going to be the end of the year before we get this done.
>
> The 'interim solution' is to come to terms with the fact that these
> initial platforms are not SBSA compliant, contain a PCIe root complex
> that is not PNP0A08 but can be identified by its own HID, and we make
> the software work with that. This means our message from the beginning
> is that, yes, you can have non-compliant hardware and the burden is on
> you to get it supported upstream, and no, you don't get to hang out
> with the cool SBSA kids if you decide to go that route
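
For reference, the "specific HID" idea discussed above can be sketched in a few lines of plain C. This is only an illustration under stated assumptions, not code from the patch series or from the kernel: the vendor HID "VNDR0001", the table layout and every function name below are hypothetical, and config space is modelled as a little-endian byte array rather than real MMIO. The generic accessor issues an access of exactly the requested width; the restricted one only ever reads aligned 32-bit words and extracts the requested bytes, which is the kind of "forcing 32-bit config space accessors" handling mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Width in bytes of the most recent simulated bus access (illustration only). */
static int last_bus_width;

/* Simulated bus read: assemble 'size' little-endian bytes starting at p. */
static uint32_t bus_read(const uint8_t *p, int size)
{
    uint32_t v = 0;

    for (int i = 0; i < size; i++)
        v |= (uint32_t)p[i] << (8 * i);
    last_bus_width = size;
    return v;
}

/* Reads 'size' (1, 2 or 4) bytes at byte offset 'off' of config space. */
typedef uint32_t (*cfg_read_fn)(const uint8_t *cfg, unsigned int off, int size);

/* Compliant ECAM: the bus access has exactly the requested width. */
static uint32_t cfg_read_generic(const uint8_t *cfg, unsigned int off, int size)
{
    assert(size == 1 || size == 2 || size == 4);
    return bus_read(cfg + off, size);
}

/* Restricted hardware: read the aligned 32-bit word, then shift and mask. */
static uint32_t cfg_read_32bit(const uint8_t *cfg, unsigned int off, int size)
{
    uint32_t word, mask;

    assert(size == 1 || size == 2 || size == 4);
    word = bus_read(cfg + (off & ~3u), 4);
    mask = (size == 4) ? 0xffffffffu : (1u << (8 * size)) - 1;
    return (word >> (8 * (off & 3))) & mask;
}

/* Each host bridge HID selects its accessor; no DMI string matching. */
struct host_match {
    const char *hid;
    cfg_read_fn read;
};

static const struct host_match host_table[] = {
    { "PNP0A08",  cfg_read_generic }, /* ACPI ID of a PCI Express root bridge */
    { "VNDR0001", cfg_read_32bit },   /* hypothetical vendor-specific HID */
};

/* Returns the accessor registered for 'hid', or NULL if it is unknown. */
static cfg_read_fn host_lookup(const char *hid)
{
    for (size_t i = 0; i < sizeof(host_table) / sizeof(host_table[0]); i++)
        if (strcmp(host_table[i].hid, hid) == 0)
            return host_table[i].read;
    return NULL;
}
```

With this shape, supporting another non-compliant root complex means adding one table entry and, if needed, one accessor, while an unknown HID yields NULL and is simply not claimed. A DMI-based quirk would instead keep the generic HID and select the accessor from platform identification strings.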