[PATCH] PCI: Add information about describing PCI in ACPI
From: Gabriele Paoloni <hidden>
Date: 2016-11-22 13:14:11
Also in:
linux-acpi, linux-pci, lkml
Subsystem:
pci subsystem, the rest · Maintainers:
Bjorn Helgaas, Linus Torvalds
Hi Bjorn
-----Original Message----- From: Bjorn Helgaas [mailto:helgaas at kernel.org] Sent: 21 November 2016 20:10 To: Gabriele Paoloni Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux- acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm- kernel at lists.infradead.org; linaro-acpi at lists.linaro.org Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI On Mon, Nov 21, 2016 at 05:23:11PM +0000, Gabriele Paoloni wrote:quoted
Hi Bjornquoted
-----Original Message----- From: linux-pci-owner at vger.kernel.org [mailto:linux-pci- owner at vger.kernel.org] On Behalf Of Bjorn Helgaas Sent: 21 November 2016 16:47 To: Gabriele Paoloni Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux- acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm- kernel at lists.infradead.org; linaro-acpi at lists.linaro.org Subject: Re: [PATCH] PCI: Add information about describing PCI inACPIquoted
quoted
On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:quoted
Hi Bjornquoted
-----Original Message----- From: Bjorn Helgaas [mailto:helgaas at kernel.org] Sent: 18 November 2016 17:54 To: Gabriele Paoloni Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux- acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm- kernel at lists.infradead.org; linaro-acpi at lists.linaro.org Subject: Re: [PATCH] PCI: Add information about describing PCIinquoted
quoted
ACPIquoted
quoted
On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloniwrote:quoted
quoted
quoted
quoted
quoted
quoted
-----Original Message----- From: linux-kernel-owner at vger.kernel.org [mailto:linux-kernel-quoted
quoted
quoted
quoted
quoted
quoted
owner at vger.kernel.org] On Behalf Of Bjorn Helgaas Sent: 17 November 2016 18:00quoted
quoted
+Static tables like MCFG, HPET, ECDT, etc., are *not*mechanismsquoted
quoted
forquoted
quoted
+reserving address space! The static tables are for thingsthequoted
quoted
OSquoted
quoted
quoted
quoted
+needs to know early in boot, before it can parse the ACPInamespace.quoted
quoted
+If a new table is defined, an old OS needs to operatecorrectlyquoted
quoted
evenquoted
quoted
+though it ignores the table. _CRS allows that because itisquoted
quoted
quoted
quoted
genericquoted
quoted
+and understood by the old OS; a static table does not.Right so if my understanding is correct you are saying thatresourcesquoted
quoted
quoted
described in the MCFG table should also be declared inPNP0C02quoted
quoted
quoted
quoted
devicesquoted
so that the PNP driver can reserve these resources.Yes.quoted
On the other side the PCI Root bridge driver should notreservequoted
quoted
suchquoted
quoted
quoted
resources. Well if my understanding is correct I think we have a problemhere:quoted
quoted
quoted
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74 As you can see pci_ecam_create() will conflict with the pnpdriverquoted
quoted
quoted
as it will try to reserve the resources from the MCFGtable...quoted
quoted
quoted
quoted
quoted
Maybe we need to rework pci_ecam_create() ?I think it's OK as it is. The pnp/system.c driver does try to reserve PNP0C02 resources,andquoted
quoted
itquoted
quoted
marks them as "not busy". That way they appear in /proc/iomemandquoted
quoted
quoted
quoted
won't be allocated for anything else, but they can still berequestedquoted
quoted
by drivers, e.g., pci/ecam.c, which will mark them "busy". This is analogous to what the PCI core does inpci_claim_resource().quoted
quoted
This is really a function of the ACPI/PNP *core*, which shouldreservequoted
quoted
all _CRS resources for all devices (not just PNP0C02 devices).Butquoted
quoted
quoted
quoted
it's done by pnp/system.c, and only for PNP0C02, becausethere's aquoted
quoted
quoted
quoted
bunch of historical baggage there. You'll also notice that in this case, things are out of order: logically the pnp/system.c reservation should happen first, butinquoted
quoted
quoted
quoted
fact the pci/ecam.c request happens *before* the pnp/system.cone.quoted
quoted
quoted
quoted
That means the pnp/system.c one might fail and complain "[mem...]quoted
quoted
quoted
quoted
could not be reserved".Correct me if I am wrong... So currently we are relying on the fact that pci_ecam_create() iscalledquoted
before the pnp driver. If the pnp driver came first we would end up in pci_ecam_create()failingquoted
here: http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76 I am not sure but it seems to me like a bit weak condition torelyquoted
quoted
on...quoted
what about removing the error condition in pci_ecam_create() andloggingquoted
just a dev_info()?Huh. I'm confused. I *thought* it would be safe to reverse the order, which would effectively be this: system_pnp_probe reserve_resources_of_dev reserve_range request_mem_region([mem 0xb0000000-0xb1ffffff]) ... pci_ecam_create request_resource_conflict([mem 0xb0000000-0xb1ffffff]) but I experimented with the patch below on qemu, and it failed asyouquoted
quoted
predicted: ** res test ** requested [mem 0xa0000000-0xafffffff] can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict withECAMquoted
quoted
PNP [mem 0xa0000000-0xafffffff] I expected the request_resource_conflict() to succeed since it's completely contained in the "ECAM PNP" region. But I guess I don't understand kernel/resource.c well enough.I think it fails because effectively the PNP driver is populating the iomem_resource resource tree and therefore pci_ecam_create() findsthatquoted
it cannot add the cfg resource to the same hierarchy as it is already there...Right. I'm just surprised because the PNP reservation is marked "not busy", and a driver (e.g., ECAM) should still be able to request the resource.
Yes unfortunately pci_ecam_create() is not flexible on the conflict as pci_request_regions(): http://lxr.free-electrons.com/source/kernel/resource.c#L1155 if the conflict resource is not busy pci_request_regions() will create a child resource under the conflict sibling and mark it as busy... or at least this is my understanding...
quoted
quoted
I'm not sure we need to fix anything yet, since we currently do the ecam.c request before the system.c one, and any change there wouldbequoted
quoted
a long ways off. If/when that *does* change, I think the correctfixquoted
quoted
would be to change ecam.c so its request succeeds (by changing thewayquoted
quoted
it does the request, fixing kernel/resource.c, or whatever) rather than to reduce the log level and ignore the failure.Well in my mind I didn't want just to make the error disappear... If all the resources should be reserved by the PNP driver thenideallyquoted
we could take away request_resource_conflict() frompci_ecam_create(),quoted
but this would make buggy some systems with an already shipped BIOS that relied on pci_ecam_create() reservation rather than PNPreservation. I don't want remove the request from ecam.c. Ideally, there should be TWO lines in /proc/iomem: one from system.c for "pnp 00:01" or whatever it is, and a second from ecam.c. The first is the generic one saying "this region is consumed by a piece of hardware, so don't put anything else here." The second is the driver-specific one saying "PCI ECAM owns this region, nobody else can use it." This is the same way we handle PCI BAR resources. Here are two examples from my laptop. The first (00:08.0) only has one line: it has a BAR that consumes address space, but I don't have a driver for it loaded. The second (00:16.0) does have a driver loaded, so it has a second line showing that the driver owns the space: f124a000-f124afff : 0000:00:08.0 # from PCI core f124d000-f124dfff : 0000:00:16.0 # from PCI core f124d000-f124dfff : mei_me # from mei_me driverquoted
Just removing the error condition and converting dev_err() into dev_info() seems to me like accommodating already shipped BIOS images and flagging a reservation that is already done by somebody else without compromising the functionality of the PCI Root bridge driver (so far the only reason why I can see the error condition there is to catch a buggy MCFG with overlapping addresses; so if this is the case maybe we need to have a different diagnostic check to make sure that the MCFG table is alright)Ideally I think we should end up with this: a0000000-afffffff : pnp 00:01 a0000000-afffffff : PCI ECAM
I think that for PCIe device drivers it works ok because it is guaranteed that their own pci_request_regions() is called always after pci_claim_resource() of the bridge that is on top of them... I.e. pci_claim_resource() reserves the resources as not busy and pci_request_regions() will create a child busy resource
Realistically right now we'll probably end up with only the "PCI ECAM" line in /proc/iomem and a warning from system.c about not being able to reserve the space. If we ever change things to do the generic PNP reservation first, then we should fix things so ecam.c can claim the space without an error.
Maybe the patch below could be a sort of solution...effectively pci_ecam should succeed in reserving a busy resource under the conflict resource in case of PNP driver allocating a non BUSY resource first... --- drivers/pci/ecam.c | 16 +++++----------- drivers/pci/host/pci-thunder-ecam.c | 2 +- include/linux/pci-ecam.h | 2 +- 3 files changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c
index 43ed08d..999b6ef 100644
--- a/drivers/pci/ecam.c
+++ b/drivers/pci/ecam.c@@ -66,16 +66,10 @@ struct pci_config_window *pci_ecam_create(struct device *dev, } bsz = 1 << ops->bus_shift; - cfg->res.start = cfgres->start; - cfg->res.end = cfgres->end; - cfg->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY; - cfg->res.name = "PCI ECAM"; - - conflict = request_resource_conflict(&iomem_resource, &cfg->res); - if (conflict) { + cfg->res = request_mem_region(cfgres->start, resource_size(cfgres), "PCI ECAM"); + if (!cfg->res) { err = -EBUSY; - dev_err(dev, "can't claim ECAM area %pR: address conflict with %s %pR\n", - &cfg->res, conflict->name, conflict); + dev_err(dev, "can't claim ECAM area %pR\n", &cfg->res); goto err_exit; }
@@ -126,8 +120,8 @@ void pci_ecam_free(struct pci_config_window *cfg) if (cfg->win) iounmap(cfg->win); } - if (cfg->res.parent) - release_resource(&cfg->res); + if (cfg->res->parent) + release_region(cfg->res->start, resource_size(cfg->res)); kfree(cfg); }
diff --git a/drivers/pci/host/pci-thunder-ecam.c b/drivers/pci/host/pci-thunder-ecam.c
index d50a3dc..2e48d9d 100644
--- a/drivers/pci/host/pci-thunder-ecam.c
+++ b/drivers/pci/host/pci-thunder-ecam.c@@ -117,7 +117,7 @@ static int thunder_ecam_p2_config_read(struct pci_bus *bus, unsigned int devfn, * the config space access window. Since we are working with * the high-order 32 bits, shift everything down by 32 bits. */ - node_bits = (cfg->res.start >> 32) & (1 << 12); + node_bits = (cfg->res->start >> 32) & (1 << 12); v |= node_bits; set_val(v, where, size, val);
diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
index 7adad20..f30a4ea 100644
--- a/include/linux/pci-ecam.h
+++ b/include/linux/pci-ecam.h@@ -36,7 +36,7 @@ struct pci_ecam_ops { * use ECAM. */ struct pci_config_window { - struct resource res; + struct resource *res; struct resource busr; void *priv; struct pci_ecam_ops *ops;
--
2.7.4