Thread (27 messages) 27 messages, 5 authors, 2016-03-15

[RFC PATCH v3 3/3] PCI/ACPI: hisi: Add ACPI support for HiSilicon SoCs Host Controllers

From: Gabriele Paoloni <hidden>
Date: 2016-03-15 11:13:58
Also in: linux-acpi, linux-pci, lkml

Hi Bjorn, many thanks for coming back on this
-----Original Message-----
From: Bjorn Helgaas [mailto:helgaas at kernel.org]
Sent: 14 March 2016 19:17
To: Gabriele Paoloni
Cc: Lorenzo Pieralisi; 'Mark Rutland'; Guohanjun (Hanjun Guo); Wangzhou
(B); liudongdong (C); Linuxarm; qiujiang; 'bhelgaas at google.com';
'arnd at arndb.de'; 'tn at semihalf.com'; 'linux-pci at vger.kernel.org';
'linux-kernel at vger.kernel.org'; xuwei (O); 'linux-
acpi at vger.kernel.org'; 'jcm at redhat.com'; zhangjukuo; Liguozhu
(Kenneth); 'linux-arm-kernel at lists.infradead.org'
Subject: Re: [RFC PATCH v3 3/3] PCI/ACPI: hisi: Add ACPI support for
HiSilicon SoCs Host Controllers

On Wed, Mar 09, 2016 at 07:41:01AM +0000, Gabriele Paoloni wrote:
quoted
Hi Bjorn, Lorenzo
quoted
-----Original Message-----
From: Bjorn Helgaas [mailto:helgaas at kernel.org]
Sent: 02 March 2016 15:51
To: Lorenzo Pieralisi
Cc: Gabriele Paoloni; 'Mark Rutland'; Guohanjun (Hanjun Guo);
Wangzhou
quoted
quoted
(B); liudongdong (C); Linuxarm; qiujiang; 'bhelgaas at google.com';
'arnd at arndb.de'; 'tn at semihalf.com'; 'linux-pci at vger.kernel.org';
'linux-kernel at vger.kernel.org'; xuwei (O); 'linux-
acpi at vger.kernel.org'; 'jcm at redhat.com'; zhangjukuo; Liguozhu
(Kenneth); 'linux-arm-kernel at lists.infradead.org'
Subject: Re: [RFC PATCH v3 3/3] PCI/ACPI: hisi: Add ACPI support
for
quoted
quoted
HiSilicon SoCs Host Controllers

On Tue, Mar 01, 2016 at 07:22:47PM +0000, Lorenzo Pieralisi wrote:
quoted
Hi Bjorn,

On Thu, Feb 25, 2016 at 01:59:12PM -0600, Bjorn Helgaas wrote:
quoted
On Thu, Feb 25, 2016 at 12:07:50PM +0000, Lorenzo Pieralisi
wrote:
quoted
quoted
quoted
quoted
quoted
On Thu, Feb 25, 2016 at 03:01:19AM +0000, Gabriele Paoloni
wrote:
quoted
quoted
quoted
[...]
quoted
quoted
I do not understand how PNP0c02 works, currently, by the way.

If I read x86 code correctly, the unassigned PCI bus
resources
quoted
quoted
are
quoted
quoted
quoted
assigned in arch/x86/pci/i386.c (?)
fs_initcall(pcibios_assign_resources),
quoted
quoted
quoted
with a comment:

/**
 * called in fs_initcall (one below subsys_initcall),
 * give a chance for motherboard reserve resources
 */

Problem is, motherboard resources are requested through (?):

drivers/pnp/system.c

which is also initialized at fs_initcall, so it might be
called
quoted
quoted
after
quoted
quoted
quoted
core x86 code reassign resources, defeating the purpose
PNP0c02
quoted
quoted
was
quoted
quoted
quoted
designed for, namely, request motherboard regions before
resources
quoted
quoted
quoted
are assigned, am I wrong ?
I think you're right.  This is a long-standing screwup in
Linux.
quoted
quoted
quoted
quoted
IMHO, ACPI resources should be parsed and reserved by the ACPI
core,
quoted
quoted
before any PCI resource management (since PCI host bridges are
represented in ACPI).  But historically PCI devices have
enumerated
quoted
quoted
quoted
quoted
before ACPI got involved.  And the ACPI core doesn't really pay
attention to _CRS for most devices (with the exception of
PNP0C02).
quoted
quoted
quoted
quoted
IMO the PNP0C02 code in drivers/pnp/system.c should really be
done
quoted
quoted
in
quoted
quoted
the ACPI core for all ACPI devices, similar to the way the PCI
core
quoted
quoted
quoted
quoted
reserves BAR space for all PCI devices, even if we don't have
drivers
quoted
quoted
for them.  I've tried to fix this in the past, but it is really
a
quoted
quoted
quoted
quoted
nightmare to unravel everything.

Because the ACPI core doesn't reserve resources for the _CRS of
all
quoted
quoted
quoted
quoted
ACPI devices, we're already vulnerable to the problem of
placing a
quoted
quoted
quoted
quoted
device on top of another ACPI device.  We don't see problems
because
quoted
quoted
on x86, at least, most ACPI devices are already configured by
the
quoted
quoted
BIOS
quoted
quoted
to be enabled and non-overlapping.  But x86 has the advantage
of
quoted
quoted
quoted
quoted
having extensive test coverage courtesy of Windows, and as long
as
quoted
quoted
quoted
quoted
_CRS has the right stuff in it, we at least have the potential
of
quoted
quoted
quoted
quoted
fixing problems in Linux.
...
By "fixing problems in Linux" above, you mean that, given that we
do have a validated _CRS space, we can request/reserve the region
the
quoted
quoted
_CRS
quoted
reports to prevent assigning those resources to other devices,
correct ?

Yes.

I think part of what makes this difficult in Linux is that the
resource tree is too strict about overlapping resources.  We get
address space information from E820 (on x86), static ACPI tables
like
quoted
quoted
MCFG, and dynamic things like ACPI _CRS.  There's no real
requirement
quoted
quoted
that the BIOS should make all these consistent, but yet we try to
jam
quoted
quoted
it all into the same resource tree.

For example, E820 might tell us that range A is reserved and
unavailable to Linux.  We stick it in the resource tree.  Then we
might have a _CRS that tells us about range B.  We *want* to put
range
quoted
quoted
B in the resource tree, but if B overlaps part of A, the insert
will
quoted
quoted
fail.

All we really need from E820 is the information that "you can't put
devices in A".  We don't need to enforce any relationship between A
and B, but the current resource tree imposes unnecessary
hierarchical
quoted
quoted
requirements.

I think issues like this are the biggest reason why the ACPI core
doesn't reserve all _CRS space early on (Rafael may correct me
here).
quoted
quoted
quoted
quoted
If the platform doesn't report resource usage correctly on ARM,
we
quoted
quoted
may
quoted
quoted
not find problems (because we don't have the Windows test
suite)
quoted
quoted
and
quoted
quoted
if we have resource assignment problems because _CRS is
lacking,
quoted
quoted
we'll
quoted
quoted
have no way to fix them.
And I think here you mean we can't prevent assigning resource
space
quoted
quoted
to
quoted
devices that do not necessarily own it because since some devices
_CRS
quoted
are borked/missing we have no way to detect the address space
allocated
quoted
to them and we may end up with resources conflicts.
The ACPI core currently doesn't reserve the space consumed by ACPI
devices.  Some drivers, e.g., for PNP0C02 (motherboard) and PNP0A03
(PCI host bridge), do reserve their space, but the core itself does
not.

If we have drivers for all the ACPI devices, those drivers will
probably use _CRS and reserve the space, and we'll probably notice
any
quoted
quoted
_CRS errors.  But if we don't have drivers, e.g., for performance
monitors or other non-essential things, nothing will use _CRS, and
nothing will reserve the space used by the device, and it's hard to
find errors.

If we ever assign top-level resources, there's nothing to prevent
us
quoted
quoted
from creating a conflict.  The only reason we don't trip over this
is
quoted
quoted
that we usually don't assign top-level resources because firmware
does
quoted
quoted
it for us.
It seems that in this thread we have touched quite few issues.

First is how to describe a PCI host controller config space resource:
as you highlighted before mentioning the specs, these resources
should
quoted
be marked as "PNP0C02"; therefore I guess the current Nowicki
patchset
quoted
must be reworked to check the resources to be motherboard reserved
when
quoted
parsing the MCFG table.
I think checking whether MCFG resources are reserved by motherboard
("PNP0C02") devices is a hack that was added on x86 because there were
issues getting ECAM to work reliably.  The theory at the time was that
the problem was BIOS bugs.  I don't know whether that's actually true
or not.

I'm not convinced that checking for PNP0C02 resources is something we
should do in generic code.  MCFG is a static table, and I don't think
we should add dependencies on the ACPI namespace, because the point of
static tables is to describe things we might need before the namespace
is available.
Ok fine, then the current Nowicki patchset is good from this perspective
quoted
Also with respect to the ACPI table for my specific PCIe controller I
would use the following approach:

	// PCIe Root bus
	Device (PCI1)
	{
		Name (_HID, "HISI0080") // PCI Express Root Bridge
		Name (_CID, "PNP0A03") // Compatible PCI Root Bridge
		Name(_SEG, 1) // Segment of this Root complex
		Name(_BBN, 0) // Base Bus Number
		Name(_CCA, 1)
		Method (_CRS, 0, Serialized) { // Root complex resources
			Name (RBUF, ResourceTemplate () {
				WordBusNumber ( // Bus numbers assigned to this
root
quoted
					ResourceProducer, MinFixed, MaxFixed,
PosDecode,
quoted
					0, // AddressGranularity
					0, // AddressMinimum - Minimum Bus Number
					63, // AddressMaximum - Maximum Bus
Number
quoted
					0, // AddressTranslation - Set to 0
					64 // RangeLength - Number of Busses
				)
				QWordMemory ( // 64-bit BAR Windows
					ResourceProducer,
					PosDecode,
					MinFixed,
					MaxFixed,
					Cacheable,
					ReadWrite,
					0x0000000000000000, // Granularity
					0x00000000b0000000, // Min Base Address
pci address
quoted
					0x00000000bbfeffff, // Max Base Address
					0x0000021f54000000, // Translate
					0x000000000bff0000 // Length
				)
				QWordIO (
					ResourceProducer,
					MinFixed,
					MaxFixed,
					PosDecode,
					EntireRange,
					0x0000000000000000, // Granularity
					0x0000000000000000, // Min Base Address
					0x000000000000ffff, // Max Base Address
					0x000002200fff0000, // Translate
					0x0000000000010000 // Length
				)
			}) // Name(RBUF)
			Return (RBUF)
		} // Method(_CRS)
		Device (RES0)
		{
			Name (_HID, "HISI0081") // HiSi PCIe RC config base
address
quoted
			Name (_CID, "PNP0C02")  // Motherboard reserved
resource
quoted
			Name (_CRS, ResourceTemplate (){
                         Memory32Fixed (ReadWrite, 0xb0080000 ,
0x10000)
quoted
			})

		}

So in the table above I have a sub-device under the RC to pass the
address
quoted
for the RC config space (the rest of the config space addresses for
bus 1
quoted
to 63 are passed in the MCFG). As you can see the device _CID is
PNP0C02
quoted
As per ACPI specs.
Do you see anything wrong with this approach?
It looks OK to me.  The PCI Firmware spec r3.0, sec 4.1.2, does say
that the motherboard resource would usually appear at the root of the
namespace (under \_SB).  That's not an absolute statement, but I don't
know why it would be included at all unless the authors thought it was
important for some reason.
Ok great so I'll start reworking my ACPI based quirk according to the
table above

Many Thanks

Gab
quoted
The second issue is when and how to reserve HW resources. As per
conversation above this seems quite a tricky issue and probably needs
to
quoted
consider different aspects...
I don't think resources should be reserved based on MCFG.  Maybe we
need to reserve MCFG areas on x86 for legacy reasons, but I don't
think we should do it on arm64.
quoted
I was wondering if we can take a gradual approach; maybe for the time
being
quoted
we can just rework Nowicki patchset to check the MCFG resources to be
motherboard reserved and later on we can make an effort to fix the
resource
quoted
insertion mechanism making sure that it works right on both x86 and
ARM.
quoted
What do you think about?

Many Thanks

Gab
quoted
quoted
quoted
quoted
As per last Tomasz's patchset, we claim and assign unassigned
PCI
quoted
quoted
quoted
quoted
quoted
resources upon ACPI PCI host bridge probing (which happens at
subsys_initcall time, courtesy of ACPI current code); at that
time the
quoted
quoted
quoted
kernel did not even register the PNP0c02 driver
(drivers/pnp/system.c)
quoted
quoted
quoted
(it does that at fs_initcall). On the other hand, we insert
MCFG
quoted
quoted
quoted
quoted
quoted
regions into the resource tree upon MCFG parsing, so I do not
see why we need to rely on PNP0c02 to do that for us
(granted,
quoted
quoted
the
quoted
quoted
quoted
mechanism is part of the PCI fw specs, which are x86 centric
anyway
quoted
quoted
quoted
ie we can't certainly rely on Int15 e820 to detect reserved
memory
quoted
quoted
quoted
on ARM :D)

There is lots of legacy x86 here and Bjorn definitely has
more
quoted
quoted
quoted
quoted
quoted
visibility into that than I have, the ARM world must
understand
quoted
quoted
quoted
quoted
quoted
how this works to make sure we have an agreement.
As you say, there is lots of unpleasant x86 legacy here.
Possibly
quoted
quoted
ARM
quoted
quoted
has a chance to clean this up and do it more sanely; I'm not
sure
quoted
quoted
quoted
quoted
whether it's feasible to reverse the ACPI/PCI init order there
or
quoted
quoted
not.
quoted
quoted
Rafael, any thoughts on this whole thing?

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-
acpi"
quoted
quoted
in
quoted
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-
info.html
quoted
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi"
in
quoted
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help