Thread (13 messages) 13 messages, 3 authors, 2021-12-15

Re: [PATCH v5 1/2] x86/PCI: Ignore E820 reservations for bridge windows on newer systems

From: Bjorn Helgaas <helgaas@kernel.org>
Date: 2021-12-15 16:02:01
Also in: linux-acpi, linux-pci, lkml

Possibly related (same subject, not in this thread)

On Tue, Dec 07, 2021 at 05:52:40PM +0100, Hans de Goede wrote:
On 11/10/21 14:05, Hans de Goede wrote:
quoted
On 11/10/21 09:45, Hans de Goede wrote:
quoted
On 11/9/21 23:07, Bjorn Helgaas wrote:
quoted
On Sat, Nov 06, 2021 at 11:15:07AM +0100, Hans de Goede wrote:
quoted
On 10/20/21 23:14, Bjorn Helgaas wrote:
quoted
On Wed, Oct 20, 2021 at 12:23:26PM +0200, Hans de Goede wrote:
quoted
On 10/19/21 23:52, Bjorn Helgaas wrote:
quoted
On Thu, Oct 14, 2021 at 08:39:42PM +0200, Hans de Goede wrote:
quoted
Some BIOS-es contain a bug where they add addresses which map to system
RAM in the PCI host bridge window returned by the ACPI _CRS method, see
commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
space").

To work around this bug Linux excludes E820 reserved addresses when
allocating addresses from the PCI host bridge window since 2010.
...
quoted
quoted
I haven't seen anybody else eager to merge this, so I guess I'll stick
my neck out here.

I applied this to my for-linus branch for v5.15.
Thank you, and sorry about the build-errors which the lkp
kernel-test-robot found.

I've just send out a patch which fixes these build-errors
(verified with both .config-s from the lkp reports).
Feel free to squash this into the original patch (or keep
them separate, whatever works for you).
Thanks, I squashed the fix in.

HOWEVER, I think it would be fairly risky to push this into v5.15.
We would be relying on the assumption that current machines have all
fixed the BIOS defect that 4dc2287c1805 addressed, and we have little
evidence for that.

I'm not sure there's significant benefit to having this in v5.15.
Yes, the mainline v5.15 kernel would work on the affected machines,
but I suspect most people with those machines are running distro
kernels, not mainline kernels.
I understand that you were reluctant to add this to 5.15 so close
near the end of the 5.15 cycle, but can we please get this into
5.16 now ?

I know you ultimately want to see if there is a better fix,
but this is hitting a *lot* of users right now and if we come up
with a better fix we can always use that to replace this one
later.
I don't know whether there's a "better" fix, but I do know that if we
merge what we have right now, nobody will be looking for a better
one.

We're in the middle of the merge window, so the v5.16 development
cycle is over.  The v5.17 cycle is just starting, so we have time to
hit that.  Obviously a fix can be backported to older kernels as
needed.
quoted
So can we please just go with this fix now, so that we can
fix the issues a lot of users are seeing caused by the current
*wrong* behavior of taking the e820 reservations into account ?
I think the fix on the table is "ignore E820 for BIOS date >= 2018"
plus the obvious parameters to force it both ways.
Correct.
quoted
The thing I don't like is that this isn't connected at all to the
actual BIOS defect.  We have no indication that current BIOSes have
fixed the defect,
We also have no indication that that defect from 10 years ago, from
pre UEFI firmware is still present in modern day UEFI firmware which
is basically an entire different code-base.

And even 10 years ago the problem was only happening to a single
family of laptop models (Dell Precision laptops) so this clearly
was a bug in that specific implementation and not some generic
issue which is likely to be carried forward.
quoted
and we have no assurance that future ones will not
have the defect.  It would be better if we had some algorithmic way of
figuring out what to do.
You yourself have said that in hindsight taking E820 reservations
into account for PCI bridge host windows was a mistake. So what
the "ignore E820 for BIOS date >= 2018" is doing is letting the
past be the past (without regressing on older models) while fixing
that mistake on any hardware going forward.

In the unlikely case that we hit that BIOS bug again on 1 or 2 models,
we can simply DMI quirk those models, as we do for countless other
BIOS issues.
quoted
Thank you very much for chasing down the dmesg log archive
(https://github.com/linuxhw/Dmesg; see
https://lore.kernel.org/r/82035130-d810-9f0b-259e-61280de1d81f@redhat.com (local)).
Unfortunately I haven't had time to look through it myself, and I
haven't heard of anybody else doing it either.
Right, I'm afraid that I already have spend way too much time on this
myself. Note that I've been working with users on this bug on and off
for over a year now.

This is hitting many users and now that we have a viable fix, this
really needs to be fixed now.

I believe that the "ignore E820 for BIOS date >= 2018" fix is good
enough and that you are letting perfect be the enemy of good here.

As an upstream kernel maintainer myself, I'm sorry to say this,
but if we don't get some fix for this merged soon you are leaving
my no choice but to add my fix to the Fedora kernels as a downstream
patch (and to advise other distros to do the same).

Note that if you are still afraid of regressions going the downstream
route is also an opportunity, Fedora will start testing moving users
to 5.15.y soon, so I could add the patch to Fedora's 5.15.y builds and
see how that goes ?
So I've discussed this with the Fedora kernel maintainers and they have
agreed to add the patch to the Fedora 5.15 kernels, which we will ask
our users to start testing soon (we first run some voluntary testing
before eventually moving all users over).

This will provide us with valuable feedback wrt this patch causing
regressions as you are worried about, or not.

Assuming no regressions show up I hope that this will give you
some assurance that there the patch causes no regressions and that
you will then be willing to pick this up later during the 5.16
cycle so that Fedora only deviates from upstream for 1 cycle.
5.15.y kernels with this patch added have been in Fedora's
stable updates repo for a while now without any reports of the
regressions you feared this may cause.

Bjorn, I hope that you are willing to merge this patch now that it has
seen some more wide spread testing ?
I'm still not happy about the idea of basing this on BIOS dates.  I
did this with 7bc5e3f2be32 ("x86/PCI: use host bridge _CRS info by
default on 2008 and newer machines"), and it was a mistake.

Because of that mistake, we now have the use_crs/nocrs kernel
parameters, which confuse users and lead to them being passed around
as "fixes" on random bulletin boards.

Adding another BIOS date check and use_e820/no_e820 kernel parameters
feels like it's layering on more complexity to cover up another major
mistake I made, 4dc2287c1805 ("x86: avoid E820 regions when allocating
address space").

I think it would be better for the code to recognize the situation
addressed by 4dc2287c1805 and deal with it directly.  Is that
possible?  I dunno; I don't think we've really tried.

Bjorn
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help