Thread: 36 messages, 10 authors, 2017-07-07

[PATCH v9 2/3] PCI: Add tango PCIe host bridge support

From: Ard Biesheuvel <hidden>
Date: 2017-07-03 18:44:36
Also in: linux-pci, lkml

On 3 July 2017 at 19:11, Russell King - ARM Linux [off-list ref] wrote:
On Mon, Jul 03, 2017 at 08:40:31AM -0500, Bjorn Helgaas wrote:
The problem is serializing vs. memory accesses, since they don't use
any wrappers.  However, they are ioremapped(), so it's at least
conceivable that another solution would be to use VM to trap those
accesses.  I'm not a VM person, so I don't know whether that's
feasible in Linux.

Bjorn,

You're forgetting that MMIO (iow, memory returned by ioremap()) must
be accessed through the appropriate accessors, and must not be
directly dereferenced in C.  (We do have buggy drivers that do that
but they are buggy, and in many cases are getting attention to fix
that.)

However, adding a spinlock into them is really not nice, because it
adds extra overhead that's only necessary for rare cases like Sigma
Designs - especially when you consider that these accessors are used
for all MMIO accesses, not just PCI.  It would effectively mean that
we end up serialising all MMIO accesses throughout the kernel when
Sigma Designs SoCs are enabled, destroying some of the SMP benefit.

I don't think we can sanely use the MMU to trap those accesses either,
that would mean sending IPIs to tell other CPUs to do something, and
waiting for them to respond - which can deadlock if we're already in
an IRQ-protected region (iirc, config accesses are made with IRQs
off.)

I don't think there's an easy solution to this problem - and I'm not
sure that stop_machine() can be made to work in this path (which
needs a process context).  I have a suspicion that the Sigma Designs
PCI implementation is just so insane that it's never going to work
reliably in a multi-SoC kernel without introducing severe performance
issues for everyone else.

I suppose we could perhaps use per-cpu spinlocks? That would put the
complexity in the Sigma config space accessors, i.e., to take each
lock before proceeding with reprogramming the outbound window, and
other implementations wouldn't have to care. However, I do agree with
Russell that having this complexity in the first place is hard to
justify if the only implementation that requires it is a wacky design
that needs lots of other quirks to operate somewhat sanely to begin
with.
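
To make the per-cpu lock idea concrete, here is a rough userspace sketch,
not kernel code. It assumes (this is not spelled out above) that an MMIO
access takes only the lock of the CPU it runs on, while the config
accessor takes every CPU's lock before switching the window. All names
(NR_CPUS, cpu_lock, model_mmio_read, model_config_read) are hypothetical,
and pthread mutexes stand in for spinlocks:

```c
#include <pthread.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this model */

/* One lock per CPU; stands in for a per-cpu spinlock. */
static pthread_mutex_t cpu_lock[NR_CPUS];

/* Models the single shared window: it maps either MMIO or config space. */
enum window_mode { WINDOW_MMIO, WINDOW_CONFIG };
static enum window_mode window = WINDOW_MMIO;

/* Made-up register contents, for illustration only. */
static uint32_t fake_mmio_reg = 0x1234;
static uint32_t fake_cfg_reg = 0xabcd;

static void model_init(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		pthread_mutex_init(&cpu_lock[i], NULL);
}

/*
 * MMIO read issued on 'cpu': takes only that CPU's lock, so MMIO
 * accesses on different CPUs do not contend with each other.
 * Returns 1 if the window was in MMIO mode, 0 otherwise.
 */
static int model_mmio_read(int cpu, uint32_t *val)
{
	int ok;

	pthread_mutex_lock(&cpu_lock[cpu]);
	ok = (window == WINDOW_MMIO);
	*val = ok ? fake_mmio_reg : 0xffffffffu;
	pthread_mutex_unlock(&cpu_lock[cpu]);
	return ok;
}

/*
 * Config read: take every CPU's lock (always in ascending order, so two
 * config accessors cannot deadlock against each other), switch the
 * window, do the access, switch back, then release all locks.
 */
static uint32_t model_config_read(void)
{
	uint32_t val;

	for (int i = 0; i < NR_CPUS; i++)
		pthread_mutex_lock(&cpu_lock[i]);

	window = WINDOW_CONFIG;		/* no MMIO access can run now */
	val = fake_cfg_reg;
	window = WINDOW_MMIO;		/* restore before anyone can look */

	for (int i = NR_CPUS - 1; i >= 0; i--)
		pthread_mutex_unlock(&cpu_lock[i]);
	return val;
}
```

In this model the common path pays for one uncontended local lock, and
the rare config access pays for NR_CPUS lock acquisitions. It does not
address the IRQ and process-context concerns raised above.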