Thread (40 messages) 40 messages, 5 authors, 2019-10-16

Re: [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device()

From: Marta Rybczynska <hidden>
Date: 2019-08-22 17:07:35
Also in: linux-pci


----- On 16 Aug, 2019, at 18:50, Sergey Miroshnichenko s.miroshnichenko@yadro.com wrote:
quoted hunk ↗ jump to hunk
This is a yet another approach to fix an old [1-2] concurrency issue, when:
- two or more devices are being hot-added into a bridge which was
  initially empty;
- a bridge with two or more devices is being hot-added;
- during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges.

The problem is that a bridge is reported as enabled before the MEM/IO bits
are actually written to the PCI_COMMAND register, so another driver thread
starts memory requests through the not-yet-enabled bridge:

CPU0                                        CPU1

pci_enable_device_mem()                     pci_enable_device_mem()
  pci_enable_bridge()                         pci_enable_bridge()
    pci_is_enabled()
      return false;
    atomic_inc_return(enable_cnt)
    Start actual enabling the bridge
    ...                                         pci_is_enabled()
    ...                                           return true;
    ...                                     Start memory requests <-- FAIL
    ...
    Set the PCI_COMMAND_MEMORY bit <-- Must wait for this

Protect the pci_enable/disable_device() and pci_enable_bridge(), which is
similar to the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race
while enabling upstream bridges"), but adding a per-device mutexes and
preventing the dev->enable_cnt from from incrementing early.

CC: Srinath Mannam <redacted>
CC: Marta Rybczynska <redacted>
Signed-off-by: Sergey Miroshnichenko <redacted>

[1]
https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u (local)
   [RFC PATCH v3] pci: Concurrency issue during pci enable bridge

[2]
https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u (local)
   [RFC PATCH] nvme: avoid race-conditions when enabling devices
---
drivers/pci/pci.c   | 26 ++++++++++++++++++++++----
drivers/pci/probe.c |  1 +
include/linux/pci.h |  1 +
3 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 1b27b5af3d55..e7f8c354e644 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1645,6 +1645,8 @@ static void pci_enable_bridge(struct pci_dev *dev)
	struct pci_dev *bridge;
	int retval;

+	mutex_lock(&dev->enable_mutex);
+
	bridge = pci_upstream_bridge(dev);
	if (bridge)
		pci_enable_bridge(bridge);
@@ -1652,6 +1654,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
	if (pci_is_enabled(dev)) {
		if (!dev->is_busmaster)
			pci_set_master(dev);
+		mutex_unlock(&dev->enable_mutex);
		return;
	}
This code is used by numerous drivers and when we've seen that issue I was wondering
if there are some use-cases when this (or pci_disable_device) is called with interrupts
disabled. It seems that it shouldn't be, but a BUG_ON or error when someone calls
it this way would be helpful when debugging.

Marta
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help