Thread: 7 messages, 4 authors, 2021-04-22

RE: [PATCH] PCI: hv: Fix a race condition when removing the device

From: Dexuan Cui <decui@microsoft.com>
Date: 2021-04-22 02:31:32
Also in: linux-pci, lkml

> From: Michael Kelley <redacted>
> Sent: Wednesday, April 21, 2021 2:06 PM
>  ...
> > Yes I think put_hvpcibus() and get_hvpcibus() can be removed, as we have
> > changed to use a dedicated workqueue for hbus since they were introduced.
> >
> > But we still need to call tasklet_disable/enable() the same way
> > hv_pci_suspend() does, the reason is that we need to protect hbus->state.
> > This value needs to be consistent for the driver. For example, a CPU may
> > decide to schedule a work on a work queue that we just flushed or
> > destroyed, by reading the wrong hbus->state.
>
> Yes, I would agree the tasklet disable/enable are needed, especially since
> tasklet_disable() is what ensures that the tasklet is not currently running.
>
> If the hbus ref counting isn't needed any longer, I would strongly recommend
> adding a patch to the series that removes it.  This synchronization stuff is
> hard enough to understand and reason about; having a leftover mechanism that
> doesn't really do anything useful makes it nearly impossible. :-)
>
> Dexuan -- I'm hoping you can take a look as well and see if you agree.
>
> Michael

I also think we can remove the reference counting.

But it looks like there is still a race in hv_pci_remove() even with Long's
patch. In hv_pci_remove(), we disable the tasklet, change hbus->state to
hv_pcibus_removing, re-enable the tasklet, flush hbus->wq, and then set
hbus->state to hv_pcibus_removed. What if the channel callback runs again
at that point? Now hbus->state is no longer hv_pcibus_removing, so
hv_pci_devices_present() -> hv_pci_start_relations_work() and
hv_pci_eject_device() can still add new work items to hbus->wq, and the new
work items may race with the vmbus_close().

It looks like we should remove the state hv_pcibus_removed?
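
To make the window concrete, here is a minimal single-threaded user-space
model of the sequence above. It is only a sketch, not driver code: all
model_* names and MODEL_* constants are hypothetical, the "flush" is reduced
to resetting a counter, and it assumes the callback paths refuse to queue
work only while the state is hv_pcibus_removing, as described above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bus states named in this thread. */
enum model_pcibus_state {
	MODEL_PCIBUS_INSTALLED,
	MODEL_PCIBUS_REMOVING,
	MODEL_PCIBUS_REMOVED,
};

struct model_hbus {
	enum model_pcibus_state state;
	int queued_work;	/* work items sitting on the modeled hbus->wq */
};

/*
 * Models the callback path (assumption: it only checks for "removing").
 * Returns true if a new work item was queued.
 */
static bool model_callback_queue_work(struct model_hbus *hbus)
{
	if (hbus->state == MODEL_PCIBUS_REMOVING)
		return false;
	hbus->queued_work++;
	return true;
}

/*
 * Models the remove sequence: mark "removing", flush the queue, and
 * optionally move on to "removed" as the current code does.
 */
static void model_remove(struct model_hbus *hbus, bool use_removed_state)
{
	hbus->state = MODEL_PCIBUS_REMOVING;	/* done with the tasklet disabled in the driver */
	hbus->queued_work = 0;			/* stands in for flushing hbus->wq */
	if (use_removed_state)
		hbus->state = MODEL_PCIBUS_REMOVED;
}

/* Number of work items a callback arriving after model_remove() can queue. */
static int model_late_callback(bool use_removed_state)
{
	struct model_hbus hbus = { MODEL_PCIBUS_INSTALLED, 0 };

	model_remove(&hbus, use_removed_state);
	model_callback_queue_work(&hbus);
	return hbus.queued_work;
}
```

In this model, model_late_callback(true) leaves one work item queued after
the flush, while model_late_callback(false), which keeps the state at
"removing", leaves none. The real driver is concurrent, so this only
illustrates the state check, not the timing.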

Thanks,
-- Dexuan