RE: [RFC PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works... | linux-hyperv


RE: [RFC PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: 2020-03-30 12:24:25
Also in: lkml

Michael Kelley [off-list ref] writes:

From: Andrea Parri <parri.andrea@gmail.com> Sent: Saturday, March 28, 2020 10:09 AM

quoted

In case we believe that OFFER -> RESCIND sequence is always ordered
by the host AND we don't care about other offers in the queue the
suggested locking is OK: we're guaranteed to process RESCIND after we
finished processing OFFER for the same channel. However, waiting for
'offer_in_progress == 0' looks fishy so I'd suggest we at least add a
comment explaining that the wait is only needed to serialize us with
possible OFFER for the same channel - and nothing else. I'd personally
still slightly prefer the algorithm I suggested as it guarantees we take
channel_mutex with offer_in_progress == 0 -- even if there are no issues
we can think of today (not strongly though).

Does it?  offer_in_progress is incremented without channel_mutex...

No, it does not, you're right, by itself the change is insufficient.

quoted

IAC, I have no objections to apply the changes you suggested.  To avoid
misunderstandings: vmbus_bus_suspend() presents a similar usage...  Are
you suggesting that I apply similar changes there?

Alternatively:  FWIW, the comment in vmbus_onoffer_rescind() does refer
to "The offer msg and the corresponding rescind msg...".  I am all ears
if you have any concrete suggestions to improve these comments.

Given that waiting for 'offer_in_progress == 0' is the current code, I think
there's an argument to be made for not changing it if the change isn't strictly
necessary.  This patch set introduces enough change that *is* necessary. :-)

Sure. I was thinking a bit more about this and it seems that over the years
we've made the synchronization of channels code too complex (every time
for a good reason but still). Now (before this series) we have at least:

vmbus_connection.channel_mutex
vmbus_connection.offer_in_progress
channel.probe_done
channel.rescind
Workqueues (vmbus_connection.work_queue,
 queue_work_on(vmbus_connection.connect_cpu),...)
channel.lock spinlock (the least of the problems)

Maybe there's room for improvement? Off the top of my head I'd suggest a
state machine for each channel (e.g. something like
OFFERED->OPENING->OPEN->RESCIND_REQ->RESCINDED->CLOSED) + refcounting
(subchannels, open/rescind/... requests in progress, ...) + non-blocking
request handling like "Can we handle this rescind offer now? No,
refcount is too big. OK, rescheduling the work". Maybe not the best
design ever and I'd gladly support any other which improves the
readability of the code and makes all state changes and synchronization
between them more obvious.
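
For illustration only, the state machine plus refcounting idea above could be
sketched roughly as follows. This is a userspace C sketch, not kernel code and
not part of the patch series: every type and function name here (struct chan,
chan_try_rescind(), ...) is hypothetical, the only thing taken from the message
is the state sequence, and locking is left out entirely (a real implementation
would have to protect the state and the refcount).

```c
#include <stdbool.h>

/* Hypothetical states, following the sequence named in the message. */
enum chan_state {
	CHAN_OFFERED,
	CHAN_OPENING,
	CHAN_OPEN,
	CHAN_RESCIND_REQ,
	CHAN_RESCINDED,
	CHAN_CLOSED,
};

/* Hypothetical bookkeeping; this is NOT the kernel's struct vmbus_channel. */
struct chan {
	enum chan_state state;
	int refcount;	/* subchannels, open/rescind/... requests in progress */
};

/* Outcome of a non-blocking attempt to handle a request. */
enum chan_result {
	CHAN_DONE,	/* transition applied */
	CHAN_RETRY,	/* still referenced: caller reschedules the work */
	CHAN_INVALID,	/* request is not legal in the current state */
};

/*
 * Assumption: a rescind request may arrive in any pre-rescind state;
 * every other transition is a single step forward in the sequence.
 */
static bool chan_transition_ok(enum chan_state from, enum chan_state to)
{
	if (to == CHAN_RESCIND_REQ)
		return from == CHAN_OFFERED || from == CHAN_OPENING ||
		       from == CHAN_OPEN;
	return to == from + 1;
}

static enum chan_result chan_set_state(struct chan *c, enum chan_state to)
{
	if (!chan_transition_ok(c->state, to))
		return CHAN_INVALID;
	c->state = to;
	return CHAN_DONE;
}

/* Take or drop a reference for a request in progress. */
static void chan_get(struct chan *c) { c->refcount++; }
static void chan_put(struct chan *c) { c->refcount--; }

/* Record that the host asked for a rescind; never waits. */
static enum chan_result chan_request_rescind(struct chan *c)
{
	return chan_set_state(c, CHAN_RESCIND_REQ);
}

/*
 * "Can we handle this rescind now?" If references remain, return
 * CHAN_RETRY instead of blocking so the caller can requeue the work.
 */
static enum chan_result chan_try_rescind(struct chan *c)
{
	if (c->state != CHAN_RESCIND_REQ)
		return CHAN_INVALID;
	if (c->refcount > 0)
		return CHAN_RETRY;
	c->state = CHAN_RESCINDED;
	return CHAN_DONE;
}
```

In this sketch a work item that gets CHAN_RETRY would simply be queued again,
so no handler ever sleeps waiting on another one; whether that maps cleanly
onto the existing workqueues is exactly the open question raised above.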

Note, VMBus channel handling driven by messages (unlike events for ring
buffer) is not performance critical, we just need to ensure completeness
(all requests are handled correctly) with forward progress guarantees
(no deadlocks).

I understand the absence of 'hot' issues in the current code is what can
make the virtue of redesign questionable and sorry for hijacking the
series which doesn't seem to make things worse :-)

-- 
Vitaly
