Thread (35 messages) 35 messages, 4 authors, 2022-01-11

Re: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state

From: Cornelia Huck <cohuck@redhat.com>
Date: 2021-12-14 12:08:57
Also in: kvm

On Mon, Dec 13 2021, Alex Williamson [off-list ref] wrote:
We do specify that a migration driver has discretion in using the error
state for failed transitions, so there are options to simplify error
handling.

If we look at bit flips, we have:

Initial state
|  Resuming
|  |  Saving
|  |  |  Running
|  |  |  |  Next states with multiple bit flips

a) 0, 0, 0  (d)
b) 0, 0, 1  (c) (e)
c) 0, 1, 0  (b) (e)
d) 0, 1, 1  (a) (e)
e) 1, 0, 0  (b) (c) (d)
f) 1, 0, 1 UNSUPPORTED
g) 1, 1, 0 ERROR
h) 1, 1, 1 INVALID

We specify that we cannot pass through any invalid states during
transition, so if I consider each bit to be a discrete operation and
map all these multi-bit changes to a sequence of single bit flips, the
only requirements are:

  1) RESUMING must be cleared before setting SAVING or RUNNING
  2) SAVING or RUNNING must be cleared before setting RESUMING

I think the basis of your priority scheme comes from that.  Ordering
of the remaining items is more subtle though, for instance
0 -> SAVING | RUNNING can be broken down as:

  0 -> SAVING
  SAVING -> SAVING | RUNNING 

  vs

  0 -> RUNNING
  RUNNING -> SAVING | RUNNING

I'd give preference to enabling logging before running and I believe
that holds for transition (e) -> (d) as well.
Agreed.
In the reverse case, SAVING | RUNNING -> 0

  SAVING | RUNNING -> RUNNING
  RUNNING -> 0

  vs

  SAVING | RUNNING -> SAVING
  SAVING -> 0

This one is more arbitrary.  I tend to favor clearing RUNNING to stop
the device first, largely because it creates nice symmetry in the
resulting algorithm and follows the general principle that you
discovered as well, converge towards zero by addressing bit clearing
before setting.
That also makes sense to me.
So a valid algorithm would include:

int set_device_state(u32 old, u32 new, bool is_unwind)
{
	if (old.RUNNING && !new.RUNNING) {
		curr.RUNNING = 0;
		if (ERROR) goto unwind;
	}
	if (old.SAVING && !new.SAVING) {
		curr.SAVING = 0;
		if (ERROR) goto unwind;
	}
	if (old.RESUMING && !new.RESUMING) {
		curr.RESUMING = 0;
		if (ERROR) goto unwind;
	}
	if (!old.RESUMING && new.RESUMING) {
		curr.RESUMING = 1;
		if (ERROR) goto unwind;
	}
	if (!old.SAVING && new.SAVING) {
		curr.saving = 1;
		if (ERROR) goto unwind;
	}
	if (!old.RUNNING && new.RUNNING) {
		curr.RUNNING = 1;
		if (ERROR) goto unwind;
        }

	return 0;

unwind:
	if (!is_unwind) {
		ret = set_device_state(curr, old, true);
		if (ret) {
			curr.raw = ERROR;
			return ret;
		}
	}

	return -EIO;
}
I've stared at this and scribbled a bit on paper and I think this would
work.
And I think that also addresses the claim that we're doomed to untested
and complicated error code handling, we unwind by simply swapping the
args to our set state function and enter the ERROR state should that
recursive call fail.
Nod. As we clear RUNNING as the first thing and would only set RUNNING
again as the last step, any transition to ERROR should be save.
I think it would be reasonable for documentation to present similar
pseudo code as a recommendation, but ultimately the migration driver
needs to come up with something that fits all the requirements.
Yes, something like this should go into Documentation/.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help