Thread (7 messages) 7 messages, 3 authors, 2018-08-14

Re: [PATCH RFC net-next] openvswitch: Queue upcalls to userspace in per-port round-robin order

From: Stefano Brivio <hidden>
Date: 2018-08-03 18:49:57

Possibly related (same subject, not in this thread)

Hi Ben,

On Tue, 31 Jul 2018 15:06:57 -0700
Ben Pfaff [off-list ref] wrote:
This is an awkward problem to try to solve with sockets because of the
nature of sockets, which are strictly first-in first-out.  What you
really want is something closer to the algorithm that we use in
ovs-vswitchd to send packets to an OpenFlow controller.  When the
channel becomes congested, then for each packet to be sent to the
controller, OVS appends it to a queue associated with its input port.
(This could be done on a more granular basis than just port.)  If the
maximum amount of queued packets is reached, then OVS discards a packet
from the longest queue.  When space becomes available in the channel,
OVS round-robins through the queues to send a packet.  This achieves
pretty good fairness but it can't be done with sockets because you can't
drop a packet that is already queued to one.
Thanks for your feedback. What you describe is, though, functionally
equivalent to what this patch does, minus the per-port queueing limit.

However, instead of having one explicit queue for each port, and
then fetching packets in a round-robin fashion from all the queues, we
implemented this with a single queue and choose insertion points while
queueing in such a way that the result is equivalent. This way, we
avoid the massive overhead associated with having one queue per each
port (we can have up to 2^16 ports), and cycling over them.

Let's say we have two ports, A and B, and three upcalls are sent for
each port. If we implement one queue for each port as you described, we
end up with this:

.---------------- - - -
| A1 | A2 | A3 |
'---------------- - - -

.---------------- - - -
| B1 | B2 | B3 |
'---------------- - - -

and then send upcalls in this order: A1, B1, A2, B2, A3, B3.

What we are doing here with a single queue is inserting the upcalls
directly in this order:

.------------------------------- - - -
| A1 | B1 | A2 | B2 | A3 | B3 |
'------------------------------- - - -

and dequeueing from the head.

About the per-port queueing limit: we currently have a global one
(UPCALL_QUEUE_MAX_LEN), while the per-port limit is simply given by
implementation constraints in our case:

		if (dp->upcalls.count[pos->port_no] == U8_MAX - 1) {
			err = -ENOSPC;
			goto out_clear;
		}

but we can easily swap that U8_MAX - 1 with another macro or a
configurable value, if there's any value in doing that.
My current thought is that any fairness scheme we implement directly in
the kernel is going to need to evolve over time.  Maybe we could do
something flexible with BPF and maps, instead of hard-coding it.
Honestly, I fail to see what else we might want to do here, other than
adding a simple mechanism for fairness, to solve the specific issue at
hand. Flexibility would probably come at a higher cost. We could easily
make limits configurable if needed. Do you have anything else in mind?

-- 
Stefano
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help