Thread (26 messages) 26 messages, 6 authors, 2021-09-28

Re: Redux: Backwards compatibility for XDP multi-buff

From: Zvi Effron <hidden>
Date: 2021-09-21 23:10:27
Also in: netdev

On Tue, Sep 21, 2021 at 3:14 PM Toke Høiland-Jørgensen [off-list ref] wrote:
Zvi Effron [off-list ref] writes:
quoted
On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen [off-list ref] wrote:
quoted
Zvi Effron [off-list ref] writes:
quoted
On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen [off-list ref] wrote:
quoted
Hi Lorenz (Cc. the other people who participated in today's discussion)

Following our discussion at the LPC session today, I dug up my previous
summary of the issue and some possible solutions[0]. Seems no on
actually replied last time, which is why we went with the "do nothing"
approach, I suppose. I'm including the full text of the original email
below; please take a look, and let's see if we can converge on a
consensus here.

First off, a problem description: If an existing XDP program is exposed
to an xdp_buff that is really a multi-buffer, while it will continue to
run, it may end up with subtle and hard-to-debug bugs: If it's parsing
the packet it'll only see part of the payload and not be aware of that
fact, and if it's calculating the packet length, that will also only be
wrong (only counting the first fragment).

So what to do about this? First of all, to do anything about it, XDP
programs need to be able to declare themselves "multi-buffer aware" (but
see point 1 below). We could try to auto-detect it in the verifier by
which helpers the program is using, but since existing programs could be
perfectly happy to just keep running, it probably needs to be something
the program communicates explicitly. One option is to use the
expected_attach_type to encode this; programs can then declare it in the
source by section name, or the userspace loader can set the type for
existing programs if needed.

With this, the kernel will know if a given XDP program is multi-buff
aware and can decide what to do with that information. For this we came
up with basically three options:

1. Do nothing. This would make it up to users / sysadmins to avoid
   anything breaking by manually making sure to not enable multi-buffer
   support while loading any XDP programs that will malfunction if
   presented with an mb frame. This will probably break in interesting
   ways, but it's nice and simple from an implementation PoV. With this
   we don't need the declaration discussed above either.

2. Add a check at runtime and drop the frames if they are mb-enabled and
   the program doesn't understand it. This is relatively simple to
   implement, but it also makes for difficult-to-understand issues (why
   are my packets suddenly being dropped?), and it will incur runtime
   overhead.

3. Reject loading of programs that are not MB-aware when running in an
   MB-enabled mode. This would make things break in more obvious ways,
   and still allow a userspace loader to declare a program "MB-aware" to
   force it to run if necessary. The problem then becomes at what level
   to block this?
I think there's another potential problem with this as well: what happens to
already loaded programs that are not MB-aware? Are they forcibly unloaded?
I'd say probably the opposite: You can't toggle whatever switch we end
up with if there are any non-MB-aware programs (you'd have to unload
them first)...
How would we communicate that issue? dmesg? I'm not very familiar with
how sysctl change failure causes are communicated to users, so this
might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer
1` (or whatever ends up actually being the toggle) to active
multi-buffer, and it fails because there's a loaded non-aware program,
that seems like a potential for a lot of administrator pain.
Hmm, good question. Document that this only fails if there's a
non-mb-aware XDP program loaded? Or use some other mechanism with better
feedback?
quoted
quoted
quoted
quoted
   Doing this at the driver level is not enough: while a particular
   driver knows if it's running in multi-buff mode, we can't know for
   sure if a particular XDP program is multi-buff aware at attach time:
   it could be tail-calling other programs, or redirecting packets to
   another interface where it will be processed by a non-MB aware
   program.

   So another option is to make it a global toggle: e.g., create a new
   sysctl to enable multi-buffer. If this is set, reject loading any XDP
   program that doesn't support multi-buffer mode, and if it's unset,
   disable multi-buffer mode in all drivers. This will make it explicit
   when the multi-buffer mode is used, and prevent any accidental subtle
   malfunction of existing XDP programs. The drawback is that it's a
   mode switch, so more configuration complexity.
Could we combine the last two bits here into a global toggle that doesn't
require a sysctl? If any driver is put into multi-buffer mode, then the system
switches to requiring all programs be multi-buffer? When the last multi-buffer
enabled driver switches out of multi-buffer, remove the system-wide
restriction?
Well, the trouble here is that we don't necessarily have an explicit
"multi-buf mode" for devices. For instance, you could raise the MTU of a
device without it necessarily involving any XDP multi-buffer stuff (if
you're not running XDP on that device). So if we did turn "raising the
MTU" into such a mode switch, we would end up blocking any MTU changes
if any XDP programs are loaded. Or having an MTU change cause a
force-unload of all XDP programs.
Maybe I missed something then, but you had stated that "while a
particular driver knows if it's running in multi-buff mode" so I
assumed that the driver would be able to tell when to toggle the mode
on.
Well, a driver knows when it is attaching an XDP program whether it (the
driver) is configured in a way such that this XDP program could
encounter a multi-buf.
I know drivers sometimes reconfigure themselves when an XDP program is
attached, but is there any information provided by the attach (other than that
an XDP program is attaching) that they use to make configuration decisions
during that reconfiguration?

Without modifying the driver to intentionally configure itself differently
based on whether or not the program is mb-aware (which is believe is currently
not the case for any driver), won't the configuration of a driver be identical
post XDP attach regardless of whether or not the program is mb-aware or not?

I was thinking the driver would make it's mb-aware determination (and refcount
adjustments) when its configuration changes for any reason that could
potentially affect mb-aware status (mostly MTU adjustments, I suspect).
quoted
I had been thinking that when a driver turned multi-buffer off, it
could trigger a check of all drivers, but that also seems like it
could just be a global refcount of all the drivers that have requested
multi-buffer mode. When a driver enables multi-buffer for itself, it
increments the refcount, and when it disables, it decrements. A
non-zero count means the system is in multi-buffer mode.
I guess we could do a refcount-type thing when an multi-buf XDP program
is first attached (as per above). But I think it may be easier to just
do it at load-time, then, so it doesn't have to be in the driver, but
the BPF core could just enforce it.

This would basically amount to a rule saying "you can't mix mb-aware and
non-mb-aware programs", and the first type to be loaded determines which
mode the system is in. This would be fairly simple to implement and
enforce, I suppose. The drawback is that it's potentially racy in the
order programs are loaded...
Accepting or rejecting at load time would definitely simplify things a bit. But
I think the raciness is worse than just based on the first program to load. If
we're doing refcounting at attach/detach time, then I can load an mb-aware and
an mb-unaware program before attaching anything. What do I do when I attach one
of them? The other would be in violation.

If instead of making the determination at attach time, we make it at load time,
I think it'd be better to go back to the sysctl controlling it, and simply not
allow changing the sysctl if any XDP program at all is loaded, as opposed to
if a non-aware program is installed.

Then we're back to the sysctl controlling whether or not mb-aware is required.
We stil have a communication to the administrator problem, but it's simplified
a bit from "some loaded program doesn't comply" and having to track down which
one to "there is an XDP program installed".
-Toke
Side note: how do extension programs fit into this? An extension program that's
going to freplace a function in an XDP program (that receives the context)
would also need to mb-aware or not, but not all extension programs can attach
to such functions, and we wouldn't want those programs to be impacted. Is this
as simple as marking every XDP program and every extension program that takes
an XDP context parameter as needing to be marked as mb-aware?

--Zvi

On Tue, Sep 21, 2021 at 3:14 PM Toke Høiland-Jørgensen [off-list ref] wrote:
Zvi Effron [off-list ref] writes:
quoted
On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen [off-list ref] wrote:
quoted
Zvi Effron [off-list ref] writes:
quoted
On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen [off-list ref] wrote:
quoted
Hi Lorenz (Cc. the other people who participated in today's discussion)

Following our discussion at the LPC session today, I dug up my previous
summary of the issue and some possible solutions[0]. Seems no on
actually replied last time, which is why we went with the "do nothing"
approach, I suppose. I'm including the full text of the original email
below; please take a look, and let's see if we can converge on a
consensus here.

First off, a problem description: If an existing XDP program is exposed
to an xdp_buff that is really a multi-buffer, while it will continue to
run, it may end up with subtle and hard-to-debug bugs: If it's parsing
the packet it'll only see part of the payload and not be aware of that
fact, and if it's calculating the packet length, that will also only be
wrong (only counting the first fragment).

So what to do about this? First of all, to do anything about it, XDP
programs need to be able to declare themselves "multi-buffer aware" (but
see point 1 below). We could try to auto-detect it in the verifier by
which helpers the program is using, but since existing programs could be
perfectly happy to just keep running, it probably needs to be something
the program communicates explicitly. One option is to use the
expected_attach_type to encode this; programs can then declare it in the
source by section name, or the userspace loader can set the type for
existing programs if needed.

With this, the kernel will know if a given XDP program is multi-buff
aware and can decide what to do with that information. For this we came
up with basically three options:

1. Do nothing. This would make it up to users / sysadmins to avoid
   anything breaking by manually making sure to not enable multi-buffer
   support while loading any XDP programs that will malfunction if
   presented with an mb frame. This will probably break in interesting
   ways, but it's nice and simple from an implementation PoV. With this
   we don't need the declaration discussed above either.

2. Add a check at runtime and drop the frames if they are mb-enabled and
   the program doesn't understand it. This is relatively simple to
   implement, but it also makes for difficult-to-understand issues (why
   are my packets suddenly being dropped?), and it will incur runtime
   overhead.

3. Reject loading of programs that are not MB-aware when running in an
   MB-enabled mode. This would make things break in more obvious ways,
   and still allow a userspace loader to declare a program "MB-aware" to
   force it to run if necessary. The problem then becomes at what level
   to block this?
I think there's another potential problem with this as well: what happens to
already loaded programs that are not MB-aware? Are they forcibly unloaded?
I'd say probably the opposite: You can't toggle whatever switch we end
up with if there are any non-MB-aware programs (you'd have to unload
them first)...
How would we communicate that issue? dmesg? I'm not very familiar with
how sysctl change failure causes are communicated to users, so this
might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer
1` (or whatever ends up actually being the toggle) to active
multi-buffer, and it fails because there's a loaded non-aware program,
that seems like a potential for a lot of administrator pain.
Hmm, good question. Document that this only fails if there's a
non-mb-aware XDP program loaded? Or use some other mechanism with better
feedback?
quoted
quoted
quoted
quoted
   Doing this at the driver level is not enough: while a particular
   driver knows if it's running in multi-buff mode, we can't know for
   sure if a particular XDP program is multi-buff aware at attach time:
   it could be tail-calling other programs, or redirecting packets to
   another interface where it will be processed by a non-MB aware
   program.

   So another option is to make it a global toggle: e.g., create a new
   sysctl to enable multi-buffer. If this is set, reject loading any XDP
   program that doesn't support multi-buffer mode, and if it's unset,
   disable multi-buffer mode in all drivers. This will make it explicit
   when the multi-buffer mode is used, and prevent any accidental subtle
   malfunction of existing XDP programs. The drawback is that it's a
   mode switch, so more configuration complexity.
Could we combine the last two bits here into a global toggle that doesn't
require a sysctl? If any driver is put into multi-buffer mode, then the system
switches to requiring all programs be multi-buffer? When the last multi-buffer
enabled driver switches out of multi-buffer, remove the system-wide
restriction?
Well, the trouble here is that we don't necessarily have an explicit
"multi-buf mode" for devices. For instance, you could raise the MTU of a
device without it necessarily involving any XDP multi-buffer stuff (if
you're not running XDP on that device). So if we did turn "raising the
MTU" into such a mode switch, we would end up blocking any MTU changes
if any XDP programs are loaded. Or having an MTU change cause a
force-unload of all XDP programs.
Maybe I missed something then, but you had stated that "while a
particular driver knows if it's running in multi-buff mode" so I
assumed that the driver would be able to tell when to toggle the mode
on.
Well, a driver knows when it is attaching an XDP program whether it (the
driver) is configured in a way such that this XDP program could
encounter a multi-buf.
quoted
I had been thinking that when a driver turned multi-buffer off, it
could trigger a check of all drivers, but that also seems like it
could just be a global refcount of all the drivers that have requested
multi-buffer mode. When a driver enables multi-buffer for itself, it
increments the refcount, and when it disables, it decrements. A
non-zero count means the system is in multi-buffer mode.
I guess we could do a refcount-type thing when an multi-buf XDP program
is first attached (as per above). But I think it may be easier to just
do it at load-time, then, so it doesn't have to be in the driver, but
the BPF core could just enforce it.

This would basically amount to a rule saying "you can't mix mb-aware and
non-mb-aware programs", and the first type to be loaded determines which
mode the system is in. This would be fairly simple to implement and
enforce, I suppose. The drawback is that it's potentially racy in the
order programs are loaded...

-Toke
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help