Re: [PATCH] bpf: Tweak BPF jump table optimizations for objtool compatibility
From: Josh Poimboeuf <hidden>
Date: 2020-05-01 19:56:29
Also in:
bpf, lkml
On Fri, May 01, 2020 at 12:40:53PM -0700, Alexei Starovoitov wrote:
On Fri, May 01, 2020 at 02:22:04PM -0500, Josh Poimboeuf wrote:quoted
On Fri, May 01, 2020 at 12:09:30PM -0700, Alexei Starovoitov wrote:quoted
On Thu, Apr 30, 2020 at 02:07:43PM -0500, Josh Poimboeuf wrote:quoted
Objtool decodes instructions and follows all potential code branches within a function. But it's not an emulator, so it doesn't track register values. For that reason, it usually can't follow intra-function indirect branches, unless they're using a jump table which follows a certain format (e.g., GCC switch statement jump tables). In most cases, the generated code for the BPF jump table looks a lot like a GCC jump table, so objtool can follow it. However, with RETPOLINE=n, GCC keeps the jump table address in a register, and then does 160+ indirect jumps with it. When objtool encounters the indirect jumps, it can't tell which jump table is being used (or even whether they might be sibling calls instead). This was fixed before by disabling an optimization in ___bpf_prog_run(), using the "optimize" function attribute. However, that attribute is bad news. It doesn't append options to the command-line arguments. Instead it starts from a blank slate. And according to recent GCC documentation it's not recommended for production use. So revert the previous fix: 3193c0836f20 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()") With that reverted, solve the original problem in a different way by getting rid of the "goto select_insn" indirection, and instead just goto the jump table directly. This simplifies the code a bit and helps GCC generate saner code for the jump table branches, at least in the RETPOLINE=n case. But, in the RETPOLINE=y case, this simpler code actually causes GCC to generate far worse code, ballooning the function text size by +40%. So leave that code the way it was. In fact Alexei prefers to leave *all* the code the way it was, except where needed by objtool. So even non-x86 RETPOLINE=n code will continue to have "goto select_insn". This stuff is crazy voodoo, and far from ideal. But it works for now. Eventually, there's a plan to create a compiler plugin for annotating jump tables. That will make this a lot less fragile.I don't like this commit log. Here you're saying that the code recognized by objtool is sane and good whereas well optimized gcc code is somehow voodoo and bad. That is just wrong.I have no idea what you're talking about. Are you saying that ballooning the function text size by 40% is well optimized GCC code? It seems like a bug to me. That's the only place I said anything bad about GCC code.It could be a bug, but did you benchmark the speed of interpreter ? Is it faster or slower with 40% more code ? Did you benchmark it on other archs ?
I thought we were in agreement that 40% text growth is bad. Isn't that why you wanted to keep 'goto select_insn' for the retpoline case? If there's some other reason, let me know and I'll put it in the patch description instead.
quoted
When I said "this stuff is crazy voodoo" I was referring to the patch itself. I agree it's horrible, it's only the best approach we're able to come up with at the moment.please reword it then.
Ok, so: This *patch* is crazy voodoo ? -- Josh