Re: [PATCH bpf-next v2 1/5] bpf: allow variable-offset stack access

From: Alexei Starovoitov <hidden>
Date: 2021-02-02 00:23:48

On Sat, Jan 30, 2021 at 05:55:36PM -0500, Andrei Matei wrote:

Thanks for reviewing this!

On Wed, Jan 27, 2021 at 5:58 PM Alexei Starovoitov
[off-list ref] wrote:

quoted

On Sun, Jan 24, 2021 at 02:49:05PM -0500, Andrei Matei wrote:

quoted

+ *
+ * If some stack slots in range are uninitialized (i.e. STACK_INVALID), the
+ * write is not automatically rejected. However, they are left as
+ * STACK_INVALID, which means that reads with the same variable offset will be
+ * rejected.

...

quoted

+             /* If the slot is STACK_INVALID, we leave it as such. We can't
+              * mark the slot as initialized, as the slot might not actually
+              * be written to (and so marking it as initialized opens the
+              * door to leaks of uninitialized stack memory.
+              */
+             if (*stype != STACK_INVALID)
+                     *stype = new_type;

'leaks of uninitialized stack memory'...
well that's true, but the way the user would have to deal with this
is to use __builtin_memset(&buf, 0, 16); for the whole buffer
before doing var_off write into it.
In the test in patch 5 would be good to add a read after this write:
buf[len] = buf[idx];
// need to do another read of buf[len] here.

Without memset() it would fail and the user would flame us:
"I just wrote into this stack slot!! Why cannot the verifier figure out
that the read from the same location is safe?... stupid verifier..."

I think for your use case where you read the whole thing into a stack and
then parse it all locations potentially touched by reads/writes would
be already written via helper, but imo this error is too unpleasant to
explain to users.
Especially since it happens later at a different instruction there is
no good verifier hint we can print.
It would just hit 'invalid read from stack'.

Currently we don't allow uninit read from stack even for root.
I think we have to sacrifice the perfection of the verification here.
We can either allow reading uninit for _fixed and _var_off
or better yet do unconditional '*stype = new_type' here.
Yes it would mean that following _fixed or _var_off read could be
reading uninited stack. I think we have to do it for the sake
of user friendliness.
The completely buggy uninited reads from stack will still be disallowed.
Only the [min,max] of var_off range in stack will be considered
init, so imo it's a reasonable trade-off between user friendliness
and verifier's perfection.
Wdyt?

I'm happy to do whatever you tell me. But, I dunno, the verifier
currently seems to be paranoid in ways I don't even understand (around
speculative execution). In comparison, preventing trivial leaks of
uninitialized memory seems relatively important. We're only talking
about root here (as you've noted), and other various checks are less
paranoid for root, so maybe it's no big deal. Where does the stack
memory come from? Can it be *any* previously used kernel memory?
A few possible alternatives (without necessarily knowing what I'm
talking about):
1) Perhaps it wouldn't be a big deal to zero-initialize all the stack
memory (up to 512 bytes) for a program. Is that out of the question?
In many cases it'd be less than 512 bytes; the verifier knows the max
stack needed. If the stack was always initialized, various verifier
checks could go away.

Even if stack usage is small, like 64 byte, bzero of it has noticeable
perf penalty.

2) We could leave this patch as is, and work on improving the error
you get on rejected stack reads after var-offset stack writes. I think
the verifier could track what stack slots might have or might have not
been written to, and when a read to such an uncertain slot is
rejected, it could point to the previous var-off write (or one of the
possibly many such writes) and suggest a memset. Sounds potentially
complicated though.

too complicated imo.

3) Perhaps we could augment helper metadata with information about
whether each helper promises to overwrite every byte in the buffer
it's being passed in. This wouldn't solve the general usability
problem we're discussing, but it would make common cases just work.
Namely, bpf_probe_read_user() can make the required promise.
bpf_probe_read_user_str(), however, could not.

eventually yes. That's orthogonal to this patch set.

But, again, if you think relaxing the verification is OK, I'm very
happy to do that.

The tracing progs can read stack with bpf_probe_read_kernel anyway, so
I would prefer to relax it to improve ease of use.
Unconditional *stype = new_type; here would do the trick.
Not for unpriv, of course.
Probably another 'bool' flag (instead of jumbo allow_ptr_leaks)
like 'allow_uninit_stack' that is set with perfmon_capable().

quoted

              tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
-             verbose(env, "variable stack access var_off=%s off=%d size=%d\n",
+             verbose(env, "var-offset stack reads only permitted to register; var_off=%s off=%d size=%d\n",

The message is confusing. "read to register"? what is "read to not register" ?
Users won't be able to figure out that it means helpers access.
Also nowadays it means that atomic ops won't be able to use var_off into stack.
I think both limitations are ok for now. Only the message needs to be clear.

What message would you suggest? Would you plumb information about what
the read type is (helper vs atomic op)?

Something like: "variable offset stack pointer cannot be passed into helper functions" ?

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help