Re: [PATCH bpf-next v2 1/5] bpf: allow variable-offset stack access
From: Alexei Starovoitov <hidden>
Date: 2021-02-02 00:23:48
On Sat, Jan 30, 2021 at 05:55:36PM -0500, Andrei Matei wrote:
Thanks for reviewing this! On Wed, Jan 27, 2021 at 5:58 PM Alexei Starovoitov [off-list ref] wrote:quoted
On Sun, Jan 24, 2021 at 02:49:05PM -0500, Andrei Matei wrote:quoted
+ * + * If some stack slots in range are uninitialized (i.e. STACK_INVALID), the + * write is not automatically rejected. However, they are left as + * STACK_INVALID, which means that reads with the same variable offset will be + * rejected....quoted
+ /* If the slot is STACK_INVALID, we leave it as such. We can't + * mark the slot as initialized, as the slot might not actually + * be written to (and so marking it as initialized opens the + * door to leaks of uninitialized stack memory. + */ + if (*stype != STACK_INVALID) + *stype = new_type;'leaks of uninitialized stack memory'... well that's true, but the way the user would have to deal with this is to use __builtin_memset(&buf, 0, 16); for the whole buffer before doing var_off write into it. In the test in patch 5 would be good to add a read after this write: buf[len] = buf[idx]; // need to do another read of buf[len] here. Without memset() it would fail and the user would flame us: "I just wrote into this stack slot!! Why cannot the verifier figure out that the read from the same location is safe?... stupid verifier..." I think for your use case where you read the whole thing into a stack and then parse it all locations potentially touched by reads/writes would be already written via helper, but imo this error is too unpleasant to explain to users. Especially since it happens later at a different instruction there is no good verifier hint we can print. It would just hit 'invalid read from stack'. Currently we don't allow uninit read from stack even for root. I think we have to sacrifice the perfection of the verification here. We can either allow reading uninit for _fixed and _var_off or better yet do unconditional '*stype = new_type' here. Yes it would mean that following _fixed or _var_off read could be reading uninited stack. I think we have to do it for the sake of user friendliness. The completely buggy uninited reads from stack will still be disallowed. Only the [min,max] of var_off range in stack will be considered init, so imo it's a reasonable trade-off between user friendliness and verifier's perfection. Wdyt?I'm happy to do whatever you tell me. But, I dunno, the verifier currently seems to be paranoid in ways I don't even understand (around speculative execution). In comparison, preventing trivial leaks of uninitialized memory seems relatively important. We're only talking about root here (as you've noted), and other various checks are less paranoid for root, so maybe it's no big deal. Where does the stack memory come from? Can it be *any* previously used kernel memory? A few possible alternatives (without necessarily knowing what I'm talking about): 1) Perhaps it wouldn't be a big deal to zero-initialize all the stack memory (up to 512 bytes) for a program. Is that out of the question? In many cases it'd be less than 512 bytes; the verifier knows the max stack needed. If the stack was always initialized, various verifier checks could go away.
Even if stack usage is small, like 64 byte, bzero of it has noticeable perf penalty.
2) We could leave this patch as is, and work on improving the error you get on rejected stack reads after var-offset stack writes. I think the verifier could track what stack slots might have or might have not been written to, and when a read to such an uncertain slot is rejected, it could point to the previous var-off write (or one of the possibly many such writes) and suggest a memset. Sounds potentially complicated though.
too complicated imo.
3) Perhaps we could augment helper metadata with information about whether each helper promises to overwrite every byte in the buffer it's being passed in. This wouldn't solve the general usability problem we're discussing, but it would make common cases just work. Namely, bpf_probe_read_user() can make the required promise. bpf_probe_read_user_str(), however, could not.
eventually yes. That's orthogonal to this patch set.
But, again, if you think relaxing the verification is OK, I'm very happy to do that.
The tracing progs can read stack with bpf_probe_read_kernel anyway, so I would prefer to relax it to improve ease of use. Unconditional *stype = new_type; here would do the trick. Not for unpriv, of course. Probably another 'bool' flag (instead of jumbo allow_ptr_leaks) like 'allow_uninit_stack' that is set with perfmon_capable().
quoted
quoted
tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "variable stack access var_off=%s off=%d size=%d\n", + verbose(env, "var-offset stack reads only permitted to register; var_off=%s off=%d size=%d\n",The message is confusing. "read to register"? what is "read to not register" ? Users won't be able to figure out that it means helpers access. Also nowadays it means that atomic ops won't be able to use var_off into stack. I think both limitations are ok for now. Only the message needs to be clear.What message would you suggest? Would you plumb information about what the read type is (helper vs atomic op)?
Something like: "variable offset stack pointer cannot be passed into helper functions" ?