Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: 2026-05-23 12:13:32
() On Sat, May 23, 2026 at 8:07 AM Jamal Hadi Salim [off-list ref] wrote:
On Fri, May 22, 2026 at 8:55 PM Jakub Kicinski [off-list ref] wrote:quoted
On Fri, 22 May 2026 13:01:49 -0400 Jamal Hadi Salim wrote:quoted
quoted
quoted
I must be missing something. What's the problem with changing the patch to pull headers instead? I mean - if we agree that this is where we'll end up - we should just do it now. It's the long standing kernel policy to "fix things right" instead of creating temporary fixes which then have to be reworked in -nextI may be the one missing things. You main concern is with this: + /* + * If the skb has shared frags the user is likely using zero-copy + * (e.g. sendfile). Those page frags may point to page-cache pages; + * writing into them would silently corrupt the page cache. + * Linearize so pedit operates on a private copy. + * TL;DR: if you want zero-copy, don't use pedit. + */ + if (skb_has_shared_frag(skb)) { + if (__skb_linearize(skb)) + goto bad; + } + i.e you want that gone, correct? And my concern was whether removing this still exposes things to exploit even if temporarily. Likely it wont. I have time, let me test the exploit with that code ifdef'ed out.Ok, it fixes the issue even i ifdef that out. We still need the little patchlet i sent ...More and more I feel like I'm completely missing the plot but for the portion of the problem that's "we are writing to frags" the fix I was trying to describe is:diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index bc20f08a2789..3a74cef58e17 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c@@ -414,7 +414,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { int offset = tkey->off; int hoffset = 0; - u32 *ptr, hdata; + u32 *ptr; u32 val; int rc;@@ -456,10 +456,9 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, goto bad; } - ptr = skb_header_pointer(skb, hoffset + offset, - sizeof(hdata), &hdata); - if (!ptr) + if (!pskb_may_pull(skb, hoffset + offset + sizeof(*ptr))) goto bad; + ptr = (u32 *)(skb->data + hoffset + offset); /* just do it, baby */ switch (cmd) { case TCA_PEDIT_KEY_EX_CMD_SET:@@ -474,8 +473,6 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, } *ptr = ((*ptr & tkey->mask) ^ val); - if (ptr == &hdata) - skb_store_bits(skb, hoffset + offset, ptr, 4); } goto done;I see you are trying to get rid of the skb_header_pointer() / skb_store_bits() piece. Sure looks cleaner if we must linearize.
The other thing (i may be over thinking) with pskb_may_pull is: if the data is already linear (in a clone), wouldn't we corrupt the shared linear data of the clone? cheers, jamal
My primary concern would be it would break negative offsets - which
that patchlet and much of the dance in the Rajat patch accounts for.
ex: pskb_may_pull len being an unsigned int implies hoffset + offset +
sizeof(*ptr) will be intepreted as a very large positive number
I dont have time to test but something like:
# Change Destination MAC to 00:11:22:33:44:55
# Offset -14 is the start of the Ethernet header (relative to IP header)
tc filter add dev eth0 ingress protocol ip flower \
action pedit munge offset -14 u32 set 0x00112233
cheers,
jamal