Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: 2026-05-26 12:01:25
On Tue, May 26, 2026 at 5:53 AM David Laight [off-list ref] wrote:
On Mon, 18 May 2026 20:39:50 -0700 Rajat Gupta [off-list ref] wrote:quoted
tcf_pedit_act() computes the COW range for skb_ensure_writable() once before the key loop using tcfp_off_max_hint, but the hint does not account for the runtime header offset added by typed keys. This can leave part of the write region un-COW'd. Fix by moving skb_ensure_writable() inside the per-key loop where the actual write offset is known, and add overflow checking on the offset arithmetic. For negative offsets (e.g. Ethernet header edits at ingress), use skb_cow() to COW the headroom instead. Guard offset_valid() against INT_MIN, where negation is undefined. Additionally, linearize skbs with shared frags upfront to prevent silent data corruption when pedit operates on zero-copy pages (e.g. from sendfile). Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") Reported-by: Rajat Gupta <redacted> Reported-by: Yiming Qian <redacted> Reported-by: Keenan Dong <redacted> Reported-by: Han Guidong <redacted> Reported-by: Zhang Cen <redacted> Tested-by: Han Guidong <redacted> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Rajat Gupta <redacted> --- net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- 1 file changed, 41 insertions(+), 13 deletions(-)diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index bc20f08a2..79921b8d8 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c@@ -16,6 +16,7 @@ #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/slab.h> +#include <linux/overflow.h> #include <net/ipv6.h> #include <net/netlink.h> #include <net/pkt_sched.h>@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) if (offset > 0 && offset > skb->len) return false; - if (offset < 0 && -offset > skb_headroom(skb)) - return false; + if (offset < 0) { + if (offset == INT_MIN || -offset > skb_headroom(skb)) + return false; + }Can't you negate skb_headroom() instead - that cannot be INT_MIN. So: if (offset < 0 && offset < -skb_headroom(skb))
You mean something like this? if (offset < 0 && offset < -(int)skb_headroom(skb)) That does feel cleaner, yes.
quoted
return true; }@@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, struct tcf_pedit_key_ex *tkey_ex; struct tcf_pedit_parms *parms; struct tc_pedit_key *tkey; - u32 max_offset; int i; parms = rcu_dereference_bh(p->parms); - max_offset = (skb_transport_header_was_set(skb) ? - skb_transport_offset(skb) : - skb_network_offset(skb)) + - parms->tcfp_off_max_hint; - if (skb_ensure_writable(skb, min(skb->len, max_offset))) - goto done; + /* + * If the skb has shared frags the user is likely using zero-copy + * (e.g. sendfile). Those page frags may point to page-cache pages; + * writing into them would silently corrupt the page cache. + * Linearize so pedit operates on a private copy. + * TL;DR: if you want zero-copy, don't use pedit. + */ + if (skb_has_shared_frag(skb)) { + if (__skb_linearize(skb)) + goto bad; + }Should there be a way of 'unsharing' frags by just copying the frags rather than doing a full linearize? That would be much less likely to fail for big skb.
It has been agreed that this chunk is unnecessary, so it will be removed in the next update. cheers, jamal