Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb
From: Yan Zhai <hidden>
Date: 2024-06-06 15:37:59
Also in:
linux-trace-kernel, lkml
On Wed, Jun 5, 2024 at 6:57 PM Steven Rostedt [off-list ref] wrote:
On Tue, 4 Jun 2024 14:47:38 -0700 Yan Zhai [off-list ref] wrote:quoted
skb does not include enough information to find out receiving sockets/services and netns/containers on packet drops. In theory skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP stack for OOO packet lookup. Similarly, skb->sk often identifies a local sender, and tells nothing about a receiver. Allow passing an extra receiving socket to the tracepoint to improve the visibility on receiving drops. Signed-off-by: Yan Zhai <redacted> --- v2->v3: fixed drop_monitor function prototype --- include/trace/events/skb.h | 11 +++++++---- net/core/dev.c | 2 +- net/core/drop_monitor.c | 9 ++++++--- net/core/skbuff.c | 2 +- 4 files changed, 15 insertions(+), 9 deletions(-)diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h index 07e0715628ec..aa6b46b6172c 100644 --- a/include/trace/events/skb.h +++ b/include/trace/events/skb.h@@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN) TRACE_EVENT(kfree_skb, TP_PROTO(struct sk_buff *skb, void *location, - enum skb_drop_reason reason), + enum skb_drop_reason reason, struct sock *rx_sk), - TP_ARGS(skb, location, reason), + TP_ARGS(skb, location, reason, rx_sk), TP_STRUCT__entry( __field(void *, skbaddr) __field(void *, location) __field(unsigned short, protocol) __field(enum skb_drop_reason, reason) + __field(void *, rx_skaddr)Please add the pointer after the other pointers: __field(void *, skbaddr) __field(void *, location) + __field(void *, rx_skaddr) __field(unsigned short, protocol) __field(enum skb_drop_reason, reason) otherwise you are adding holes in the ring buffer event. The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We want to avoid alignment holes. I also question having a short before the enum, if the emum is 4 bytes. The short should be at the end. In fact, looking at the format file, there is a 2 byte hole: # cat /sys/kernel/tracing/events/skb/kfree_skb/format name: kfree_skb ID: 1799 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:void * skbaddr; offset:8; size:8; signed:0; field:void * location; offset:16; size:8; signed:0; field:unsigned short protocol; offset:24; size:2; signed:0; field:enum skb_drop_reason reason; offset:28; size:4; signed:0; Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts at offset 28. This means at offset 26, there's a 2 byte hole.
The reason I added the pointer as the last argument is trying to
minimize the surprise to existing TP users, because for common ABIs
it's fine to omit later arguments when defining a function, but it
needs change and recompilation if the order of arguments changed.
Looking at the actual format after the change, it does not add a new
hole since protocol and reason are already packed into the same 8-byte
block, so rx_skaddr starts at 8-byte aligned offset:
# cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format
name: kfree_skb
ID: 2260
format:
field:unsigned short common_type; offset:0;
size:2; signed:0;
field:unsigned char common_flags; offset:2;
size:1; signed:0;
field:unsigned char common_preempt_count; offset:3;
size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:void * skbaddr; offset:8; size:8; signed:0;
field:void * location; offset:16; size:8; signed:0;
field:unsigned short protocol; offset:24; size:2; signed:0;
field:enum skb_drop_reason reason; offset:28;
size:4; signed:0;
field:void * rx_skaddr; offset:32; size:8; signed:0;
Do you think we still need to change the order?
Yan
-- Stevequoted
), TP_fast_assign(@@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb, __entry->location = location; __entry->protocol = ntohs(skb->protocol); __entry->reason = reason; + __entry->rx_skaddr = rx_sk; ),