Thread (13 messages) 13 messages, 3 authors, 2024-06-10

Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb

From: Yan Zhai <hidden>
Date: 2024-06-06 15:37:59
Also in: linux-trace-kernel, lkml

On Wed, Jun 5, 2024 at 6:57 PM Steven Rostedt [off-list ref] wrote:
On Tue, 4 Jun 2024 14:47:38 -0700
Yan Zhai [off-list ref] wrote:
quoted
skb does not include enough information to find out receiving
sockets/services and netns/containers on packet drops. In theory
skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
stack for OOO packet lookup. Similarly, skb->sk often identifies a local
sender, and tells nothing about a receiver.

Allow passing an extra receiving socket to the tracepoint to improve
the visibility on receiving drops.

Signed-off-by: Yan Zhai <redacted>
---
v2->v3: fixed drop_monitor function prototype
---
 include/trace/events/skb.h | 11 +++++++----
 net/core/dev.c             |  2 +-
 net/core/drop_monitor.c    |  9 ++++++---
 net/core/skbuff.c          |  2 +-
 4 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 07e0715628ec..aa6b46b6172c 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN)
 TRACE_EVENT(kfree_skb,

      TP_PROTO(struct sk_buff *skb, void *location,
-              enum skb_drop_reason reason),
+              enum skb_drop_reason reason, struct sock *rx_sk),

-     TP_ARGS(skb, location, reason),
+     TP_ARGS(skb, location, reason, rx_sk),

      TP_STRUCT__entry(
              __field(void *,         skbaddr)
              __field(void *,         location)
              __field(unsigned short, protocol)
              __field(enum skb_drop_reason,   reason)
+             __field(void *,         rx_skaddr)
Please add the pointer after the other pointers:

                __field(void *,         skbaddr)
                __field(void *,         location)
+               __field(void *,         rx_skaddr)
                __field(unsigned short, protocol)
                __field(enum skb_drop_reason,   reason)

otherwise you are adding holes in the ring buffer event.

The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We
want to avoid alignment holes. I also question having a short before the
enum, if the emum is 4 bytes. The short should be at the end.

In fact, looking at the format file, there is a 2 byte hole:

 # cat /sys/kernel/tracing/events/skb/kfree_skb/format

name: kfree_skb
ID: 1799
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:void * skbaddr;   offset:8;       size:8; signed:0;
        field:void * location;  offset:16;      size:8; signed:0;
        field:unsigned short protocol;  offset:24;      size:2; signed:0;
        field:enum skb_drop_reason reason;      offset:28;      size:4; signed:0;

Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts
at offset 28. This means at offset 26, there's a 2 byte hole.
The reason I added the pointer as the last argument is trying to
minimize the surprise to existing TP users, because for common ABIs
it's fine to omit later arguments when defining a function, but it
needs change and recompilation if the order of arguments changed.

Looking at the actual format after the change, it does not add a new
hole since protocol and reason are already packed into the same 8-byte
block, so rx_skaddr starts at 8-byte aligned offset:

# cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format
name: kfree_skb
ID: 2260
format:
        field:unsigned short common_type;       offset:0;
size:2; signed:0;
        field:unsigned char common_flags;       offset:2;
size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;
 size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:void * skbaddr;   offset:8;       size:8; signed:0;
        field:void * location;  offset:16;      size:8; signed:0;
        field:unsigned short protocol;  offset:24;      size:2; signed:0;
        field:enum skb_drop_reason reason;      offset:28;
size:4; signed:0;
        field:void * rx_skaddr; offset:32;      size:8; signed:0;

Do you think we still need to change the order?

Yan

-- Steve


quoted
      ),

      TP_fast_assign(
@@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb,
              __entry->location = location;
              __entry->protocol = ntohs(skb->protocol);
              __entry->reason = reason;
+             __entry->rx_skaddr = rx_sk;
      ),
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help