Thread (9 messages) 9 messages, 4 authors, 2023-01-10

Re: [External] Re: [PATCH v4] sock: add tracepoint for send recv length

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2023-01-09 15:42:13
Also in: lkml, netdev

On Mon, 9 Jan 2023 16:20:58 +0100
Eric Dumazet [off-list ref] wrote:
On Mon, Jan 9, 2023 at 4:08 PM Steven Rostedt [off-list ref] wrote:
quoted
On Mon, 9 Jan 2023 15:54:38 +0100
Eric Dumazet [off-list ref] wrote:
 
quoted
quoted
static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
{
        int ret = INDIRECT_CALL_INET(sock->ops->sendmsg, inet6_sendmsg,
                                     inet_sendmsg, sock, msg,
                                     msg_data_left(msg));
        BUG_ON(ret == -EIOCBQUEUED);

        if (trace_sock_send_length_enabled()) {  
A barrier() is needed here, with the current state of affairs.

IMO, ftrace/x86 experts should take care of this generic issue ?  
trace_*_enabled() is a static_branch() (aka. jump label).

It's a nop, where the if block is in the out-of-line code and skipped. When
the tracepoint is enabled, it gets turned into a jump to the if block
(which returns back to this location).
 
This is not a nop, as shown in the generated assembly, I copied in
this thread earlier.
But I thought that was for the patch before the added noinline helper
function to force the tracepoint into its own function, and this now just
has the static branch.
Compiler does all sorts of things before the point the static branch
is looked at.

Let's add the extract again with <<*>> tags on added instructions/dereferences.
Ug, bad line wrapping
sock_recvmsg_nosec:
        pushq   %r12    #
        movl    %edx, %r12d     # tmp123, flags
        pushq   %rbp    #
# net/socket.c:999:     int ret =
INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
        movl    %r12d, %ecx     # flags,
# net/socket.c:998: {
        movq    %rdi, %rbp      # tmp121, sock
        pushq   %rbx    #
# net/socket.c:999:     int ret =
INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
        movq    32(%rdi), %rax  # sock_19(D)->ops, sock_19(D)->ops
# ./include/linux/uio.h:270:    return i->count;
        movq    32(%rsi), %rdx  # MEM[(const struct iov_iter
*)msg_20(D) + 16B].count, pretmp_48
# net/socket.c:999:     int ret =
INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
        movq    144(%rax), %rax # _1->recvmsg, _2
        cmpq    $inet6_recvmsg, %rax    #, _2
        jne     .L107   #,
        call    inet6_recvmsg   #
 <<*>>       movl    %eax, %ebx      # tmp124, <retval>
.L108:
# net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
  <<*>>      xorl    %r8d, %r8d      # tmp127
   <<*>>     testl   %ebx, %ebx      # <retval>
# net/socket.c:1004:                           sock->sk->sk_protocol,
 <<*>>       movq    24(%rbp), %rsi  # sock_19(D)->sk, _10
# net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
 <<*>>       cmovle  %ebx, %r8d      # <retval>,, tmp119
  <<*>>      testb   $2, %r12b       #, flags
# net/socket.c:1004:                           sock->sk->sk_protocol,
  <<*>>      movzwl  516(%rsi), %ecx # _10->sk_protocol,
# net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
  <<*>>      movzwl  16(%rsi), %edx  # _10->__sk_common.skc_family,
# net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
  <<*>>      cmove   %ebx, %r8d      # tmp119,, <retval>, iftmp.54_16
# ./arch/x86/include/asm/jump_label.h:27:       asm_volatile_goto("1:"
#APP
# 27 "./arch/x86/include/asm/jump_label.h" 1
        1:jmp .L111 # objtool NOPs this         #
        .pushsection __jump_table,  "aw"
         .balign 8
        .long 1b - .
        .long .L111 - .         #
         .quad __tracepoint_sock_recv_length+8 + 2 - .  #,
        .popsection

# 0 "" 2
#NO_APP
.L106:
# net/socket.c:1008: }
 <<*>>       movl    %ebx, %eax      # <retval>,
        popq    %rbx    #
        popq    %rbp    #
        popq    %r12    #
        ret
.L111:
# ./include/trace/events/sock.h:308: DEFINE_EVENT(sock_msg_length,
sock_recv_length,
quoted
That is, when the tracepoint in the block gets enabled so does the above
branch. Sure, there could be a race between the two being enabled, but I
don't see any issue if there is. But the process to modify the jump labels,
does a bunch of synchronization between the CPUs.

What barrier are you expecting?  
Something preventing the compiler being 'smart', forcing expression evaluations
before TP_fast_assign() is eventually called.
There's no good way to generically keep the compiler from being 'smart',
that I know of. That's because the tracepoint injection is defined by:

#define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
	extern int __traceiter_##name(data_proto);			\
	DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name);	\
	extern struct tracepoint __tracepoint_##name;			\
	static inline void __trace_##name(proto)			\
	{								

That (proto) is the prototype being passed in. And because macros can't
create other macros, I don't know how to have a way to inject a barrier()
before and after that call, or better yet, to have the prototype hidden
behind the static_branch.


But looking at this tracepoint again, I see a issue that can help with the
dereferencing.

Why is family and protocol passed in?

+	trace_sock_send_length(sock->sk, sock->sk->sk_family,
+			       sock->sk->sk_protocol, ret, 0);


Where the TP_fast_assign() is:

+	TP_fast_assign(
+		__entry->sk = sk;
+		__entry->family = sk->sk_family;
+		__entry->protocol = sk->sk_protocol;
+		__entry->length = ret > 0 ? ret : 0;
+		__entry->error = ret < 0 ? ret : 0;
+		__entry->flags = flags;
+	),

The family and protocol is taken from the sk, and not the parameters. I bet
dropping those would help.

-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help