Thread (34 messages) 34 messages, 3 authors, 2023-10-18

Re: [PATCH v14 net-next 08/23] net/tcp: Add AO sign to RST packets

From: Dmitry Safonov <hidden>
Date: 2023-10-18 15:45:04
Also in: lkml, oe-lkp

Hi Oliver,

On 10/18/23 08:32, Oliver Sang wrote:
hi, Dmitry Safonov,

On Wed, Oct 18, 2023 at 12:19:56AM +0100, Dmitry Safonov wrote:
quoted
On 10/17/23 09:37, kernel test robot wrote:
quoted

Hello,

kernel test robot noticed "BUG:sleeping_function_called_from_invalid_context_at_net/core/sock.c" on:

commit: df13d11e6a2a3cc5f973aca36f68f880fa42d55f ("[PATCH v14 net-next 08/23] net/tcp: Add AO sign to RST packets")
[..]
quoted
quoted
[  221.348247][ T7133] BUG: sleeping function called from invalid context at net/core/sock.c:2978
[  221.349875][ T7133] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 7133, name: trinity-c4
[  221.351666][ T7133] preempt_count: 0, expected: 0
[  221.352614][ T7133] RCU nest depth: 1, expected: 0
[  221.353518][ T7133] 2 locks held by trinity-c4/7133:
[ 221.354530][ T7133] #0: ed8b5660 (sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sendmsg (net/ipv4/tcp.c:1336) 
[ 221.374314][ T7133] #1: c27dbb18 (rcu_read_lock){....}-{1:2}, at: inet6_csk_xmit (include/linux/rcupdate.h:747 net/ipv6/inet6_connection_sock.c:129) 
[  221.375906][ T7133] CPU: 1 PID: 7133 Comm: trinity-c4 Tainted: G        W       TN 6.6.0-rc4-01105-gdf13d11e6a2a #1
Quite puzzling. It doesn't seem that it can happen to inet6_csk_xmit():

:	rcu_read_lock();
:	skb_dst_set_noref(skb, dst);
:
:	/* Restore final destination back after routing done */
:	fl6.daddr = sk->sk_v6_daddr;
:
:	res = ip6_xmit(sk, skb, &fl6, sk->sk_mark, rcu_dereference(np->opt),
:		       np->tclass,  sk->sk_priority);
:	rcu_read_unlock();

So, I presumed the calltrace was for nested rcu_read_lock() case.
Then I've looked at all return/goto cases, I couldn't find any
unbalanced rcu_read_{,un}lock().

Is this reproducible by any chance?
do you mean how often it could be reproduced?

we run the tests upon this commit and parent both up to 15 times.
from below, parent shows quite clean, but there are various issues while
running tests upon this patch.
Thanks a lot for running your tests over it!

That's what a fresh brain does:
I see that tcp_v6_send_reset() with neither CONFIG_TCP_MD5SIG nor
CONFIG_TCP_AO set does grab rcu_read_lock() [as I wasn't sure at that
moment that nothing but md5 key does rely on that being held], but the
exit is covered with ifdefs:

: #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
: out:
:        if (allocated_traffic_key)
:                kfree(key.traffic_key);
:        rcu_read_unlock();
: #endif

Seems to have been correct versions ago (version 6), but on refactoring
when I've moved ifdeffs to tcp_parse_auth_options() declaration to
pollute less code - missed this.

Thanks again for your testing and reporting!

8468a6f4f3143 net/tcp: Add tcp_parse_auth_options()
df13d11e6a2a3 net/tcp: Add AO sign to RST packets

8468a6f4f3143ba2 df13d11e6a2a3cc5f973aca36f6
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :15          87%          13:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_arch/x86/entry/common.c
           :15          33%           5:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_fs/dcache.c
           :15          13%           2:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_include/linux/percpu-rwsem.h
           :15          60%           9:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_include/linux/sched/mm.h
           :15           7%           1:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c
           :15           7%           1:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c
           :15           7%           1:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/sched/completion.c
           :15          40%           6:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/task_work.c
           :15          33%           5:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_lib/iov_iter.c
           :15          33%           5:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_lib/strncpy_from_user.c
           :15          20%           3:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_lib/usercopy.c
           :15          13%           2:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_mm/vmscan.c
           :15         100%          15:15    dmesg.BUG:sleeping_function_called_from_invalid_context_at_net/core/sock.c
           :15         100%          15:15    dmesg.EIP:rcu_note_context_switch
           :15         100%          15:15    dmesg.WARNING:at_kernel/rcu/tree_plugin.h:#rcu_note_context_switch
           :15         100%          15:15    dmesg.WARNING:lock_held_when_returning_to_user_space
           :15         100%          15:15    dmesg.is_leaving_the_kernel_with_locks_still_held
[..]

Thanks,
           Dmitry
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help