Thread (30 messages) 30 messages, 6 authors, 2008-12-30

RE: unsafe locks seen with netperf on net-2.6.29 tree

From: Peter Zijlstra <hidden>
Date: 2008-12-29 10:02:04
Subsystem: networking [general], networking [tcp], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Neal Cardwell, Linus Torvalds

On Sat, 2008-12-27 at 17:54 -0700, Tantilov, Emil S wrote:
Peter Zijlstra wrote:
quoted
On Sat, 2008-12-27 at 12:38 -0700, Tantilov, Emil S wrote:
quoted
Peter Zijlstra wrote:
quoted
quoted
index 9007ccd..a074d77 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -30,8 +30,16 @@ struct percpu_counter {
 #define FBC_BATCH    (NR_CPUS*4)
 #endif

-int percpu_counter_init(struct percpu_counter *fbc, s64 amount);
-int percpu_counter_init_irq(struct percpu_counter *fbc, s64
amount); +int __percpu_counter_init(struct percpu_counter *fbc, s64
amount, +                       struct lock_class_key *key); +
+#define percpu_counter_init(fbc, value)
\ +     do {
\ +             static struct lock_class_key __key;
\ +
\ +             __percpu_counter_init(fbc, value, &__key);
\ +     } while (0) +
 void percpu_counter_destroy(struct percpu_counter *fbc);
 void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
 void __percpu_counter_add(struct percpu_counter *fbc, s64 amount,
quoted
This patch fails to compile:

mm/backing-dev.c: In function 'bdi_init':
mm/backing-dev.c:226: error: expected expression bedore 'do'
Ah indeed, stupid me...

Please try something like this instead of the above hunk:
@@ -30,8 +30,16 @@ struct percpu_counter {
 #define FBC_BATCH      (NR_CPUS*4)
 #endif

-int percpu_counter_init(struct percpu_counter *fbc, s64 amount);
-int percpu_counter_init_irq(struct percpu_counter *fbc, s64 amount);
+int __percpu_counter_init(struct percpu_counter *fbc, s64 amount,
+                         struct lock_class_key *key);
+
+#define percpu_counter_init(fbc, value)
\ +       ({
\ +               static struct lock_class_key __key;
\ +
\ +               __percpu_counter_init(fbc, value, &__key);
\ +       })
+
 void percpu_counter_destroy(struct percpu_counter *fbc);
 void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
 void __percpu_counter_add(struct percpu_counter *fbc, s64 amount,
s32 batch)
With this compiled, but I still get the following:

[  435.632627] =================================
[  435.633030] [ INFO: inconsistent lock state ]
[  435.633037] 2.6.28-rc8-net-next-igbL #14
[  435.633040] ---------------------------------
[  435.633044] inconsistent {in-softirq-W} -> {softirq-on-W} usage.
[  435.633049] netperf/12669 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  435.633053]  (key#8){-+..}, at: [<ffffffff803691ac>] __percpu_counter_add+0x4a/0x6d
[  435.633068] {in-softirq-W} state was registered at:
[  435.633070]   [<ffffffffffffffff>] 0xffffffffffffffff
[  435.633078] irq event stamp: 988533
[  435.633080] hardirqs last  enabled at (988533): [<ffffffff80243712>] _local_bh_enable_ip+0xc8/0xcd
[  435.633088] hardirqs last disabled at (988531): [<ffffffff8024369e>] _local_bh_enable_ip+0x54/0xcd
[  435.633093] softirqs last  enabled at (988532): [<ffffffff804fc814>] sock_orphan+0x3f/0x44
[  435.633100] softirqs last disabled at (988530): [<ffffffff8056454d>] _write_lock_bh+0x11/0x3d
[  435.633107]
[  435.633108] other info that might help us debug this:
[  435.633110] 1 lock held by netperf/12669:
[  435.633112]  #0:  (sk_lock-AF_INET6){--..}, at: [<ffffffff804fc544>] lock_sock+0xb/0xd
[  435.633119]
[  435.633120] stack backtrace:
[  435.633124] Pid: 12669, comm: netperf Not tainted 2.6.28-rc8-net-next-igbL #14
[  435.633127] Call Trace:
[  435.633134]  [<ffffffff8025ffb8>] print_usage_bug+0x159/0x16a
[  435.633139]  [<ffffffff8026000e>] valid_state+0x45/0x52
[  435.633143]  [<ffffffff802601cf>] mark_lock_irq+0x1b4/0x27b
[  435.633148]  [<ffffffff80260339>] mark_lock+0xa3/0x110
[  435.633152]  [<ffffffff80260480>] mark_irqflags+0xda/0xf2
[  435.633157]  [<ffffffff8026122e>] __lock_acquire+0x1c3/0x2ee
[  435.633161]  [<ffffffff80261d93>] lock_acquire+0x55/0x71
[  435.633166]  [<ffffffff803691ac>] ? __percpu_counter_add+0x4a/0x6d
[  435.633170]  [<ffffffff80564434>] _spin_lock+0x2c/0x38
[  435.633175]  [<ffffffff803691ac>] ? __percpu_counter_add+0x4a/0x6d
[  435.633179]  [<ffffffff803691ac>] __percpu_counter_add+0x4a/0x6d
[  435.633184]  [<ffffffff804fc827>] percpu_counter_add+0xe/0x10
[  435.633188]  [<ffffffff804fc837>] percpu_counter_inc+0xe/0x10
[  435.633193]  [<ffffffff804fdc91>] tcp_close+0x157/0x2da
[  435.633197]  [<ffffffff8051907e>] inet_release+0x58/0x5f
[  435.633204]  [<ffffffff80527c48>] inet6_release+0x30/0x35
[  435.633213]  [<ffffffff804c9354>] sock_release+0x1a/0x76
[  435.633221]  [<ffffffff804c9804>] sock_close+0x22/0x26
[  435.633229]  [<ffffffff802a345a>] __fput+0x82/0x110
[  435.633234]  [<ffffffff802a381a>] fput+0x15/0x17
[  435.633239]  [<ffffffff802a09c5>] filp_close+0x67/0x72
[  435.633246]  [<ffffffff80240ae3>] close_files+0x66/0x8d
[  435.633251]  [<ffffffff80240b39>] put_files_struct+0x19/0x42
[  435.633256]  [<ffffffff80240b98>] exit_files+0x36/0x3b
[  435.633260]  [<ffffffff80241eec>] do_exit+0x1b7/0x2b1
[  435.633265]  [<ffffffff80242087>] sys_exit_group+0x0/0x14
[  435.633269]  [<ffffffff80242099>] sys_exit_group+0x12/0x14
[  435.633275]  [<ffffffff8020b9cb>] system_call_fastpath+0x16/0x1b
Afaict this is a real deadlock.

All the code around that percpu_counter_inc() in tcp_close() seems to
disabled softirqs, which suggests a softirq can really happen there.

So either we disable softirqs around the percpu op, or we move it a few
lines down, like:

---
Subject: net: fix tcp deadlock

Lockdep spotted that the percpu counter op takes a lock so that softirq
recursion deadlocks can occur. Delay the op until we've disabled
softirqs to avoid this.

Signed-off-by: Peter Zijlstra <redacted>
---
 net/ipv4/tcp.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 564d3a9..03eddf1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1835,12 +1835,10 @@ adjudge_to_death:
 	state = sk->sk_state;
 	sock_hold(sk);
 	sock_orphan(sk);
-	percpu_counter_inc(sk->sk_prot->orphan_count);
 
 	/* It is the last release_sock in its life. It will remove backlog. */
 	release_sock(sk);
 
-
 	/* Now socket is owned by kernel and we acquire BH lock
 	   to finish close. No need to check for user refs.
 	 */
@@ -1848,6 +1846,9 @@ adjudge_to_death:
 	bh_lock_sock(sk);
 	WARN_ON(sock_owned_by_user(sk));
 
+	/* account the orphan state now that we have softirqs disabled.  */
+	percpu_counter_inc(sk->sk_prot->orphan_count);
+
 	/* Have we already been destroyed by a softirq or backlog? */
 	if (state != TCP_CLOSE && sk->sk_state == TCP_CLOSE)
 		goto out;

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help