Thread: 26 messages, 6 authors, 2019-06-19

Re: [RT WARNING] DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current) with fsfreeze (4.19.25-rt16)

From: Peter Zijlstra <peterz@infradead.org>
Date: 2019-06-19 09:50:55
Also in: lkml

Sorry, I seem to have missed this email.

On Mon, May 06, 2019 at 06:50:09PM +0200, Oleg Nesterov wrote:
> On 05/03, Peter Zijlstra wrote:
> > -static void lockdep_sb_freeze_release(struct super_block *sb)
> > -{
> > -	int level;
> > -
> > -	for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--)
> > -		percpu_rwsem_release(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> > -}
> > -
> > -/*
> > - * Tell lockdep we are holding these locks before we call ->unfreeze_fs(sb).
> > - */
> > -static void lockdep_sb_freeze_acquire(struct super_block *sb)
> > -{
> > -	int level;
> > -
> > -	for (level = 0; level < SB_FREEZE_LEVELS; ++level)
> > -		percpu_rwsem_acquire(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> > +	percpu_down_write_non_owner(sb->s_writers.rw_sem + level-1);
> >  }
>
> I'd suggest to not change fs/super.c, keep these helpers, and even not introduce
> xxx_write_non_owner().
>
> freeze_super() takes other locks, it calls sync_filesystem(), freeze_fs(), lockdep
> should know that this task holds SB_FREEZE_XXX locks for writing.

Bah, I so hate these games. But OK, I suppose.

> > @@ -80,14 +83,8 @@ int __percpu_down_read(struct percpu_rw_
> >  	 * and reschedule on the preempt_enable() in percpu_down_read().
> >  	 */
> >  	preempt_enable_no_resched();
> > -
> > -	/*
> > -	 * Avoid lockdep for the down/up_read() we already have them.
> > -	 */
> > -	__down_read(&sem->rw_sem);
> > +	wait_event(sem->waiters, !atomic_read(&sem->block));
> >  	this_cpu_inc(*sem->read_count);
>
> Argh, this looks racy :/
>
> Suppose that sem->block == 0 when wait_event() is called, iow the writer released
> the lock.
>
> Now suppose that this __percpu_down_read() races with another percpu_down_write().
> The new writer can set sem->block == 1 and call readers_active_check() in between,
> after wait_event() and before this_cpu_inc(*sem->read_count).

CPU0			CPU1			CPU2

percpu_up_write()
  sem->block = 0;

			__percpu_down_read()
			  wait_event(, !sem->block);

						percpu_down_write()
						  wait_event_exclusive(, xchg(sem->block,1)==0);
						  readers_active_check()

			  this_cpu_inc();

			  *whoopsy* reader while write owned.



I suppose we can 'patch' that by checking blocking again after we've
incremented, something like the below.

But looking at percpu_down_write() we have two wait_event*() on the same
queue back to back, which is 'odd' at best. Let me ponder that a little
more.


---
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -61,6 +61,7 @@ int __percpu_down_read(struct percpu_rw_
 	 * writer missed them.
 	 */
 
+again:
 	smp_mb(); /* A matches D */
 
 	/*
@@ -87,7 +88,13 @@ int __percpu_down_read(struct percpu_rw_
 	wait_event(sem->waiters, !atomic_read_acquire(&sem->block));
 	this_cpu_inc(*sem->read_count);
 	preempt_disable();
-	return 1;
+
+	/*
+	 * percpu_down_write() could've set ->blocked right after we've seen it
+	 * 0 but missed our this_cpu_inc(), which is exactly the condition we
+	 * get called for from percpu_down_read().
+	 */
+	goto again;
 }
 EXPORT_SYMBOL_GPL(__percpu_down_read);
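
For illustration only, the interleaving from the CPU0/CPU1/CPU2 diagram can be replayed deterministically in userspace. This is a rough sketch, not kernel code: struct model_sem, replay_race() and the single read_count standing in for the per-CPU counters are all hypothetical, and C11 atomics replace the kernel primitives. It only shows why rechecking ->block after the increment closes the window.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_sem {
	atomic_int block;	/* 1 while a writer owns or is acquiring */
	atomic_int read_count;	/* stands in for the summed per-CPU counters */
};

/*
 * Replay the interleaving from the diagram in a single thread.
 * Returns the number of readers still counted after the writer has
 * taken ownership: 1 means a reader got in while write owned.
 */
int replay_race(bool recheck)
{
	struct model_sem s;
	bool saw_unblocked, writer_owns;

	/* CPU0: percpu_up_write() has cleared block. */
	atomic_init(&s.block, 0);
	atomic_init(&s.read_count, 0);

	/* CPU1: wait_event() observes block == 0 and falls through. */
	saw_unblocked = (atomic_load(&s.block) == 0);
	assert(saw_unblocked);

	/* CPU2: the writer sets block, then finds no readers counted. */
	writer_owns = (atomic_exchange(&s.block, 1) == 0) &&
		      (atomic_load(&s.read_count) == 0);
	assert(writer_owns);

	/* CPU1: the increment lands after the writer's check. */
	atomic_fetch_add(&s.read_count, 1);

	/* The 'patch': look at block again and back out if it is set. */
	if (recheck && atomic_load(&s.block) != 0)
		atomic_fetch_sub(&s.read_count, 1);

	return atomic_load(&s.read_count);
}
```

In this model replay_race(false) leaves one reader counted while the writer owns the lock, and replay_race(true) leaves none, since the reader backs out and would go back to waiting.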
 