Re: [PATCH] bpf: convert hashtab lock to raw lock
From: Shi, Yang <hidden>
Date: 2015-11-02 17:12:34
Also in:
linux-rt-users, lkml
On 10/31/2015 11:37 AM, Daniel Borkmann wrote:
On 10/31/2015 02:47 PM, Steven Rostedt wrote:
On Fri, 30 Oct 2015 17:03:58 -0700 Alexei Starovoitov [off-list ref] wrote:
On Fri, Oct 30, 2015 at 03:16:26PM -0700, Yang Shi wrote:
When running bpf samples on rt kernel, it reports the below warning:

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 477, name: ping
Preemption disabled at:[<ffff80000017db58>] kprobe_perf_func+0x30/0x228
...
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 83c209d..972b76b 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -17,7 +17,7 @@ struct bpf_htab {
 	struct bpf_map map;
 	struct hlist_head *buckets;
-	spinlock_t lock;
+	raw_spinlock_t lock;

How do we address such things in general? I bet there are tons of places around the kernel that call spin_lock from atomic. I'd hate to lose the benefits of lockdep of non-raw spin_lock just to make rt happy.

You won't lose any benefits of lockdep. Lockdep still checks raw_spin_lock(). The only difference between raw_spin_lock and spin_lock is that in -rt spin_lock turns into an rt_mutex() and raw_spin_lock stays a spin lock.

( Btw, Yang, would have been nice if your commit description would have already included such info, not only that you convert it, but also why it's okay to do so. )
I think Thomas's document will include all the information about rt spin lock, raw spin lock, etc. Alexei & Daniel, if you think such info is necessary, I can certainly add it to the commit log in v2.
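If it helps for the v2 commit log, the reasoning can be sketched with a toy userspace model. This is only an illustration, not kernel code: every model_* name and both counters are made up for this example. They stand in for preempt_disable()/in_atomic() and for the check that prints the "sleeping function called from invalid context" warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */

static int preempt_count;        /* > 0 models "atomic", as after preempt_disable() */
static int sleep_in_atomic_bugs; /* times a sleeping lock was taken while atomic */

static void model_preempt_disable(void) { preempt_count++; }
static void model_preempt_enable(void)  { preempt_count--; }
static bool model_in_atomic(void)       { return preempt_count > 0; }

/* Models spin_lock() on -rt: backed by an rt_mutex, so it may sleep. */
static void model_rt_spin_lock(void)
{
	if (model_in_atomic())
		sleep_in_atomic_bugs++; /* stands in for the BUG splat */
}

/* Models raw_spin_lock() on -rt: still a spinning lock, never sleeps. */
static void model_raw_spin_lock(void)
{
	/* Nothing to check: spinning is allowed in atomic context. */
}

/*
 * Models a probe handler that takes the map lock with preemption
 * disabled. Returns how many sleep-in-atomic bugs were recorded.
 */
int model_run_probe(bool use_raw_lock)
{
	sleep_in_atomic_bugs = 0;

	model_preempt_disable();
	if (use_raw_lock)
		model_raw_spin_lock();
	else
		model_rt_spin_lock();
	model_preempt_enable();

	assert(preempt_count == 0); /* disable/enable must stay balanced */
	return sleep_in_atomic_bugs;
}
```

In this model, model_run_probe(false) records one bug and model_run_probe(true) records none, which is the whole argument for the conversion: the lock is taken from a context that is already atomic, so it has to be a lock that cannot sleep.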
The error is that in -rt, you called a mutex and not a spin lock while atomic.

You are right, I think this happens due to the preempt_disable() in the trace_call_bpf() handler. So, I think the patch seems okay. The dep_map is btw union'ed in the struct spinlock case to the same offset of the dep_map from raw_spinlock. It's a bit inconvenient, though, when we add other library code as maps in future, f.e. things like rhashtable as they would first need to be converted to raw_spinlock_t as well, but judging from the git log, it looks like common practice.
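The dep_map overlay mentioned above can be checked with a small userspace sketch. The struct definitions below are simplified stand-ins modeled on the lockdep-enabled layout in include/linux/spinlock_types.h; the contents of lockdep_map and arch_spinlock_t and the helper dep_map_offsets_match() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; their contents are assumed. */
struct lockdep_map { const char *name; void *key; };
typedef struct { unsigned int slock; } arch_spinlock_t;

typedef struct raw_spinlock {
	arch_spinlock_t raw_lock;
	unsigned int magic, owner_cpu;
	void *owner;
	struct lockdep_map dep_map;
} raw_spinlock_t;

/* Bytes that precede dep_map inside raw_spinlock. */
#define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))

typedef struct spinlock {
	union {
		struct raw_spinlock rlock;
		struct {
			unsigned char padding[LOCK_PADSIZE];
			struct lockdep_map dep_map; /* overlays rlock.dep_map */
		};
	};
} spinlock_t;

/* Compile-time check: both dep_map members sit at the same offset. */
_Static_assert(offsetof(spinlock_t, dep_map) ==
	       offsetof(raw_spinlock_t, dep_map),
	       "dep_map must overlay rlock.dep_map");

/* Runtime version of the same check; returns 1 when the offsets agree. */
int dep_map_offsets_match(void)
{
	spinlock_t lock = { 0 };

	/* Writing through one view must be visible through the other. */
	lock.rlock.dep_map.name = "htab";
	assert(lock.dep_map.name == lock.rlock.dep_map.name);

	return offsetof(spinlock_t, dep_map) ==
	       offsetof(raw_spinlock_t, dep_map);
}
```

Because the two dep_map members share storage, lockdep sees the same map whether the lock is reached as a spinlock_t or through its embedded raw_spinlock, which fits the point that the conversion does not cost lockdep coverage.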
Yes, converting a sleepable spin lock to a raw spin lock is common practice in -rt to avoid the scheduling-in-atomic-context bug.

Thanks,
Yang
Thanks, Daniel