Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Hannes Frederic Sowa <hidden>
Date: 2016-04-01 04:33:25
On 01.04.2016 06:26, Alexei Starovoitov wrote:
On Fri, Apr 01, 2016 at 06:12:49AM +0200, Hannes Frederic Sowa wrote:quoted
On 01.04.2016 06:04, Alexei Starovoitov wrote:quoted
On Thu, Mar 31, 2016 at 08:03:38PM -0700, Eric Dumazet wrote:quoted
On Thu, 2016-03-31 at 18:45 -0700, Alexei Starovoitov wrote:quoted
Eric, what's your take on Hannes's patch 2 ? Is it more accurate to ask lockdep to check for actual lock or lockdep can rely on owned flag? Potentially there could be races between setting the flag and actual lock... but that code is contained, so unlikely. Will we find the real issues with this 'stronger' check or just spend a ton of time adapting to new model like your other patch for release_sock and whatever may need to come next...More precise lockdep checks are certainly good, I only objected to 4/4 trying to work around another bug. But why do we rush for 'net' tree ? This looks net-next material to me. Locking changes are often subtle, lets take the time to do them properly.completely agree. I think only first patch belongs in net. Everything else is net-next material.Problem with first patch is that it uses lock_sock_fast, thus the current sock_owned_by_user check doesn't get rid the lockdep warning. :/ Thus we would need to go with the two first patches. Do you think it is acceptable? I actually didn't see a problem and testing showed no problems so far.I see. right. the patch 1 only makes sense when coupled with 2. but now I'm not so sure that lockdep_is_held(&sk->sk_lock.slock) is a valid check, since current sock_owned_by_user() is equivalent to lockdep_is_held(&sk->sk_lock) only. I would go with Daniel's approach. Much simpler to reason about.
IMHO we should treat sk_lock and sk_lock.slock the same as they are encapsulated by socket lock api. I was rather afraid that we call those changed functions from within release_sock and thus would have the same problem again, where we get splats because of the time where we actually have user ownership but not the mark in the lockdep data structures. But this seems not to be the case as the functions are only directly called on behalf of user space. Daniel, what do you think? I would be fine with your patch for net and we clean this up a bit in net-next then. Bye, Hannes