Re: [PATCH for-next v6 1/8] RDMA/rxe: Replace RB tree by xarray for indexes
From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2021-12-09 00:18:59
On Wed, Dec 08, 2021 at 06:16:21PM -0600, Bob Pearson wrote:
On 12/7/21 13:09, Jason Gunthorpe wrote:quoted
On Mon, Dec 06, 2021 at 03:12:36PM -0600, Bob Pearson wrote:quoted
if (pool->flags & RXE_POOL_INDEX) { - pool->index.tree = RB_ROOT; - err = rxe_pool_init_index(pool, info->max_index, - info->min_index); - if (err) - goto out; + xa_init_flags(&pool->xarray.xa, XA_FLAGS_ALLOC); + pool->xarray.limit.max = info->max_index; + pool->xarray.limit.min = info->min_index; + } else { + /* if pool not indexed just use xa spin_lock */ + spin_lock_init(&pool->xarray.xa.xa_lock);xarray's don't cost anything to init, so there is no reason to do something like this.OKquoted
quoted
+/* drop a reference to an object */ +static inline bool __rxe_drop_ref(struct rxe_pool_elem *elem) +{ + bool ret; + + rxe_pool_lock_bh(elem->pool); + ret = kref_put(&elem->ref_cnt, rxe_elem_release); + rxe_pool_unlock_bh(elem->pool);This is a bit strange, why does something need to hold a lock around a kref?This also relates to your comment on 8/8 patch. There seems to be a race opportunity between the call to kref_put(&obj->elem, rxe_elem_release) and the call in rxe_elem_release() to xa_erase(). If a duplicate or delayed packet arrives after the the final kref_put() and before the xa_erase() one can still lookup the object from its index (qpn, rkey, etc.) and take a reference to it. The use of kref_get_unless_zero and locking around kref_put and __xa_erase was an attempt to fix this. Once you call kref_put with the ref count going to zero there is no way to prevent the object getting its cleanup routine called.
This is why you use kref_get_not_zero only during the xa lookup, then during that race it returns failure Costs nothing as the atomics are already all required. Jason