Thread (8 messages) 8 messages, 3 authors, 2021-07-22

Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.

From: Leon Romanovsky <leon@kernel.org>
Date: 2021-07-22 12:57:46

On Thu, Jul 22, 2021 at 09:06:07AM -0300, Jason Gunthorpe wrote:
On Thu, Jul 22, 2021 at 10:43:00AM +0300, Leon Romanovsky wrote:
quoted
On Wed, Jul 21, 2021 at 04:51:55PM +0530, Dakshaja Uppalapati wrote:
quoted
Previous atomic increment decrement logic expects the atomic count
to be '0' after the final decrement. Replacing atomic count with
refcount does not allow that as refcount_dec() considers count of 1 as
underflow. Therefore fix the current refcount logic by decrementing
the refcount if one on the final deref in c4iw_destroy_cq().

Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting")
Signed-off-by: Dakshaja Uppalapati <redacted>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
 drivers/infiniband/hw/cxgb4/cq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Thanks, 
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

We have plenty of such errors, worth to check them:
➜  kernel git:(rdma-next) git grep refcount_read drivers/infiniband/ | grep -v WARN_ON
drivers/infiniband/core/device.c:	if (!refcount_read(&ib_dev->refcount))
drivers/infiniband/core/device.c:	if (refcount_read(&device->refcount) == 0 ||
drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
drivers/infiniband/core/ucma.c:	if (refcount_read(&ctx->ref))
drivers/infiniband/hw/cxgb4/cq.c:	wait_event(chp->wait, !refcount_read(&chp->refcnt));
drivers/infiniband/hw/irdma/utils.c:			   refcount_read(&cqp_request->refcnt) == 1, 1000);
drivers/infiniband/hw/mlx5/mlx5_ib.h:	wait_event(mmkey->wait, refcount_read(&mmkey->usecount) == 0);
drivers/infiniband/hw/mlx5/mr.c:	    refcount_read(&mr->mmkey.usecount) != 0 &&
It isn't the read that is the problem, it is the naked dec.

This common pattern is just being done in an obtuse and arguably wrong
way

It is supposed to look like this:

void ib_device_put(struct ib_device *device)
{
        if (refcount_dec_and_test(&device->refcount))
                complete(&device->unreg_completion);
}

[..]

        ib_device_put(device);
        wait_for_completion(&device->unreg_completion);


Not use refcount_read() and a naked work queue

So I'd say these ones should be looked at:

drivers/infiniband/hw/cxgb4/cq.c:       refcount_dec(&chp->refcnt);
drivers/infiniband/hw/hns/hns_roce_db.c:        refcount_dec(&db->u.user_page->refcount);
drivers/infiniband/hw/irdma/cm.c:       refcount_dec(&cm_node->refcnt);
drivers/infiniband/hw/irdma/cm.c:               refcount_dec(&listener->refcnt);
drivers/infiniband/hw/irdma/cm.c:                       refcount_dec(&listener->refcnt);
Jason,

We are talking about two different issues that this refcount_read patch pointed.
You are focused on wrong usage of completion, I saw useless compare of
refcount_t with 0 that can't be.

I prepared series that cleans iwpm from this type of error:
 drivers/infiniband/core/iwpm_util.c:        if (!refcount_read(&iwpm_admin.refcount)) {
 drivers/infiniband/core/iwpm_util.c:        if (!refcount_read(&iwpm_admin.refcount)) {

Thanks
Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help