Thread (33 messages), 6 authors, 2016-12-04

Re: Regression: [PATCH] mlx4: give precise rx/tx bytes/packets counters

From: Saeed Mahameed <hidden>
Date: 2016-12-01 16:34:13

On Thu, Dec 1, 2016 at 5:55 PM, Eric Dumazet [off-list ref] wrote:
On Thu, 2016-12-01 at 17:38 +0200, Saeed Mahameed wrote:
quoted
Hi Eric, Thanks for the patch, I already acked it.
Thanks !
quoted
I have one educational question (not related to this patch, but
related to stats reading in general).
I was wondering why we need to disable bh every time we read stats
("spin_lock_bh"). Is it essential?

I checked and in mlx4 we don't hold stats_lock in softirq
(en_rx.c/en_tx.c), so I don't see any deadlock risk in here..
Excellent question, and I chose to keep the spinlock.

That would be doable, only if we do not overwrite dev->stats.

Current code is :

static struct rtnl_link_stats64 *
mlx4_en_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
{
        struct mlx4_en_priv *priv = netdev_priv(dev);

        spin_lock_bh(&priv->stats_lock);
        mlx4_en_fold_software_stats(dev);
        netdev_stats_to_stats64(stats, &dev->stats);
        spin_unlock_bh(&priv->stats_lock);

        return stats;
}

If you remove the spin_lock_bh() :


static struct rtnl_link_stats64 *
mlx4_en_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
{
        struct mlx4_en_priv *priv = netdev_priv(dev);

        mlx4_en_fold_software_stats(dev); // possible races

        netdev_stats_to_stats64(stats, &dev->stats);

        return stats;
}

1) one mlx4_en_fold_software_stats(dev) could be preempted
on a CONFIG_PREEMPT kernel, or interrupted by long irqs.

2) Another cpu would also call mlx4_en_fold_software_stats(dev) while
   first cpu is busy.

3) Then when resuming first cpu/thread, part of the dev->stats fields
would be updated with 'old counters',
while another thread might have updated them with newer values.

4) A SNMP reader could then get counters that are not monotonically
increasing,
which would be confusing/buggy.
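
The four steps above can be replayed deterministically in user space. This is only a rough sketch: it assumes a fold can be split into a "read" half and a "publish" half, and every name in it (ring_counter, shared_stats, fold_read, fold_publish, replay_stale_fold) is hypothetical, not taken from the mlx4 driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the mlx4 driver. */
struct ring_counter { uint64_t packets; };    /* live counter, bumped by the datapath */
struct shared_stats { uint64_t rx_packets; }; /* plays the role of dev->stats */

/* First half of a fold: sample the live counter. */
static uint64_t fold_read(const struct ring_counter *ring)
{
        return ring->packets;
}

/* Second half of a fold: publish the sampled value to the shared struct. */
static void fold_publish(struct shared_stats *stats, uint64_t sampled)
{
        stats->rx_packets = sampled;
}

/*
 * Replay steps 1-4 in a single thread, in the order they would occur.
 * Returns 1 if the second reading is lower than the first.
 */
static int replay_stale_fold(uint64_t *first, uint64_t *second)
{
        struct ring_counter ring = { .packets = 100 };
        struct shared_stats stats = { .rx_packets = 0 };
        uint64_t stale, fresh;

        assert(first && second);

        stale = fold_read(&ring);      /* 1) reader A samples, then is preempted */
        ring.packets += 50;            /*    traffic keeps arriving meanwhile */
        fresh = fold_read(&ring);      /* 2) reader B runs a complete fold */
        fold_publish(&stats, fresh);
        *first = stats.rx_packets;     /*    a poller sees B's value */

        fold_publish(&stats, stale);   /* 3) reader A resumes, publishes its old sample */
        *second = stats.rx_packets;    /* 4) the next poll goes backwards */

        return *second < *first;
}
```

With the values chosen here, the first poll reads 150 and the second reads 100, which is the non-monotonic sequence a SNMP reader would trip over.
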

So removing the spinlock is doable, but it requires adding a new parameter
to mlx4_en_fold_software_stats() and calling netdev_stats_to_stats64()
before mlx4_en_fold_software_stats(dev):

static struct rtnl_link_stats64 *
mlx4_en_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
{
        struct mlx4_en_priv *priv = netdev_priv(dev);

        netdev_stats_to_stats64(stats, &dev->stats);

        // Passing a non NULL stats asks mlx4_en_fold_software_stats()
        // to not update dev->stats, but stats directly.

        mlx4_en_fold_software_stats(dev, stats);


        return stats;
}
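
For illustration, the "fold into the caller's buffer" idea can be modelled in user space roughly as below. This is a sketch under assumptions, not the actual driver change: fake_dev, link_stats, fold_software_stats and get_stats are hypothetical names, and only two software counters plus one unrelated field are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; they only mimic the shape of the real structures. */
struct ring_counters { uint64_t packets; uint64_t bytes; };
struct link_stats { uint64_t rx_packets; uint64_t rx_bytes; uint64_t rx_errors; };

struct fake_dev {
        struct link_stats stats;              /* shared, like dev->stats */
        const struct ring_counters *rx_rings; /* per-ring software counters */
        size_t nr_rx_rings;
};

/*
 * Sum the per-ring counters. A non-NULL 'out' receives the totals and
 * dev->stats is left untouched; a NULL 'out' keeps the old behaviour.
 */
static void fold_software_stats(struct fake_dev *dev, struct link_stats *out)
{
        struct link_stats *dst = out ? out : &dev->stats;
        uint64_t packets = 0, bytes = 0;
        size_t i;

        for (i = 0; i < dev->nr_rx_rings; i++) {
                packets += dev->rx_rings[i].packets;
                bytes += dev->rx_rings[i].bytes;
        }
        dst->rx_packets = packets;
        dst->rx_bytes = bytes;
}

/* Reader path: copy the shared fields first, then fold into the copy. */
static void get_stats(struct fake_dev *dev, struct link_stats *out)
{
        assert(dev && out);
        *out = dev->stats;              /* stands in for netdev_stats_to_stats64() */
        fold_software_stats(dev, out);  /* overwrites only the folded fields in 'out' */
}
```

The copy has to come first so that the folded totals are not overwritten by whatever older values sit in the shared structure. Each reader then works on its own buffer, so a delayed reader can no longer push stale totals into state that other readers see.
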
Thanks for the detailed answer !!

BTW you went 5 steps ahead of my original question :)), so far you
already have a patch without locking at all (really impressive).

What I wanted to ask originally was about the "_bh". I didn't
mean to completely remove the "spin_lock_bh".
I meant, what happens if we replace "spin_lock_bh" with "spin_lock",
without disabling bh?
I guess a raw "spin_lock" handles points (2 to 4) from above, but it
won't handle long irqs.