Thread (288 messages) 288 messages, 13 authors, 2014-04-01

Re: mlx4 netpoll and rx/tx weirdness

From: Eric Dumazet <hidden>
Date: 2014-03-16 16:17:04

On Sat, 2014-03-15 at 13:01 -0700, Eric W. Biederman wrote:
Given this comment I have to ask:  How insane is the mellanox mlx4
driver that has separate rx and tx queues, separate rx and tx interrupts
and uses separate napi bottom halves to process each, and honors the
budget passed into it's rx napi handler.
I believe this driver had TX completion from hard irq, and NAPI polling
for RX.

This was changed quite recently to also do TX completion from softirq
handler.
quoted hunk ↗ jump to hunk
Right now the mlx4 code disables interrupts and calls napi_synchronize
in it's netpoll routine instead of just scheduling the napi bottom
halves as all of the sane drivers do.  napi_synchronize trying to sleep
in hard irq context is pretty terrible, but I can see roughly how that
should be fixed.
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index fa5ee719e04b..2e6fded14e60 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1302,16 +1302,10 @@ out:
 static void mlx4_en_netpoll(struct net_device *dev)
 {
        struct mlx4_en_priv *priv = netdev_priv(dev);
-       struct mlx4_en_cq *cq;
-       unsigned long flags;
        int i;
 
        for (i = 0; i < priv->rx_ring_num; i++) {
-               cq = priv->rx_cq[i];
-               spin_lock_irqsave(&cq->lock, flags);
-               napi_synchronize(&cq->napi);
-               mlx4_en_process_rx_cq(dev, cq, 0);
-               spin_unlock_irqrestore(&cq->lock, flags);
+               napi_schedule(&priv->tx_cq[i]->napi);
        }
 }
Wait a minute, I thought ndo_poll_controller() had to be synchronous,
not schedule a napi ?
What I can't see is what is a clean thing to do with the mlx4 tx bottom
napi bottom half.  As it won't processes the tx cq when I pass in a
budget of 0.
Just ignore the budget to drain tx queues, as David said.
What I can't see is how to prevent a netpoll stalling after enough
packets are transmitted from hard irq context say with sysrq-l on
a machine with enough cpus to fill the tx queue.
Is this specific to mlx4 ?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help