Thread (7 messages), 4 authors, 2016-07-06

RE: [PATCH net] net: poll tx timeout only on active tx queues

From: Yuval Mintz <hidden>
Date: 2016-07-06 06:43:00

quoted
quoted
quoted
Currently all device drivers call
netif_tx_start_all_queues(dev) on open to work around this issue, which is
strange since only real_num_tx_queues are active.
You could also argue that netif_tx_start_all_queues() should only
enable the real_num_tx_queues.
[Although that would obviously cause all drivers to reach the
'problem' you're currently fixing].
Yep. Basically what I pointed out.

It seems inconsistent to have loops using num_tx_queues, and others
using real_num_tx_queues.

Instead of 'fixing' one of them, we should take a deeper look, even if
the change looks fine.

num_tx_queues should be used in code that runs once, like
netdev_lockdep_set_classes(), but other loops should probably use
real_num_tx_queues.
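
To make the distinction concrete, here is a minimal userspace sketch in C.
It is not kernel code: struct model_dev, struct model_txq and every model_*
function are hypothetical names invented for illustration. It assumes the
timeout check treats a queue as hung when it is stopped and its last
transmit is older than the timeout. One-time setup walks all allocated
queues, while the recurring check takes its loop bound as a parameter so
both choices can be compared.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_TXQ 8

/* Hypothetical stand-in for per-queue Tx state. */
struct model_txq {
	bool initialized;          /* set once at allocation time */
	bool stopped;              /* queue is currently stopped */
	unsigned long trans_start; /* tick of the last transmit */
};

/* Hypothetical stand-in for a device with allocated vs. active queues. */
struct model_dev {
	unsigned int num_tx_queues;      /* queues allocated */
	unsigned int real_num_tx_queues; /* queues currently in use */
	struct model_txq txq[MODEL_MAX_TXQ];
};

/* Runs once: touches every allocated queue, so it uses num_tx_queues. */
static void model_init_queues(struct model_dev *dev)
{
	assert(dev->num_tx_queues <= MODEL_MAX_TXQ);
	for (unsigned int i = 0; i < dev->num_tx_queues; i++) {
		dev->txq[i].initialized = true;
		dev->txq[i].stopped = true; /* queues begin stopped */
		dev->txq[i].trans_start = 0;
	}
}

/* Starts only the queues that are in use. */
static void model_start_active_queues(struct model_dev *dev)
{
	assert(dev->real_num_tx_queues <= dev->num_tx_queues);
	for (unsigned int i = 0; i < dev->real_num_tx_queues; i++)
		dev->txq[i].stopped = false;
}

/*
 * Recurring check: counts queues below 'bound' that are stopped and whose
 * last transmit is older than 'timeout' ticks.
 */
static unsigned int model_count_timed_out(const struct model_dev *dev,
					  unsigned int bound,
					  unsigned long now,
					  unsigned long timeout)
{
	unsigned int hung = 0;

	assert(bound <= dev->num_tx_queues);
	for (unsigned int i = 0; i < bound; i++) {
		const struct model_txq *q = &dev->txq[i];

		if (q->stopped && now - q->trans_start > timeout)
			hung++;
	}
	return hung;
}
```

In this model, with 4 allocated and 2 active queues, passing
real_num_tx_queues as the bound reports no hung queues, while passing
num_tx_queues flags the two queues that were never started.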

Anyway all these changes should definitely target net-next, not net
tree.
But for the long term, you have a point.
We will consider a deeper fix for net-next as you suggested, and drop this
temporary fix.
I think we've actually managed to hit an issue with qede [& modified bnx2x]
due to netif_tx_start_all_queues() starting all Tx-queues:
while reducing the number of channels on an interface, the driver reloads,
after which the xmit function receives an SKB with a too-high txq.

Investigation seems to indicate that some TCP traffic arrived during the
reload, got enqueued on the qdisc with a high txq and then got transmitted
as-is after re-enabling tx.
[Removing the modulo from bnx2x's select_queue() led to the same issue.]
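
The sequence above can be modelled in a few lines of userspace C. This is
only a sketch under stated assumptions, not the qede or bnx2x code:
struct model_skb and the model_* helpers are hypothetical, and the re-map
at transmit time is one possible defensive check, not a fix proposed in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a packet carrying its chosen Tx queue index. */
struct model_skb {
	unsigned int queue_mapping;
};

/* Queue selection with a modulo, so the result is always in range. */
static unsigned int model_select_queue(unsigned int hash,
				       unsigned int real_num_tx_queues)
{
	assert(real_num_tx_queues > 0);
	return hash % real_num_tx_queues;
}

/* True if the stored index no longer fits the current active queue count. */
static bool model_txq_is_stale(const struct model_skb *skb,
			       unsigned int real_num_tx_queues)
{
	return skb->queue_mapping >= real_num_tx_queues;
}

/*
 * Transmit-time guard: a packet queued before the channel count shrank may
 * carry an index that is now too high, so fold it back into range.
 */
static unsigned int model_xmit_txq(const struct model_skb *skb,
				   unsigned int real_num_tx_queues)
{
	assert(real_num_tx_queues > 0);
	if (model_txq_is_stale(skb, real_num_tx_queues))
		return skb->queue_mapping % real_num_tx_queues;
	return skb->queue_mapping;
}
```

For example, a packet mapped with hash 13 while 8 queues are active gets
index 5. If the interface is then reduced to 4 queues before the packet
leaves the qdisc, index 5 is out of range, and the guard folds it to 1.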