Thread: 51 messages, 13 authors, 2005-03-31

Re: netif_rx packet dumping

From: Stephen Hemminger <hidden>
Date: 2005-03-03 21:21:43

On 03 Mar 2005 16:18:08 -0500
jamal [off-list ref] wrote:
On Thu, 2005-03-03 at 15:55, David S. Miller wrote:
On Thu, 3 Mar 2005 12:38:11 -0800
Stephen Hemminger [off-list ref] wrote:
The existing throttling algorithm causes all packets to be dumped
(until the queue empties) when the packet backlog reaches
netdev_max_backlog. I suppose this is some kind of DoS prevention
mechanism. The problem is that this dumping action creates multiple
packet losses, which force TCP back to slow start.

But all this is really moot for the case of any reasonably high speed
device because of NAPI. netif_rx is not even used for any device that
uses NAPI.  The NAPI code path uses netif_receive_skb and the receive
queue management is done by the receive scheduling (dev->quota) of the
rx_scheduler.
Even without NAPI, netif_rx() ends up using the quota etc. mechanisms
when the queue gets processed via process_backlog().

ksoftirqd should handle cpu starvation issues at a higher level.

I think it is therefore safe to remove the netdev_max_backlog stuff
altogether.  "300" is such a nonsense setting, especially for gigabit
drivers which aren't using NAPI for whatever reason.  It's even low
for a system with 2 100Mbit devices.
A couple of issues with this:
- the rx softirq uses netdev_max_backlog as a constraint on how long to
run before yielding. Could probably fix by having a different variable.
It may be fair to decouple those two in any case.
- if you don't put a restriction on how many netif_rx packets get queued
then it is more than likely you will run into an OOM case for non-NAPI
drivers under interrupt overload. Could probably resolve this by
increasing the backlog size to several TCP window sizes (handwaving:
2?). What would be the optimal TCP window size in these big fat pipes
assuming real low RTT?
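
To put rough numbers on the "several TCP window sizes" suggestion, here is a
back-of-the-envelope sketch. It assumes the window that fills a pipe is about
one bandwidth-delay product (link rate times RTT) and converts that to a
packet count. The helper names bdp_bytes() and backlog_packets() are
hypothetical, made up for this illustration; nothing like them exists in the
kernel, and a real backlog is shared by every flow on the CPU, so treat the
result as an order-of-magnitude guide only.

```c
#include <stdio.h>

/*
 * Illustrative arithmetic only; these helpers are hypothetical.
 *
 * Bandwidth-delay product in bytes: link rate (bits/s) / 8 * RTT (s).
 * RTT is passed in microseconds to keep everything in integers.
 */
static unsigned long long bdp_bytes(unsigned long long bits_per_sec,
                                    unsigned int rtt_usec)
{
    return bits_per_sec / 8ULL * rtt_usec / 1000000ULL;
}

/*
 * Number of packets of pkt_bytes each needed to hold `windows`
 * BDP-sized TCP windows, rounded up.
 */
static unsigned long long backlog_packets(unsigned long long bits_per_sec,
                                          unsigned int rtt_usec,
                                          unsigned int pkt_bytes,
                                          unsigned int windows)
{
    unsigned long long bytes = bdp_bytes(bits_per_sec, rtt_usec) * windows;

    return (bytes + pkt_bytes - 1) / pkt_bytes;
}
```

Under these assumptions, backlog_packets(1000000000ULL, 1000, 1500, 2)
(1 Gbit/s, 1 ms RTT, full-size Ethernet frames, two windows) comes to 167
packets, and the same link at 10 ms RTT comes to 1667. So for a single flow
on a really low RTT path, a fixed 300 is in the right range, but it falls
short quickly as the RTT or the number of flows grows.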

I would say whoever is worried about this should use a NAPI driver;
otherwise you don't deserve that pipe!
My plan is to keep netdev_max_backlog but bump it up to something bigger
by default. Maybe even autosize it based on memory available.  But
get rid of the "dump till empty" behaviour that screws over TCP.
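
To illustrate why "dump till empty" hurts more than a plain limit, here is a
small user-space model of the two policies. This is a toy, not the kernel
code: struct backlog, backlog_enqueue(), backlog_dequeue() and
longest_drop_burst() are hypothetical names, and the traffic pattern (120
arrivals per round, 100 packets drained per round, limit of 300) is an
assumption picked only to make the queue overflow.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a receive backlog; not the kernel's data structure. */
struct backlog {
    size_t qlen;              /* packets currently queued */
    size_t max;               /* backlog limit */
    bool throttle;            /* set while dumping until empty */
    bool dump_till_empty;     /* true: old policy, false: tail drop */
    unsigned long cur_burst;  /* current run of consecutive drops */
    unsigned long max_burst;  /* longest run of consecutive drops */
};

/* Try to queue one packet; returns true if it was accepted. */
static bool backlog_enqueue(struct backlog *q)
{
    if (q->qlen == 0)
        q->throttle = false;  /* queue drained, start accepting again */

    if (q->throttle || q->qlen >= q->max) {
        if (q->dump_till_empty)
            q->throttle = true;  /* keep dropping until the queue is empty */
        if (++q->cur_burst > q->max_burst)
            q->max_burst = q->cur_burst;
        return false;
    }

    q->cur_burst = 0;
    q->qlen++;
    return true;
}

/* Drain up to n packets, as one softirq pass would. */
static void backlog_dequeue(struct backlog *q, size_t n)
{
    q->qlen -= (n < q->qlen) ? n : q->qlen;
}

/* Run the assumed traffic pattern and report the longest drop burst. */
static unsigned long longest_drop_burst(bool dump_till_empty)
{
    struct backlog q = { .max = 300, .dump_till_empty = dump_till_empty };

    for (int round = 0; round < 50; round++) {
        for (int i = 0; i < 120; i++)
            backlog_enqueue(&q);
        backlog_dequeue(&q, 100);
    }
    return q.max_burst;
}
```

With this assumed load, longest_drop_burst(true) reports a run of 260
back-to-back drops, while longest_drop_burst(false) reports 20. A long run of
consecutive losses is what pushes a TCP sender into a timeout and back to
slow start, whereas short bursts are more likely to be handled by fast
retransmit.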