Re: RFC: issues concerning the next NAPI interface

From: David Miller <davem@davemloft.net>
Date: 2007-08-24 21:38:07
Also in: linuxppc-dev, lkml

From: Jan-Bernd Themann <redacted>
Date: Fri, 24 Aug 2007 15:59:16 +0200

1) The current implementation of netif_rx_schedule, netif_rx_complete
   and net_rx_action has the following problem: netif_rx_schedule
   sets the NAPI_STATE_SCHED flag and adds the NAPI instance to the poll_list.
   net_rx_action checks NAPI_STATE_SCHED and, if it is set, adds the device
   to the poll_list again (as well). netif_rx_complete clears NAPI_STATE_SCHED.
   If an interrupt handler calls netif_rx_schedule on CPU 2
   after netif_rx_complete has been called on CPU 1 (and the poll function
   has not returned yet), the NAPI instance will be added twice to the
   poll_list (by netif_rx_schedule and net_rx_action). Problems occur when
   netif_rx_complete is then called twice for the device (BUG() is called).

Indeed, this is the "who should manage the list" problem.
Probably the answer is that whoever transitions the NAPI_STATE_SCHED
bit from cleared to set should do the list addition.

Patches welcome :-)
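
To make the ownership rule above concrete, here is a minimal userspace
sketch, not kernel code. Every name in it (napi_sketch, poll_list_sketch,
sketch_schedule, sketch_complete, SKETCH_STATE_SCHED) is hypothetical and
only models the idea; C11 atomics stand in for the kernel's bit operations,
and the list is assumed to be touched by one context at a time, so no list
locking is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_STATE_SCHED 1u   /* models NAPI_STATE_SCHED (assumed name) */

/* Hypothetical stand-in for a NAPI instance; not the kernel's struct. */
struct napi_sketch {
    atomic_uint state;
    struct napi_sketch *next;
};

/* Hypothetical stand-in for the poll_list. */
struct poll_list_sketch {
    struct napi_sketch *head;
    int length;
};

static void sketch_init(struct napi_sketch *n)
{
    atomic_init(&n->state, 0u);
    n->next = NULL;
}

/*
 * Atomically set the SCHED bit. Only the caller that saw the bit clear
 * (the cleared-to-set transition) adds the instance to the list; every
 * other caller returns false and leaves the list alone.
 */
static bool sketch_schedule(struct poll_list_sketch *pl, struct napi_sketch *n)
{
    unsigned int old = atomic_fetch_or(&n->state, SKETCH_STATE_SCHED);

    if (old & SKETCH_STATE_SCHED)
        return false;           /* someone else already owns the addition */

    n->next = pl->head;
    pl->head = n;
    pl->length++;
    return true;
}

/* Unlink the instance first, then clear the bit so it can be rescheduled. */
static void sketch_complete(struct poll_list_sketch *pl, struct napi_sketch *n)
{
    struct napi_sketch **pp = &pl->head;

    /* Completing an unscheduled instance is the bug case from the report. */
    assert(atomic_load(&n->state) & SKETCH_STATE_SCHED);

    while (*pp && *pp != n)
        pp = &(*pp)->next;
    if (*pp) {
        *pp = n->next;
        pl->length--;
    }
    n->next = NULL;
    atomic_fetch_and(&n->state, ~SKETCH_STATE_SCHED);
}
```

With this rule a second sketch_schedule() call on an already scheduled
instance is a no-op, so the instance can be on the list at most once no
matter how many contexts try to schedule it. The poll loop would then never
re-add an instance itself; it would rely on the bit owner having done so.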

3) On modern systems the incoming packets are processed very fast. Especially
   on SMP systems when we use multiple queues we process only a few packets
   per napi poll cycle. So NAPI does not work very well here and the interrupt 
   rate is still high. What we need would be some sort of timer polling mode 
   which will schedule a device after a certain amount of time for high load 
   situations. With high precision timers this could work well. Current
   usual timers are too slow. A finer granularity would be needed to keep the
   latency down (and queue length moderate).

This is why minimal levels of HW interrupt mitigation should be enabled
in your chip.  If it does not support this, you will indeed need to look
into using high resolution timers or other schemes to alleviate this.

I do not think it deserves a generic core networking helper facility,
the chips that can't mitigate interrupts are few and obscure.
