Thread: 24 messages, 7 authors, 2005-04-01

Re: [RFC] netif_rx: receive path optimization

From: jamal <hidden>
Date: 2005-03-31 21:38:04

On Thu, 2005-03-31 at 16:24, Rick Jones wrote:
> > The repercussions of going from a per-CPU-for-all-devices queue
> > (introduced by softnet) to per-device-for-all-CPUs may be huge in my
> > opinion, especially on SMP. A closer view of what's there now may be
> > a per-device-per-CPU backlog queue.
> > I think performance will be impacted on all devices. IMO, whatever needs
> > to go in needs to have some experimental data to back it.
>
> Indeed.
>
> At the risk of again chewing on my toes (yum), if multiple CPUs are pulling
> packets from the per-device queue there will be packet reordering.
;-> This happens already _today_ on Linux on non-NAPI.

Take the following scenario in non-NAPI. 
- packet 1 arrives
- interrupt happens, NIC bound to CPU0
- in the meantime packets 2,3 arrive
- 3 packets put on queue for CPU0
- interrupt processing done

- packet 4 arrives, interrupt, CPU1 is bound to NIC
- in the meantime packets 5,6 arrive
- CPU1 backlog queue used.
- interrupt processing done

Assume CPU0 is overloaded with other system work and CPU1 rx processing
kicks in first ...
TCP sees packets 4, 5, 6 before 1, 2, 3 ..
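
The scenario above can be sketched as a toy simulation. This is only a
minimal model under stated assumptions, not kernel code: the names
`simulate`, `arrivals` and `service_order` are made up for illustration,
and each CPU's backlog is modelled as a plain FIFO that is drained to
completion whenever that CPU gets around to rx processing.

```python
from collections import deque

def simulate(arrivals, service_order):
    """Toy model of per-CPU backlog queues (hypothetical helper).

    arrivals:      list of (packet_id, cpu) in wire order; cpu is the CPU
                   that took the interrupt and queued the packet.
    service_order: order in which the CPUs drain their backlog queues.
    Returns the packet ids in the order the stack would see them.
    """
    backlog = {}
    for pkt, cpu in arrivals:
        # Each packet lands on the backlog of the interrupted CPU.
        backlog.setdefault(cpu, deque()).append(pkt)

    delivered = []
    for cpu in service_order:
        queue = backlog.get(cpu, deque())
        while queue:
            # Within one CPU's queue, FIFO order is preserved.
            delivered.append(queue.popleft())
    return delivered

# Packets 1-3 are queued on CPU0, packets 4-6 on CPU1.
arrivals = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]

print(simulate(arrivals, service_order=[0, 1]))  # [1, 2, 3, 4, 5, 6]
print(simulate(arrivals, service_order=[1, 0]))  # [4, 5, 6, 1, 2, 3]
```

Order holds inside each queue; the reordering comes purely from which
CPU drains its backlog first.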

Note Linux is quite resilient to reordering compared to other OSes (as
you may know) but avoiding this is a better approach - hence my
suggestion to use NAPI when you want to do serious TCP.

Of course NAPI is no panacea: under low traffic it eats a little
bit more CPU (but if you have CPU issues under low load you are in some
other deep shit)
> HP-UX 10.0 did just that and it was quite nasty even at low CPU counts
> (<=4). It was changed by HP-UX 10.20 (ca 1995) to per-CPU queues with
> queue selection computed from packet headers (hash the IP and TCP/UDP
> header to pick a CPU). It was called IPS for Inbound Packet Scheduling.
> 11.0 (ca 1998) later changed that to "find where the connection last ran
> and queue to that CPU". That was called TOPS - Thread Optimized Packet
> Scheduling.
Don't think we can do that unfortunately: we are screwed by the APIC
architecture on x86.
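
For reference, the header-hash queue selection Rick describes for IPS
might look roughly like the sketch below. The thread does not say which
hash HP-UX used, so CRC32 here is only a stand-in, and `pick_cpu` and its
parameters are hypothetical names. The point is just that every packet of
a given flow hashes to the same CPU, so a flow is never split across
queues.

```python
import zlib

def pick_cpu(src_ip, dst_ip, src_port, dst_port, ncpus):
    """Map a flow's addresses and ports to a CPU index (hypothetical helper).

    CRC32 is an assumption made for illustration; any hash that is stable
    for a given set of header fields would serve the same purpose.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % ncpus

# All packets of one connection land on the same CPU's queue.
flow = ("10.0.0.1", "10.0.0.2", 40000, 80)
cpu_first = pick_cpu(*flow, ncpus=4)
cpu_later = pick_cpu(*flow, ncpus=4)
print(cpu_first == cpu_later)  # True
```

This keeps a connection's packets in order, at the cost of not
balancing load when a few flows dominate the traffic.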

cheers,
jamal