Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest
From: Koehrer Mathias (ETAS/ESW5) <hidden>
Date: 2016-09-23 11:40:51
Hi Sebastian,
thanks for the feedback.

> > I run the cyclictest with the following options:
> > # cyclictest -a -i 100 -d 10 -m -n -t -p 80
> there is -S. And then 100 might be a little tight.
> > Of course the 2 minutes run-time of cyclictest is only a rough first estimate.
> and with no load…
> > Once I configure one of the i350 ports
> > # ifconfig eth2 up 192.168.100.100
> > the cyclictest shows directly and reproducibly significant larger max latency values (40 microseconds, using the same conditions).
> > I did the very same test with kernel version 3.18.27-rt27. With that version I did not see anything like that. Also, only the igb driver seems to cause the trouble. I have also an e1000e based NIC in this PC and the usage of this driver does not add any significant latency.
> > Any idea on this?
> Does this also happen if you have the NIC up and you plug in / out the cable?
> There are two things that come to mind:
> https://lkml.kernel.org/r/1445465268-10347-1-git-send-email-jonathan.david@ni.com
> https://lkml.kernel.org/r/1445886895-3692-1-git-send-email-joshc@ni.com

This happens even if I have done "ifconfig up" on the NIC without a cable plugged in. It also happens if a cable is plugged in and the link is up but no traffic is running via this NIC port. It looks as if the configured NIC port alone is causing the additional latency, no matter whether traffic is flowing via this NIC and no matter whether the link is up.

I did the same test with the kernel/rt_preempt patch versions 4.1.33-rt37 and 4.4.19-rt27; they show the very same behavior. In contrast, version 3.18.27-rt27 works stably. As mentioned before, the "igb" driver is causing the issue. The "e1000e" driver works fine.
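
The before/after comparison described above can be scripted roughly as follows. This is only a sketch: the helper name max_latency is made up for illustration, and it assumes cyclictest's usual per-thread summary lines containing "Max: <value>". The options -q (print only the summary at exit) and -D 2m (stop after two minutes) are added to the original command line so that the output can be piped.

```shell
#!/bin/sh
# max_latency: hypothetical helper (not part of rt-tests). Reads cyclictest
# summary lines on stdin and prints the largest "Max:" value, in
# microseconds, over all measurement threads.
max_latency() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "Max:" && $(i + 1) + 0 > max)
                max = $(i + 1) + 0
    }
    END { print max + 0 }'
}

# Example use (needs root and rt-tests; eth2 and the address are the ones
# from the test above):
#   cyclictest -a -i 100 -d 10 -m -n -t -p 80 -q -D 2m | max_latency
#   ifconfig eth2 up 192.168.100.100
#   cyclictest -a -i 100 -d 10 -m -n -t -p 80 -q -D 2m | max_latency
```

Comparing the two printed values gives the worst-case latency without and with the configured igb port under otherwise identical conditions.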
I did some further analysis. The code that is causing the long latencies seems to be the function "igb_watchdog_task" in igb_main.c (line 4386). This function is called periodically. When I return at the beginning of this function, the additional latency is not seen. In particular, that function calls "igb_has_link", which seems to be one candidate for causing the additional latency.

Do you have any clue how this code can be executed properly without causing the additional latencies?

Regards
Mathias
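
PS: One way the time spent in igb_watchdog_task and its callees (such as igb_has_link) could be measured is the function_graph tracer. The following is a sketch under assumptions, not a verified recipe for this system: it assumes tracefs is reachable under /sys/kernel/debug/tracing, that the kernel is built with CONFIG_FUNCTION_GRAPH_TRACER, and that igb_watchdog_task appears in available_filter_functions. The helper name trace_watchdog is invented for illustration.

```shell
#!/bin/sh
# trace_watchdog: hypothetical helper. Configures the function_graph tracer
# so that only igb_watchdog_task and the functions it calls are recorded.
# $1 is the tracefs directory (assumed default: /sys/kernel/debug/tracing).
trace_watchdog() {
    t=${1:-/sys/kernel/debug/tracing}
    echo 0 > "$t/tracing_on"                          # stop tracing while reconfiguring
    echo igb_watchdog_task > "$t/set_graph_function"  # limit the call graph to this function
    echo function_graph > "$t/current_tracer"         # tracer that records per-call durations
    echo 1 > "$t/tracing_on"                          # start recording
}

# Example use (as root), while cyclictest is running:
#   trace_watchdog
#   sleep 10
#   cat /sys/kernel/debug/tracing/trace
```

The durations printed next to each call in the trace should indicate which callee accounts for most of the time. Tracing adds overhead of its own, so the absolute numbers are only a rough guide.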