Thread: 34 messages, 10 authors, 2016-10-18

Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

From: Koehrer Mathias (ETAS/ESW5) <hidden>
Date: 2016-09-23 11:40:51

Hi Sebastian,
thanks for the feedback.
> > I run the cyclictest with the following options:
> > # cyclictest -a -i 100 -d 10 -m -n -t -p 80
> there is -S. And then 100 might be a little tight.
> > Of course the 2 minutes run-time of cyclictest is only a rough first estimate.
> and with no load…
> > Once I configure one of the i350 ports # ifconfig eth2 up
> > 192.168.100.100 the cyclictest shows directly and reproducibly
> > significant larger max latency values (40 microseconds, using the same
> > conditions).
> >
> > I did the very same test with kernel version 3.18.27-rt27.
> > With that version I did not see anything like that.
> >
> > Also, only the igb driver seems to cause the trouble. I have also an
> > e1000e based NIC in this PC and the usage of this driver does not add any
> > significant latency.
> >
> > Any idea on this?
> Does this also happen if you have the NIC up and you plug in / out the
> cable? There are two things that come to mind:
>   https://lkml.kernel.org/r/1445465268-10347-1-git-send-email-jonathan.david@ni.com
>   https://lkml.kernel.org/r/1445886895-3692-1-git-send-email-joshc@ni.com

This happens even if I have done "ifconfig up" on the NIC without having a cable
plugged in.
Also, it happens if I have a cable plugged in and the link is up but no traffic is running
via this NIC port.
It looks as if merely configuring the NIC port causes the additional latency, no
matter whether traffic is flowing via this NIC and no matter whether the link is up.
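
For anyone who wants to repeat the comparison, it could be scripted roughly as below. This is only a sketch under assumptions: the interface name and address are the ones from this thread, the cyclictest options are the ones quoted above plus -q and -D (quiet summary and fixed duration, both added here), the function names are made up for illustration, and the parsing assumes the usual cyclictest summary lines with a "Max:" field.

```shell
#!/bin/sh
# Sketch of the A/B comparison: cyclictest with the port down vs. configured.
# IFACE, ADDR and DURATION are illustrative defaults, override as needed.
IFACE=${IFACE:-eth2}
ADDR=${ADDR:-192.168.100.100}
DURATION=${DURATION:-2m}

# Run cyclictest with the options from this thread; -q and -D are additions
# so that only the final summary is printed after a fixed run time.
run_cyclictest() {
    cyclictest -a -i 100 -d 10 -m -n -t -p 80 -q -D "$DURATION"
}

# Read cyclictest summary lines on stdin and print the largest "Max:" value.
max_latency() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "Max:" && $(i+1) + 0 > m) m = $(i+1) + 0 }
         END { print m + 0 }'
}

# Measure once with the port down and once with it configured (needs root).
compare() {
    ifconfig "$IFACE" down
    base=$(run_cyclictest | max_latency)
    ifconfig "$IFACE" up "$ADDR"
    up=$(run_cyclictest | max_latency)
    echo "max latency: $IFACE down = ${base} us, $IFACE up = ${up} us"
}

# Only touch the system when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    compare
fi
```

Invoked as root with the argument "run", this prints one line with both maxima. A two-minute run remains a rough estimate, so longer durations are probably worthwhile.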

I ran the same test with the kernel / rt_preempt patch versions
4.1.33-rt37 and 4.4.19-rt27; they show the very same behavior.
In contrast, version 3.18.27-rt27 is stable and does not show the added latency!

As mentioned before, the "igb" driver is causing the issue. The "e1000e" driver works
fine.
I did some further analysis.
The code that causes the long latencies seems to be the
function "igb_watchdog_task" in igb_main.c (line 4386).
This function is called periodically.
When I return at the very beginning of this function, the additional latency is no longer seen.
In particular, that function calls "igb_has_link", which seems to be one candidate for
the source of the additional latency.
Do you have any clue how this code can be executed properly without causing the
additional latencies?
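
One way to check where the time goes inside the watchdog would be the function_graph tracer, sketched below. This is an untested assumption rather than something measured for this report: it presumes a kernel built with the function graph tracer, tracefs reachable under /sys/kernel/debug/tracing, and that igb_watchdog_task is traceable (static functions can be inlined, which may also hide igb_has_link). The helper name trace_watchdog and the TRACEFS variable are made up for illustration. Tracing adds overhead of its own, so the durations are indicative only.

```shell
#!/bin/sh
# Sketch: record call durations of one kernel function with function_graph.
# TRACEFS is an illustrative variable; adjust it if tracefs is mounted elsewhere.
TRACEFS=${TRACEFS:-/sys/kernel/debug/tracing}

# trace_watchdog [function] [seconds]: trace the function and its callees
# for the given time, print the trace, then switch tracing back off.
trace_watchdog() {
    func=${1:-igb_watchdog_task}
    secs=${2:-10}

    # Refuse to continue if the kernel does not list the function as traceable.
    if ! grep -qw "$func" "$TRACEFS/available_filter_functions"; then
        echo "$func is not traceable on this kernel" >&2
        return 1
    fi

    echo 0 > "$TRACEFS/tracing_on"
    echo "$func" > "$TRACEFS/set_graph_function"    # limit the graph to this function
    echo function_graph > "$TRACEFS/current_tracer"
    echo 1 > "$TRACEFS/tracing_on"
    sleep "$secs"                                   # let the periodic task fire
    echo 0 > "$TRACEFS/tracing_on"

    cat "$TRACEFS/trace"                            # per-call durations, callees nested
    echo nop > "$TRACEFS/current_tracer"
}
```

Calling "trace_watchdog igb_watchdog_task 10" as root while cyclictest is running should show each invocation with the time spent in its callees, which would indicate whether the link check really dominates.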

Regards

Mathias