
Re: [PATCH FIX For-3.19 v5 00/10] Fix ipoib regressions

From: Doug Ledford <hidden>
Date: 2015-01-26 19:30:21


On Mon, 2015-01-26 at 15:24 +0200, Erez Shitrit wrote:
> On 1/26/2015 2:51 PM, Doug Ledford wrote:
> > On Mon, 2015-01-26 at 12:27 +0200, Erez Shitrit wrote:
> > > New (and full) dmesg attached, (after modprobe ib_ipoib, with all debug
> > > flags set) it is all there.
> > Thank you, I know what's going on here now.  Will correct shortly.
> welcome -:)

I munged my opensm configuration so that I could forcibly replicate the
situation here (I intentionally took several well known multicast groups
and forbade their creation).

I was able to first replicate Erez's problem.

Then I installed a new ib_ipoib module with my proposed fix for Erez's
problem and it worked exactly as expected.  It was a mistake in one of
my earlier patches (the third in the series).  When I added a delayed
queue of the task thread, I didn't have a separate work struct and
instead tried to queue the same work struct twice.  I reworked it so
that the work struct is only ever queued once and if the multicast task
gets to the end of its run and there are delayed entries waiting still,
it will queue itself to run again when the shortest delay has expired.
I'll send that through.
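
To make the reworked scheme concrete, here is a minimal userspace sketch of it. This is not the patch itself: struct mcast_entry, struct task, queue_task(), mcast_pass(), run_task() and the fails_left counter are all hypothetical names invented for illustration, and time is modeled as plain integer ticks rather than jiffies. The point it shows is that a single work item refuses a second queueing while pending, and that the task requeues itself for the shortest remaining delay when entries are still waiting at the end of a pass.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one multicast group; not the ipoib structure. */
struct mcast_entry {
    const char *mgid;
    bool joined;
    long retry_at;    /* tick before which no join attempt is made */
    int fails_left;   /* demo only: attempts that fail before one succeeds */
};

/* The single work item: it is either pending or not, never queued twice. */
struct task {
    bool pending;
    long run_at;
};

/* Queue the work item; a second attempt while it is pending is a no-op. */
static bool queue_task(struct task *t, long now, long delay)
{
    if (t->pending)
        return false;
    t->pending = true;
    t->run_at = now + delay;
    return true;
}

/*
 * One pass over the groups. Returns the shortest remaining delay among
 * entries that are still waiting, or -1 if nothing is waiting.
 */
static long mcast_pass(struct mcast_entry *e, int n, long now, long backoff)
{
    long shortest = -1;

    for (int i = 0; i < n; i++) {
        if (e[i].joined)
            continue;
        if (e[i].retry_at <= now) {
            if (e[i].fails_left == 0) {
                e[i].joined = true;         /* join succeeded */
                continue;
            }
            e[i].fails_left--;              /* join failed: back off */
            e[i].retry_at = now + backoff;
        }
        long wait = e[i].retry_at - now;
        if (shortest < 0 || wait < shortest)
            shortest = wait;
    }
    return shortest;
}

/* Run the task, then requeue it once if delayed entries remain. */
static void run_task(struct task *t, struct mcast_entry *e, int n,
                     long now, long backoff)
{
    t->pending = false;
    long delay = mcast_pass(e, n, now, backoff);
    if (delay >= 0)
        queue_task(t, now, delay);
}
```

With two entries, one failing once at tick 0 with a backoff of 10 and one not eligible until tick 5, a run at tick 0 leaves the task pending for tick 5 rather than tick 10, since it always picks the shortest delay.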

Here's the log of the attempt:

[root@rdma-master linus (firewall/for-rc)]$ dmesg | tail -10
[337072.429488] mlx4_ib0: successfully joined all multicast groups
[337073.856932] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting sendonly join
[337073.869686] mlx4_ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[337073.882754] mlx4_ib0: successfully joined all multicast groups
[337088.480082] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
[337088.492789] mlx4_ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
[337088.505819] mlx4_ib0: successfully joined all multicast groups
[337089.897041] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting sendonly join
[337089.909870] mlx4_ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[337089.922893] mlx4_ib0: successfully joined all multicast groups
[root@rdma-master linus (firewall/for-rc)]$ ping6 -I mlx4_ib0 fe80::211:7500:77:d3cc
PING fe80::211:7500:77:d3cc(fe80::211:7500:77:d3cc) from fe80::f652:1403:7b:cba1 mlx4_ib0: 56 data bytes
64 bytes from fe80::211:7500:77:d3cc: icmp_seq=1 ttl=64 time=77.6 ms
64 bytes from fe80::211:7500:77:d3cc: icmp_seq=2 ttl=64 time=0.159 ms
64 bytes from fe80::211:7500:77:d3cc: icmp_seq=3 ttl=64 time=0.125 ms
64 bytes from fe80::211:7500:77:d3cc: icmp_seq=4 ttl=64 time=0.128 ms
^C
--- fe80::211:7500:77:d3cc ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3001ms
rtt min/avg/max/mdev = 0.125/19.503/77.600/33.542 ms
[root@rdma-master linus (firewall/for-rc)]$ dmesg | tail -10
[337120.632427] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
[337120.645166] mlx4_ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
[337120.658292] mlx4_ib0: successfully joined all multicast groups
[337121.977733] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting sendonly join
[337121.990478] mlx4_ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[337122.003589] mlx4_ib0: successfully joined all multicast groups
[337130.410559] mlx4_ib0: setting up send only multicast group for ff12:601b:ffff:0000:0000:0001:ff77:d3cc
[337130.423203] mlx4_ib0: no multicast record for ff12:601b:ffff:0000:0000:0001:ff77:d3cc, starting sendonly join
[337130.436327] mlx4_ib0: MGID ff12:601b:ffff:0000:0000:0001:ff77:d3cc AV ffff882027235f00, LID 0xc01e, SL 0
[337130.448970] mlx4_ib0: successfully joined all multicast groups
[root@rdma-master linus (firewall/for-rc)]$ 
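
For anyone reading the log, the "status -22" on the failed sendonly joins is a negated errno value, and 22 is EINVAL on Linux. A small sketch for decoding such values in userspace follows; status_text() is a hypothetical helper name, not anything in the driver.

```c
#include <errno.h>
#include <string.h>

/* Map a kernel-style negative status (e.g. -22) to its errno description. */
static const char *status_text(int status)
{
    return strerror(status < 0 ? -status : status);
}
```

Calling status_text(-22) yields the same string as strerror(EINVAL).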


-- 
Doug Ledford [off-list ref]
              GPG KeyID: 0E572FDD
