Thread: 52 messages, 6 authors, 2019-02-14

Re: net: phylink: dsa: mv88e6xxx: flaky link detection on switch ports with internal PHYs

From: John David Anglin <hidden>
Date: 2019-01-30 19:01:54

On 2019-01-30 12:28 p.m., Andrew Lunn wrote:
On Wed, Jan 30, 2019 at 12:08:39PM -0500, John David Anglin wrote:
On 2019-01-22 7:22 p.m., Andrew Lunn wrote:
From my Espressobin
cat /proc/interrupts
...
 44:          0          0  mv88e6xxx-g1   3 Edge      mv88e6xxx-g1-atu-prob
 46:          0          0  mv88e6xxx-g1   5 Edge      mv88e6xxx-g1-vtu-prob
 48:         38         24  mv88e6xxx-g1   7 Edge      mv88e6xxx-g2
 51:          0          1  mv88e6xxx-g2   1 Edge      !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:11
 52:          0          0  mv88e6xxx-g2   2 Edge      !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:12
 53:         38         23  mv88e6xxx-g2   3 Edge      !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:13

These are PHY interrupts.
If we come back to my trying to use the INTn pin on the Espressobin, I
have found that clearing and resetting the interrupt
enable bits in the global control register (offset 0x4) restarts link
detection when the device is stuck.  This suggests that the
INTn connection to MPP2_23 is low when the GIC interrupt is enabled
on this pin.  Possibly, this is caused by the fact
that EEIntEn is set to 1 on reset.  INTn then goes low when EEPROM
loading is done.  Another possibility might be race conditions
in processing interrupts.

Thoughts?
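
As a rough illustration of the workaround described above (clearing and
then restoring the interrupt enable bits in the global control register
at offset 0x4), here is a minimal userspace sketch in C. The fake_switch
structure, the reg_read()/reg_write() helpers and the G1_CTL_INT_MASK
value are assumptions made for illustration only; they are not the
driver's API, and the real bit layout should be checked against the
datasheet. Presumably, dropping the enables lets INTn deassert, so that
restoring them produces a fresh falling edge.

```c
#include <stdint.h>

#define G1_CTL_OFFSET   0x04    /* global control register offset quoted in the thread */
#define G1_CTL_INT_MASK 0x00ff  /* ASSUMPTION: interrupt enables live in the low bits */

/* Hypothetical stand-in for the switch's global register bank. */
struct fake_switch {
	uint16_t regs[32];
	unsigned int writes;	/* number of register writes performed */
};

/* Hypothetical accessors; a real driver would go through its MDIO helpers. */
static uint16_t reg_read(struct fake_switch *sw, unsigned int off)
{
	return sw->regs[off];
}

static void reg_write(struct fake_switch *sw, unsigned int off, uint16_t val)
{
	sw->regs[off] = val;
	sw->writes++;
}

/*
 * Clear every interrupt enable bit, then restore the original value.
 * Returns the enable bits that were set before the toggle.
 */
static uint16_t kick_interrupt_enables(struct fake_switch *sw)
{
	uint16_t orig = reg_read(sw, G1_CTL_OFFSET);

	reg_write(sw, G1_CTL_OFFSET, (uint16_t)(orig & ~G1_CTL_INT_MASK));
	reg_write(sw, G1_CTL_OFFSET, orig);

	return (uint16_t)(orig & G1_CTL_INT_MASK);
}
```

The non-interrupt bits are written back unchanged in both steps, so the
toggle only affects the enables.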
Hi David

You need active low interrupts. Without it, I think you are always
going to have race conditions which will cause interrupts to get
stuck/lost.

I would suggest you remove the interrupt from your device tree and use
the mv88e6xxx polling method. If I remember correctly, it currently
polls 10 times per second, so PHY link up/down is going to be 5 times faster
on average than when phylib is polling the PHY.
Hi Andrew,

The main motivation in doing this is to try to enable the AVB interrupt
and to improve the PTP support.
I agree that polling is perfectly fine for PHY link interrupts. 
Possibly, ATU and VTU might need faster
support but I'm not using that at the moment.

I have hacked on the time stamp code quite a bit to try and improve
things but there are still issues with
lost or overwritten time stamps:

Jan 28 11:15:05 localhost kernel: [234850.840883] mv88e6085 d0032004.mdio-mii:01: timestamp discarded
Jan 28 11:15:05 localhost ptp4l: [234852.998] port 3: received SYNC without timestamp

I think when PTP packets other than PDELAY are too closely spaced, we
have problems accessing
the timestamp quickly enough.  Also, timestamp access is dependent on CPU
speed and HZ.

It looks like I can easily connect MPP2_23 to MPP1_16 on the edge
connector P8.  I believe the northbridge
pins support level interrupts.

In /proc/interrupts, the switch interrupts are shown as edge.  The only
place that I see where this
is potentially set is mv88e6xxx_g2_watchdog_setup() where the call to
request_threaded_irq() passes
"IRQF_ONESHOT | IRQF_TRIGGER_FALLING".  Does this need to change?
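
To make the edge versus level question concrete, here is a small
userspace model in C of two interrupt sources sharing one active-low
line. It is only a sketch under stated assumptions: run_scenario() and
its pending[] flags are hypothetical and do not correspond to anything
in the driver or the GIC. The model has the second source assert after
the handler has taken its status snapshot but before the first source
is acknowledged, so the line never returns high.

```c
#include <stdbool.h>

/*
 * Hypothetical model of a shared active-low interrupt line.
 * Returns the number of sources still pending (unserviced) at the end.
 */
static int run_scenario(bool level_triggered)
{
	bool pending[2] = { false, false };
	bool prev_high = true;	/* line level seen at the previous sample */

	pending[0] = true;	/* source 0 asserts: line goes low */

	for (int step = 0; step < 4; step++) {
		bool high = !(pending[0] || pending[1]);
		/* Level: fire while the line is low. Edge: fire on high->low only. */
		bool fire = level_triggered ? !high : (prev_high && !high);

		prev_high = high;
		if (!fire)
			break;

		/* Handler takes a snapshot of the pending sources. */
		bool snap0 = pending[0];
		bool snap1 = pending[1];

		/* Race: source 1 asserts after the snapshot on the first pass. */
		if (step == 0)
			pending[1] = true;

		/* Handler acknowledges only what it saw in the snapshot. */
		if (snap0)
			pending[0] = false;
		if (snap1)
			pending[1] = false;
	}

	return (int)pending[0] + (int)pending[1];
}
```

In this model run_scenario(false), the falling-edge case, returns 1:
source 1 stays pending because the line never produces another edge.
run_scenario(true), the level-low case, returns 0 because the handler
runs again for as long as the line is held low.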

Dave

-- 
John David Anglin  dave.anglin@bell.net
