Thread (23 messages) 23 messages, 2 authors, 2022-09-08

Re: [PATCH net-next v5 1/8] net: phylink: Document MAC_(A)SYM_PAUSE

From: Sean Anderson <hidden>
Date: 2022-09-08 21:04:11
Also in: lkml


On 9/8/22 10:21 AM, Russell King (Oracle) wrote:
On Wed, Sep 07, 2022 at 06:39:34PM -0400, Sean Anderson wrote:
quoted
On 9/7/22 17:01, Russell King (Oracle) wrote:
quoted
On Wed, Sep 07, 2022 at 04:11:14PM -0400, Sean Anderson wrote:
quoted
On 9/7/22 2:04 PM, Russell King (Oracle) wrote:
Given that going from tx/rx back to pause/asym_dir bits is not trivial
(because the translation depends on the remote advertisement) it is
highly unlikely that the description would frame the support in terms
of whether the hardware can transmit and/or receive pause frames.
I think it is? Usually if both symmetric and asymmetric pause is
possible then there are two PAUSE_TX and PAUSE_RX fields in a register
somewhere. Similarly, if there is only symmetric pause, then there is a
PAUSE_EN bit in a register. And if only one of TX and RX is possible,
then there will only be one field. There are a few drivers where you
program the advertisement and let the hardware do the rest, but even
then there's usually a manual mode (which should be enabled by the
poorly-documented permit_pause_to_mac parameter).
The problem with "if there is only symmetric pause, then there is a
PAUSE_EN bit in a register" is that for a device that only supports
the ability to transmit pause, it would have a bit to enable the
advertisement of the ASM_DIR bit, and possibly also have a PAUSE_EN
bit in a register to enable the transmission of pause frames.

So if you look just at what bits there are to enable, you might
mistake a single pause bit to mean symmetric pause when it doesn't
actually support that mode.
Sure, but usually that is noted in the documentation.
Let's take this a step further. Let's say that a device only has the
capability to receive pause frames. How does that correspond with
the SYM (PAUSE) and ASYM (ASM_DIR) bits? The only state that provides
for receive-only mode is if both of these bits are set, but wait a
moment, for a device that supports independent control of transmit
and receive, it's exactly the same encoding!

Fundamentally, a device can not really be "only capable of receiving
pause frames" because there is no way to set the local advertisement
to indicate to the remote end that the local end can not send pause
frames.
Yup. Only half of the combinations can be expressed.
The next issue is... how do you determine that a MAC that supports
transmission and reception of pause frames has independent or common
control of those two functions? That determines whether ASM_DIR can
be set along with PAUSE.
This is why I suggested down below that we encode exactly that in the
mac caps.
So, trying to work back from whether tx and rx are supported to which
of PAUSE and ASM_DIR should be set is quite a non-starter.
quoted
However, it is not obvious (at least it wasn't to me)

- That MAC_SYM_PAUSE/MAC_ASYM_PAUSE control to the PAUSE and ASYM_DIR
  bits (when MLO_PAUSE_AN is set).
I'm not sure why, because the linkmodes that the MAC deals with in
its validate() callback determine what is supported and what is
advertised, and phylink_caps_to_linkmodes() which is used in the
implementation of this method does:

        if (caps & MAC_SYM_PAUSE)
                __set_bit(ETHTOOL_LINK_MODE_Pause_BIT, linkmodes);

        if (caps & MAC_ASYM_PAUSE)
                __set_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, linkmodes);

Were you not aware that these two ethtool link mode bits control
the advertisement?
Yes. I had to dig into the code to determine what these bits were for.
Since there is no documentation (which what this patch aims to address),
that really is the only option. Additionally, the terminology is
different from what IEEE uses (although IMO it better describes the
function of the bits).
quoted
- How MAC_*_PAUSE related to the resolved pause mode in mac_link_up.
quoted
Note from the table above, it is not possible to advertise that you
do not support transmission of pause frames.
Just don't set either of MAC_*_PAUSE :)

Of course, hardware manufacturers are hopefully aware that only half of
the possible combinations are supported and don't produce hardware with
capabilities that can't be advertised.
Well, having read a few (although limited) number of documents on
ethernet MACs, they tend to frame the support in terms of whether
symmetric pause being supported or just the whole lot. Given that
IEEE 802.3's starting point for pause frames is the advertisement
rather than whether the hardware supports transmission or
reception, I think it would be rather silly to specify it in terms
of the tx/rx support.

If one's reverse engineering, then I think it's reasonable that if
you determine what the capabilities of the hardware is, it's then
up to the reverse engineer to do the next step and consult 802.3
table 28B-3 and work out what the advertisement should be.
quoted
quoted
quoted
quoted
The following table lists the values of tx_pause and rx_pause which
might be requested in mac_link_up depending on the results of> autonegotiation (when MLO_PAUSE_AN is set):>
MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
============= ============== ======== ========
             0              0        0        0
             0              1        0        0>                                     1        0
             1              0        0        0
                                     1        1>             1              1        0        0
                                     0        1
                                     1        1

When MLO_PAUSE_AN is not set, any combination of tx_pause and> rx_pause may be requested. This depends on user configuration,
without regard to the values of MAC_SYM_PAUSE and MAC_ASYM_PAUSE.
The above is how I'm viewing this, and because of the broken formatting,
it's impossible to make sense of, sorry.
Sorry, my mail client mangled it. Second attempt:
quoted
MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
============= ============== ======== ========
            0              0        0        0
            0              1        0        0
                                    1        0
            1              0        0        0
                                    1        1
            1              1        0        0
                                    0        1
                                    1        1
That's fine for the autonegotiation resolution, but you originally stated
that your table was also for user-settings as well - and that's where I
originally took issue and still do.

As I've tried to explain, for a MAC that supports the MAC_SYM_PAUSE=1
MAC_ASYM_PAUSE=1 case, the full set of four states of tx_pause and
rx_pause are possible to configure when autoneg is disabled _even_
when there is no way to properly advertise it.
I assume you wrote this before reading the below.
The point of forcing the pause state is to override autonegotiation,
because maybe the autonegotiation state is wrong and you explicitly
want a particular configuration for the link.
quoted
quoted
So, if a MAC only supports symmetric pause, it can key off either of
these two flags as it is guaranteed that they will be identical in
for a MAC that only supports symmetric pause.
OK, so taking that into account then perhaps the post-table explanation
should be reworded to
quoted
When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is set, any
combination of tx_pause and rx_pause may be requested. This depends on
user configuration, without regard to the value of MAC_SYM_PAUSE. When
When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is also unset, then
tx_pause and rx_pause will still depend on user configuration, but
will always equal each other.
Or maybe the above table should be extended to be
quoted
MLO_PAUSE_AN MAC_SYM_PAUSE MAC_ASYM_PAUSE  tx_pause rx_pause
============ ============= ==============  ======== ========
           0             0              0         0        0
           0             0              1         0        0
                                                  1        0
           0             1              0         0        0
                                                  1        1
           0             1              1         0        0
                                                  0        1
                                                  1        1
           1             0              0         0        0
           1             X              1         X        X
           1             1              0         0        0
                                                  1        1
With a note like
quoted
When MLO_PAUSE_AN is not set, the values of tx_pause and rx_pause
depend on user configuration. When MAC_ASYM_PAUSE is not set, tx_pause
and rx_pause will be restricted to be either both enabled or both
disabled. Otherwise, no restrictions are placed on their values,
allowing configurations which would not be attainable as a result of
autonegotiation.
These options are what I propose to do with the table. I think these address
your concern that user-specified behavior was not documented properly. Upon
review, I think using the first table with the second note would be best.
quoted
IMO we should really switch to something like MAX_RX_PAUSE,
MAC_TX_PAUSE, MAC_RXTX_PAUSE and let phylink handle all the details of
turning that into sane advertisement.
I completely disagree for the technical example I gave above, where it
is impossible to advertise "hey, I support *only* receive pause". Also
it brings with it the issue that - does "MAC_RXTX_PAUSE" mean that the
MAC has independent control of transmit and receive pause frames, or
is it common.

I'm really sorry, but I think there are fundamental issues with trying
to frame the support in terms of "do we support transmission of pause
frames" and "do we support reception of pause frames" and working from
that back to an advertisement. The translation function from
capabilities to tx/rx enablement is a one-way translation - there is
no "good" reverse translation that doesn't involve ambiguity.
Of course. But this reflects what the hardware actually can do.
quoted
This would also let us return
-EINVAL in phylink_ethtool_set_pauseparam when the user requests e.g.
TX-only pause when the MAC only supports RX and RXTX.
As I've said, there is no way to advertise to the link partner that
RX-only is the only pause setting allowed, so it would be pretty
darn stupid for a manufacturer to design hardware with just that
capability..
Well, when the user specifies things we ignore the results of
autonegotiation. So a user could specify tx only on one end of a link
and rx only on the other end and have a working result which couldn't
be the result of autonegotiation. By specifying what the hardware
actually supports, phylink can determine whether what the user requests
is supported, without regard to whether it could be autonegotiated. At
the moment we allow the user to specify configurations which might not
be supported at all. There is no error message when this happens, so a
user can only discover this issue by reading the driver/datasheet or by
sniffing the link traffic.
quoted
quoted
Adding in the issue of rate adaption (sorry, I use "adaption" not
"adaptation" which I find rather irksome as in my - and apparently
a subsection of English speakers, the two have slightly different
meanings)
802.3 calls it "rate adaptation" in clause 49 (10GBASE-R) and "rate
matching" in clause 61 (10PASS-TL and 2BASE-TS). If you are opposed to
the former, then I think the latter could also work. It's also shorter,
which is definitely a plus.

Interestingly, wiktionary (with which I attempted to determine what that
slightly-different meaning was) labels "adaption" as "rare" :)
I'm aware of that, but to me (and others) adaption is something that is
on-going. Adaptation is what animals _have_ done to cope with a changing
environment.

For this feature, I much prefer "rate matching" which avoids this whole
issue of "adaption" vs "adaptation" - you may notice that when we were
originally discussing this, I was using "rate matching" terminology!
OK, I'll rename this in the next spin.
quoted
quoted
brings with it the problem that when using pause frames,
we need RX pause enabled, but on a MAC which only supports symmetric
pause, we can't enable RX pause without also transmitting pause frames.
So I would say such a setup was fundamentally mis-designed, and there's
little we can do to correct such a stupidity. Should we detect such
stupidities? Maybe, but what then? Refuse to function?
Previous discussion [1]. Right now we refuse to turn on rate adaptation
if the MAC only supports symmetric pause. The maximally-robust solution
would be to first try and autonegotiate with rate adaptation enabled and
using symmetric pause, and then renegotiate without either enabled. I
think that's significantly more complex, so I propose deferring such an
implementation to whoever first complains about their link not being
rate-adapted.
We can not get away from the fact that the only capabilities that a
MAC could advertise to say that it supports Rx-only pause mode is
one where it has both the PAUSE and ASM_DIR bits set. If it doesn't,
then, if you look at table 28B-3, there are no possible resolutions
to any other local advertisement state that result in Rx pause only
being enabled.
Well, what we really want to advertise is MLO_PAUSE_TXRX *without*
MLO_PAUSE_NONE. This is of course not possible to advertise, hence
the retry approach I suggested above.
Therefore, a MAC that only supports Rx pause would be incapable
of properly advertising that fact to the remote link partner and
is probably not conformant with 802.3.
Autonegotiation is optional for pause support. I agree that such an
implementation would be unusual.
I'll also point you to table 28B-2 "Pause encoding":

|   PAUSE (A5)   ASM_DIR (A6)                   Capability
|   0            0            No PAUSE
|   0            1            Asymmetric PAUSE toward link partner
|   1            0            Symmetric PAUSE
|   1            1            Both Symmetric PAUSE and Asymmetric PAUSE toward
|                             local device
|
| The PAUSE bit indicates that the device is capable of providing the
| symmetric PAUSE functions as defined# in Annex 31B. The ASM_DIR bit
| indicates that asymmetric PAUSE is supported. The value of the PAUSE
| bit when the ASM_DIR bit is set indicates the direction the PAUSE
| frames are supported for flow across the link. Asymmetric PAUSE
| configuration results in independent enabling of the PAUSE receive
| and PAUSE transmit functions as defined by Annex 31B. See 28B.3
| regarding PAUSE configuration resolution.

So here, the capabilities of the local device are couched in terms of
support for "symmetric pause" and "asymmetric pause" and not whether
they support transmission of pause frames and reception of pause frames.

I put it that the use of "is symmetric pause supported" and "is
asymmetric pause supported" by phylink is the right set of capabilities
that the MAC should be supplying, and not whether transmission and or
reception of pause frames is supported.
Well the funky bit is that one can say "I support *only* asymmetric pause"
which is pretty strange. By the above logic, devices supporting asymmetric
pause should be a strict subset of those supporting symmetric pause. And
yet it is not the case. IEEE has decided that this means tx-only devices.
We have some devices like this in Linux already (ksz8795, macb). IMO this
hijacking of meaning is precisely what needs to be documented, and also
makes the symmetric/asymmetric pause distinction less useful.
As I've pointed out, one can not go from tx and rx pause support to an
advertisement without ambiguity. That is why we can't advertise a
correct setting of PAUSE and ASM_DIR bits when using ethtool to
force a particular state of enables at the local end. To move to
using "is transmit pause supported" and "is receive pause supported"
will only _add_ ambiguity, and then we really do need documentation
to describe the behaviour we implement - because we then fall outside
of 802.3.
It removes ambiguity from the driver author's perspective. The ambiguity
then shifts to phylink_caps_to_linkmodes, which can handle the translation.

In any case, since you prefer the underspecified representation then go
ahead and keep using it.

--Sean
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help