Thread (26 messages) 26 messages, 5 authors, 2020-05-15

Re: [PATCH v2] serial: 8250_dw: Fix common clocks usage race condition

From: Sergey Semin <hidden>
Date: 2020-03-25 17:11:25
Also in: linux-clk, linux-serial, lkml

On Tue, Mar 24, 2020 at 12:12:43PM +0200, Andy Shevchenko wrote:
On Mon, Mar 23, 2020 at 04:50:17PM +0300, Sergey Semin wrote:
quoted
On Mon, Mar 23, 2020 at 11:01:09AM +0100, Maxime Ripard wrote:
quoted
On Mon, Mar 23, 2020 at 05:46:09AM +0300, Sergey.Semin@baikalelectronics.ru wrote:
quoted
From: Serge Semin <redacted>

There are races possible in the dw8250_set_termios() callback method
and while the device is in PM suspend state. A race condition may
happen if the baudrate clock source device is shared with some other
device (in our machine it's another DW UART port). In this case if that
device changes the clock rate while serial console is using it the
DW 8250 UART port might not only end up with an invalid uartclk value
saved, but may also experience a distorted output data since baud-clock
could have been changed. In order to fix this lets enable an exclusive
reference clock rate access in case if "baudclk" device is specified.

So if some other device also acquires the rate exclusivity during the
time of a DW UART 8250 port being opened, then DW UART 8250 driver
won't be able to alter the baud-clock. It shall just use the available
clock rate. Similarly another device also won't manage to change the
rate at that time. If nothing else have the exclusive rate access
acquired except DW UART 8250 driver, then the driver will be able to
alter the rate as much as it needs to in accordance with the currently
implemented logic.
quoted
quoted
clk_rate_exclusive_get is pretty intrusive, and due to the usual
topology of clock trees, this will lock down 3-4 parent clocks to
their current rate as well. In the Allwinner SoCs case for example,
this will lock down the same PLL than the one used by the CPU,
preventing cpufreq from running.
Speaking about weak design of a SoC' clock tree. Our problems are nothing
with respect to the Allwinner SoC, in which case of changing the
CPU-frequency may cause the UART glitches subsequently causing data
transfer artefacts.) Moreover as I can see the same issue may raise for
I2C, QSPI, PWM devices there.

Anyway your concern does make sense.
quoted
However, the 8250 has a pretty wide range of dividers and can adapt to
any reasonable parent clock rate, so we don't really need to lock the
rate either, we can simply react to a parent clock rate change using
the clock notifiers, just like the SiFive UART is doing.

I tried to do that, but given that I don't really have an extensive
knowledge of the 8250, I couldn't find a way to stop the TX of chars
while we change the clock rate. I'm not sure if this is a big deal or
not, the SiFive UART doesn't seem to care.
Yes, your solution is also possible, but even in case of stopping Tx/Rx it
doesn't lack drawbacks. First of all AFAIK there is no easy way to just
pause the transfers. We'd have to first wait for the current transfers
to be completed, then somehow lock the port usage (both Tx and Rx
traffic), permit the reference clock rate change, accordingly adjust the
UART clock divider, and finally unlock the port. While if we don't mind
to occasionally have UART data glitches, we can just adjust the UART ref
divider synchronously with ref clock rate change as you and SiFive UART
driver suggest.

So we are now at a zugzwang - a fork to three not that good solutions:
1) lock the whole clock branch and provide a glitchless interfaces. But
by doing so we may (in case of Allwinner SoCs we will) lockup some very
important functionality like CPU-frequency change while the UART port is
started up. In this case we won't have the data glitches.
2) just adjust the UART clock divider in case of reference clock rate
change (use the SiFive UART driver approach). In this case we may have the
data corruption.
3) somehow implement the algo: wait for the transfers to be completed,
lock UART interface (it's possible for Tx, but for Rx in case of no handshake
enabled it's simply impossible), permit the ref clock rate change,
adjust the UART divider, then unlock the UART interface. In this case the data
glitches still may happen (if no modem control is available or
handshakes are disabled).

As for the cases of Baikal-T1 UARTs the first solutions is the most suitable.
We don't lock anything valuable, since a base PLL output isn't directly
connected to any device and it's rate once setup isn't changed during the
system running. On the other hand I don't mind to implement the second
solution, even though it's prone to data glitches. Regarding the solution
3) I won't even try. It's too complicated, I don't have time and
test-infrastructure for this.

So Andy what do you think?
From Intel HW perspective the first two are okay, but since Maxime is against
first, you have the only option from your list. Perhaps somebody may give
option 4) here...
Ok then. I'll implement the option 2) in v3 if noone gives any alternatives
before that.

Regards,
-Sergey
-- 
With Best Regards,
Andy Shevchenko
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help