Re: [PATCH] serial: qcom_geni_serial: Fix recent kdb hang
From: Akash Asthana <hidden>
Date: 2020-08-10 12:32:33
Also in:
linux-arm-msm, lkml
Hi Doug, On 8/7/2020 10:49 AM, Douglas Anderson wrote:
The commit e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work
even if UART isn't console") worked pretty well and I've been doing a
lot of debugging with it. However, recently I typed "dmesg" in kdb
and then held the space key down to scroll through the pagination. My
device hung. This was repeatable and I found that it was introduced
with the aforementioned commit.
It turns out that there are some strange boundary cases in geni where
in some weird situations it will signal RX_LAST but then will put 0 in
RX_LAST_BYTE. This means that the entire last FIFO entry is valid.IMO that means we received a word in RX_FIFO and it is the last word hence RX_LAST bit is set. RX_LAST_BYTE is 0 means none of the bytes are valid in the last word. In such scenario we should just read RX_FIFO buffer (to empty it), discard the word and return NO_POLL_CHAR. Something like below. --------------------------------------------------------------------------------------------------------------------------------------------------------- else private_data->poll_cached_bytes_cnt = 4; private_data->poll_cached_bytes = readl(uport->membase + SE_GENI_RX_FIFOn); } + if (!private_data->poll_cached_bytes_cnt) + return NO_POLL_CHAR; private_data->poll_cached_bytes_cnt--; ret = private_data->poll_cached_bytes & 0xff; ------------------------------------------------------------------------------------------------------------------------------------------------------------- Please let me know whether above code helps. I am not sure about what all scenario can leads to this behavior from hardware, I will try to get an answer from hardware team. Any error bit was set for SE_GENI_S_IRQ_STATUS & SE_GENI_M_IRQ_STATUS registers? I guess the hang was seen because *poll_cached_bytes_cnt* is unsigned int and it's value was 0, when it's decremented by 1 it's value become '4294967295' (very large) and dummy RX (0x00) would happen that many times before reading any actual RX transfers/bytes. Regards, Akash
quoted hunk ↗ jump to hunk
This weird corner case is handled in qcom_geni_serial_handle_rx() where you can see that we only honor RX_LAST_BYTE if RX_LAST is set _and_ RX_LAST_BYTE is non-zero. If either of these is not true we use BYTES_PER_FIFO_WORD (4) for the size of the last FIFO word. Let's fix kgdb. While at it, also use the proper #define for 4. Fixes: e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") Signed-off-by: Douglas Anderson <dianders@chromium.org> --- drivers/tty/serial/qcom_geni_serial.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 07b7b6b05b8b..e27077656939 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c@@ -361,11 +361,16 @@ static int qcom_geni_serial_get_char(struct uart_port *uport) return NO_POLL_CHAR; if (word_cnt == 1 && (status & RX_LAST)) + /* + * NOTE: If RX_LAST_BYTE_VALID is 0 it needs to be + * treated as if it was BYTES_PER_FIFO_WORD. + */ private_data->poll_cached_bytes_cnt = (status & RX_LAST_BYTE_VALID_MSK) >> RX_LAST_BYTE_VALID_SHFT; - else - private_data->poll_cached_bytes_cnt = 4; + + if (private_data->poll_cached_bytes_cnt == 0) + private_data->poll_cached_bytes_cnt = BYTES_PER_FIFO_WORD; private_data->poll_cached_bytes = readl(uport->membase + SE_GENI_RX_FIFOn);
-- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,\na Linux Foundation Collaborative Project