Thread: 10 messages, 3 authors, 2020-08-27

Re: [PATCH 2/3] fpga manager: xilinx-spi: provide better diagnostics on programming failure

From: Luca Ceresoli <luca@lucaceresoli.net>
Date: 2020-08-27 14:34:15
Also in: linux-fpga, linux-gpio, lkml

Hi Tom,

On 19/08/20 18:32, Luca Ceresoli wrote:
On 18/08/20 16:21, Tom Rix wrote:
On 8/18/20 3:20 AM, Luca Ceresoli wrote:
[a question for GPIO maintainers below]

Hi Tom,

thanks for your review!

On 17/08/20 20:15, Tom Rix wrote:
The other two patches are fine.

On 8/17/20 9:59 AM, Luca Ceresoli wrote:
When the DONE pin does not go high after programming to confirm programming
success, the INIT_B pin provides some info on the reason. Use it if
available to provide a more explanatory error message.

Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
---
 drivers/fpga/xilinx-spi.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c
index 502fae0d1d85..2aa942bb1114 100644
--- a/drivers/fpga/xilinx-spi.c
+++ b/drivers/fpga/xilinx-spi.c
@@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr,
 			return xilinx_spi_apply_cclk_cycles(conf);
 	}
 
-	dev_err(&mgr->dev, "Timeout after config data transfer.\n");
+	if (conf->init_b) {
+		int init_b_asserted = gpiod_get_value(conf->init_b);
gpiod_get_value() can fail, so you may need to split the first statement.

init_b_asserted < 0 ? "invalid device"

As the if-else statement is getting complicated, embedding the ? : makes this hard to read. An 'if, else if, else' chain would be better.
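
For illustration only, the 'if, else if, else' shape suggested here could look roughly like the sketch below. This is plain userspace C, not the patch itself: the enum and the function name are hypothetical, and the integer argument merely stands in for the return value of gpiod_get_value(), which is documented to return the logical value or a negative errno on failure. The actual error messages are left out because they are not shown in this part of the thread.

```c
/* Userspace sketch, not driver code: all names below are hypothetical. */

/* Possible outcomes of reading INIT_B after DONE failed to go high. */
enum init_b_state {
	INIT_B_READ_FAILED,	/* the GPIO read itself returned an error */
	INIT_B_ASSERTED,
	INIT_B_DEASSERTED,
};

/*
 * 'value' stands in for the return value of gpiod_get_value():
 * a negative errno on failure, otherwise the logical 0/1 value.
 */
static enum init_b_state classify_init_b(int value)
{
	if (value < 0)
		return INIT_B_READ_FAILED;
	else if (value)
		return INIT_B_ASSERTED;
	else
		return INIT_B_DEASSERTED;
}
```

In a driver, each of the three branches would presumably emit its own dev_err() message, so a failed read is never reported as if INIT_B had been asserted.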
Thanks for the heads up. However I'm not sure which is the best thing to
do here.

First, I've been reading the gpiolib code after your email and yes, it
_could_ return runtime errors received from the gpiochip driver, even
though the docs state:
The get/set calls do not return errors because “invalid GPIO” should have been reported earlier from gpiod_direction_*().
(https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html)

On the other hand there are plenty of calls to gpiod_get/set_value in
the kernel that don't check for error values. I guess this is because
failures getting/setting a GPIO are very uncommon (perhaps impossible
with platform GPIO).

Even if a GPIO get/set operation can still fail, I'm not sure adding
thousands of lines of error-checking code to hundreds of drivers is the
best way to go. I feel we should have a single, noisy dev_err() in the
error path in gpiolib, but I was surprised not to find any [1].

Linus, Bartosz, what's your opinion? Should all drivers check for errors
after every gpiod_[sg]et_value*() call?
My opinion is that you know the driver / hw is in a bad state and you
are trying to convey useful information.  So you should be as careful
as possible and not assume gpio did not fail.
This patch aims at providing better diagnostics after programming has
already gone bad. Neglecting an error might lead to a misleading error
message, but it does not cause programming to fail -- it has failed already.

On the other hand a gpiod_get/set_value() call might fail earlier, along
the normal execution path, and lead to real failures without an error
message emitted after the gpiod call that failed.

Which doesn't mean I'm against your proposal of adding error checking
code. Rather, if we want error checking, we want it mainly in other
places: at the very least at the first usage of each of the GPIOs, maybe
at each usage. Have a look at the beginning of
xilinx_spi_write_complete() [0] for example: if gpiod_get_value() fails
there the driver would think programming has been successfully completed
(DONE asserted). To me this is worse than just printing the wrong error
message.

[0]
https://elixir.bootlin.com/linux/v5.8.2/source/drivers/fpga/xilinx-spi.c#L114
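
To make the failure mode above concrete, here is a small userspace sketch. It assumes only the documented contract of gpiod_get_value() (logical value, or a negative errno on failure); both helper names are hypothetical and do not appear in xilinx-spi.c. The point is that in C a negative errno is non-zero, so a bare truth test takes it for "DONE asserted".

```c
/* Userspace sketch; both helpers are hypothetical, not from xilinx-spi.c. */

/*
 * Unchecked: any non-zero value counts as "DONE asserted",
 * and that includes a negative errno from a failed read.
 */
static int done_seen_unchecked(int value)
{
	return value ? 1 : 0;
}

/*
 * Checked: pass a negative errno back to the caller instead of
 * treating it as "DONE asserted".
 */
static int done_seen_checked(int value)
{
	if (value < 0)
		return value;
	return value ? 1 : 0;
}
```

With an error value such as -5 as input, the unchecked helper reports success, while the checked one returns the error so the caller can bail out.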
I added error checking wherever gpiod_get_value() is called to see what
happens, and I'm sending a v2 series with this change. The code got
longer, but I've kept it pretty readable. It still feels like a half
solution, as gpiod_set_value() is void and thus no error checking can be
done on it, but let's see your and others' opinions.

-- 
Luca


