Re: [PATCH] can: add Renesas R-Car CAN driver

From: Sergei Shtylyov <hidden>
Date: 2013-11-09 01:02:27
Also in: linux-can, linux-sh

Hello.

On 10/21/2013 11:12 PM, Wolfgang Grandegger wrote:

quoted

    Sorry for the belated reply -- was on vacations.

    And again sorry, couldn't get to this due to other things.

quoted

thanks for your contribution. The patch looks already quite good. Before
I find time for a detailed review could you please check error handling
and bus-off recovery by reporting the output of "$ candump -td -e
any,0:0,#FFFFFFFF" while sending messages to the device ...

[...]

quoted

root@10.0.0.101:/opt/can-utils# ip -details link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
qlen 10 link/can
can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
bitrate 297619 sample-point 0.714

Strange, what bitrate did you configure?

    300000.

quoted

tq 480 prop-seg 2 phase-seg1 2 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 49999999

Could you please try if the algorithm works better with 50000000.

    It doesn't. Look at the logs below:

Bitrate setting:
----------------
actual:
300KHz
ip -details link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN qlen 10
link/can
can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
bitrate 297619 sample-point 0.714
tq 480 prop-seg 2 phase-seg1 2 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 49999999

500KHz:
ip -details link show can0
2: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN qlen 10
link/can
can state STOPPED (berr-counter tx 0 rx 0) restart-ms 0
bitrate 499999 sample-point 0.800
tq 200 prop-seg 3 phase-seg1 4 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 49999999

700KHz:
ip -details link show can0
2: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN qlen 10
link/can
can state STOPPED (berr-counter tx 0 rx 0) restart-ms 0
bitrate 694444 sample-point 0.750
tq 180 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 49999999

--------------
Forced:
priv->can.clock.freq = 50000000;
300KHz:
ip -details link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN qlen 10
link/can
can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
bitrate 297619 sample-point 0.714
tq 480 prop-seg 2 phase-seg1 2 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 50000000
root@10.0.0.101:/opt#

700KHz:
root@10.0.0.101:/opt# ip -details link show can0
2: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN qlen 10
link/can
can state STOPPED (berr-counter tx 0 rx 0) restart-ms 0
bitrate 694444 sample-point 0.750
tq 180 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
clock 50000000

quoted

root@10.0.0.101:/opt/can-utils#

quoted

dmesg:
rcar_can rcar_can.0 can0: bitrate error 0.7%
rcar_can rcar_can.0 can0: Error warning interrupt
rcar_can rcar_can.0 can0: Bus error interrupt:
rcar_can rcar_can.0 can0: ACK Error
rcar_can rcar_can.0 can0: Error passive interrupt

quoted

2. ... with short-circuited CAN high and low and doing some time later
         a manual recovery with "ip link set can0 type can restart"

quoted

    Now we have auto recovery only. Manual recovery was tested with the
first driver version and worked.

What do you mean with "auto recovery"? Auto recovery by the hardware or
via "restart-ms <ms>"? How do you choose between "manual" and "auto"
recovery?

    This exact test was done with hardware auto-recovery only. No "restart-ms" 
was programmed.

quoted
Terminal 1:

quoted

root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
root@10.0.0.104:/opt/can-utils#

quoted
Terminal 2:

quoted

root@10.0.0.104:/opt/can-utils# ./candump -td -e any,0:0,#FFFFFFFF
(000.000000) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
controller-problem{}
protocol-violation{{tx-dominant-bit-error}{}}
bus-error
(000.021147) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
controller-problem{}
bus-off
restarted-after-bus-off

Why does it get "restarted" directly after the bus-off?

    Because we have hardware auto-recovery enabled.

quoted

(011.738522) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
controller-problem{}

What controller problem? data[1] is not set for some reasom.

    Not comments. Looking into it.

quoted

protocol-violation{{tx-dominant-bit-error}{}}
bus-error
(000.021163) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
controller-problem{}
bus-off
restarted-after-bus-off
(001.666625) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
controller-problem{}
protocol-violation{{tx-dominant-bit-error}{}}
bus-error
(000.021157) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
controller-problem{}
bus-off
restarted-after-bus-off

quoted

dmesg:
rcar_can rcar_can.0 can0: Error warning interrupt
rcar_can rcar_can.0 can0: Error passive interrupt
rcar_can rcar_can.0 can0: Bus error interrupt:
rcar_can rcar_can.0 can0: Bit Error (dominant)
rcar_can rcar_can.0 can0: Error warning interrupt
rcar_can rcar_can.0 can0: Error passive interrupt

Why are they reported again. You are already in error passive.

    Don't know. :-/

[...]

quoted

I also wonder if the messages are always sent in order. You could use
the program "canfdtest" [1] from the can-utils for validation.

quoted

    This program is PITA. With the driver workaroung it works:

What workaround?

    Doesn't matter already, got rid of it.

Wolfgang.

WBR, Sergei

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help