RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
From: Joshua Quesenberry <hidden>
Date: 2021-06-23 17:34:15
Hey!

I have attached config.txt so you all can see what I'm doing.

I added printing the error number as Marc suggested and the number appears to be -110 every time.

[ 25.660006] CAN device driver interface
[ 25.668720] spi_master spi0: will run message pump with realtime priority
[ 25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
[ 25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
[ 28.098033] mcp251xfd spi0.1 rename4: renamed from can0
[ 28.175644] mcp251xfd spi0.0 can0: renamed from can1
[ 28.225891] mcp251xfd spi0.1 can1: renamed from rename4
[ 146.964971] mcp251xfd spi0.0: SPI transfer timed out
[ 146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
[ 146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
[ 146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
[ 146.965413] A link change request failed with some changes committed already. Interface can0 may have been left with an inconsistent configuration, please check.

Regarding the discussion about Kconfig flags, I went ahead and rebuilt kernel 5.10.44 using a config that was essentially arch/arm/configs/bcm2711_defconfig with these additions needed to get our I2S working. This should have undone the switch to ONDEMAND governor and enabling 1000 Hz clock.

1030a1031
> CONFIG_SND_RPI_I2S_AUDIO_WM8782=m
1040a1042
> CONFIG_SND_SOC_WM8782=m
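
A side note on the ret=-110 in the log above: kernel functions return negative errno values, and on Linux 110 is ETIMEDOUT, which matches the "SPI transfer timed out" line printed just before it. Below is a minimal Python sketch for decoding such return codes. The helper name decode_kernel_ret is made up for illustration, and the lookup uses the errno table of the machine running the script, so the result is only meaningful when run on Linux.

```python
import errno
import os


def decode_kernel_ret(ret: int) -> str:
    """Translate a kernel-style negative return code into 'NAME (message)'.

    Hypothetical helper for illustration; not part of any driver or tool.
    """
    code = abs(ret)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The value reported by spi_master in the log.
    print(decode_kernel_ret(-110))
```

A timeout at this level means the SPI core gave up waiting for the transfer to complete, so the all-zero CRC read errors that follow are most likely a consequence of that failed transfer rather than a separate fault.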
My RPi and HAT have worked very reliably with the older Buster image and a customized kernel 4.19.73 (same tweaks as mentioned in my last email). In that kernel I'm using the MCP25XXFD driver from msperl, which is having issues under the 5.10.Y kernel too. I only upgraded everything on my system at the end of last week, so the hardware has been OK very recently.

Keep in mind I'm not seeing a total failure. I do occasionally see everything work correctly and can run the ip link setup command without issue; it's just not common. Fully removing power from the system and reapplying it seems to help, but not every time, so maybe it's a coincidence. It could be an issue with subsequent configurations of the controller after the initial setup on power application, but then I'd expect it to work after every power yank, I think.

I wouldn't feel comfortable reverting my /boot/config.txt to a stock one and a default setup of the 40-pin header, at least not with my HAT attached, which includes the CAN controllers AND circuitry to supply power to the RPi from a 12V rail.

Thanks,
Josh Q

-----Original Message-----
From: Patrick Menschel <redacted>
Sent: Wednesday, June 23, 2021 1:24 AM
To: Joshua Quesenberry <redacted>; Marc Kleine-Budde <mkl@pengutronix.de>
Cc: kernel@pengutronix.de; linux-can@vger.kernel.org
Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

On 23.06.21 at 04:59, Joshua Quesenberry wrote:
Thank you Marc, I had tried finding a Linux CAN forum, but unfortunately "CAN" is about the most unhelpful search term one could use in Google... so thanks for replying and getting me to a more appropriate audience.

Reverting my system back to where CAN was working will probably be challenging. Our main goal was to get Boot from USB on the RPi enabled, but this unfortunately meant upgrading every piece of software and firmware available. Previously we were still on Buster, but the OS snapshot was from Spring 2020 (if not Fall/Winter 2019 or earlier), the firmware was much older, and the kernel was 4.19.73, wherein the MCP251XFD driver didn't exist yet. So getting back there will mean throwing on a saved SD card image from Spring 2020 and then trying to figure out how to force a firmware downgrade.

A colleague started this upgrade process for another project and was seeing these same results on two separate RPi. He did the OS and firmware upgrades, but I did the building of the 5.10.17 kernel. So including those two RPi and mine, that's three systems in total with mostly non-working CAN where it had been working fine. My system has slightly newer RPi firmware now and the 5.10.44 kernel; the hope was that I'd pick up a patch somewhere, but no such luck.

If you still think it would be beneficial to go through the effort of downgrading everything to verify the hardware, I can do that. I just want to make sure before I start, since it'll take a while.

I updated spi.c to include printing the error number as you requested and that's all baking now. When I get into work in the morning (US EST) I'll get the changes deployed and try it out. Since this issue has a very high failure rate, getting a log shouldn't be a problem.

Some background on the custom kernel...
when I switched to the 5.10.Y branch, I used arch/arm/configs/bcm2711_defconfig as my base config and then switched on preempt, switched to the 1000Hz kernel timer, switched the default governor from powersave to ondemand, switched on the debug flag (CONFIG_DEBUG_USER=y), enabled a few different CAN drivers we may encounter, and enabled some stuff for the WM8782 I2S chip. I probably should have recreated my config after 5.10.44, but I hadn't considered it until writing this. Looking at this diff, there are a few new bits I could probably benefit from including, but I don't see anything that I'd be concerned about.

`diff bcm2711_defconfig hel_bcm2711_lowlatency_defconfig`

15d14
< CONFIG_ATA=m
43d41
< CONFIG_BH1750=m
53c51
< CONFIG_BLK_DEV_NVME=y
---
> CONFIG_BLK_DEV_NVME=m
120c118
< CONFIG_CAN_J1939=m
---
> CONFIG_CAN_KVASER_USB=m
123a122,123
> CONFIG_CAN_MCP25XXFD=m
> CONFIG_CAN_PEAK_USB=m
127d126
< CONFIG_CCS811=m
155c154
< CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE=y
---
> CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
158,159c157
< CONFIG_CPU_FREQ_GOV_ONDEMAND=y
< CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
---
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
184a183
> CONFIG_DEBUG_USER=y
209d207
< CONFIG_DRM_PANEL_JDI_LT070ME05000=m
319a318
> CONFIG_GENERIC_PHY=y
325d323
< CONFIG_GPIO_PCA953X_IRQ=y
395a394
> CONFIG_HZ_1000=y
561d559
< CONFIG_IR_TOY=m
826d823
< CONFIG_NF_LOG_ARP=m
828d824
< CONFIG_NF_LOG_NETDEV=m
950c946
< CONFIG_PREEMPT_VOLUNTARY=y
---
> CONFIG_PREEMPT=y
957d952
< CONFIG_QCA7000_UART=m
994d988
< CONFIG_RPI_POE_POWER=m
1040a1035
> # CONFIG_RTC_HCTOSYS is not set
1044,1045d1038
< CONFIG_SATA_AHCI=m
< CONFIG_SATA_MV=m
1054d1046
< CONFIG_SENSIRION_SGP30=m
1134a1127
> CONFIG_SND_RPI_I2S_AUDIO_WM8782=m
1149a1143
> CONFIG_SND_SOC_WM8782=m

The /boot/config.txt I included in the forum posts mentioned tweaks the 40-pin header quite a bit from the default setup. We're using many of the pins for our HAT and planned for possibly adding more in the future.

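
A line-based diff like the one above also reports lines that merely moved because of sorting, so comparing the two defconfigs by symbol can be easier to read. The following is a minimal Python sketch of that idea, not a tool used in this thread. The function names are made up for illustration, and it treats a "# CONFIG_X is not set" line as the value "n". A symbol missing from a defconfig simply takes its Kconfig default, so the sketch reports it as None rather than guessing.

```python
import re
from typing import Dict, Optional, Tuple

# "CONFIG_FOO=value" lines and explicit "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_defconfig(text: str) -> Dict[str, str]:
    """Map each symbol mentioned in a defconfig to its value ('n' if unset)."""
    symbols: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            symbols[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            symbols[match.group(1)] = "n"
    return symbols


def compare_defconfigs(
    base: str, custom: str
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {symbol: (base_value, custom_value)} for symbols that differ."""
    a, b = parse_defconfig(base), parse_defconfig(custom)
    return {
        sym: (a.get(sym), b.get(sym))
        for sym in sorted(a.keys() | b.keys())
        if a.get(sym) != b.get(sym)
    }


if __name__ == "__main__":
    # A few lines taken from the diff above, not the full files.
    base = "CONFIG_PREEMPT_VOLUNTARY=y\nCONFIG_BLK_DEV_NVME=y\n"
    custom = "CONFIG_PREEMPT=y\nCONFIG_HZ_1000=y\nCONFIG_BLK_DEV_NVME=m\n"
    for sym, (old, new) in compare_defconfigs(base, custom).items():
        print(f"{sym}: {old} -> {new}")
```
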
Hi,

it would help to find a reference to that config.txt.

Regarding the changed Kconfig flags, I would suspect everything that is set to =y to be the culprit, especially everything that has connections to a clock. Ever since the first rpi3, clocks have been unreliable in general due to the frequency governor. The rpi guys did their best to get rid of most of the initial problems, but the root cause remains.

The interesting question is, does a stock raspbian buster work with your hardware and that config.txt?

I've been running a stock raspbian buster on a rpi3b+ with seeed can fd hat v2 24/7 for a couple of months now and have not experienced any problems.

Regards,
Patrick
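
One way to check whether the frequency governor differs between a working and a failing setup is to read the active governor from sysfs on each system. The snippet below is a minimal sketch, assuming the standard Linux cpufreq layout under /sys/devices/system/cpu/cpu0/cpufreq. The helper name read_governor is made up for illustration, and it returns None on systems where cpufreq is not available.

```python
from pathlib import Path
from typing import Optional

# Standard cpufreq sysfs directory for the first CPU (assumed to exist).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor(cpufreq_dir: Path = CPUFREQ_DIR) -> Optional[str]:
    """Return the active cpufreq governor, or None if it cannot be read."""
    try:
        return (cpufreq_dir / "scaling_governor").read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(read_governor())
```
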
Attachments
- config.txt [text/plain] 2800 bytes