Thread: 30 messages, 4 authors, 2021-03-18

Re: Errant readings on LM81 with T2080 SoC

From: Guenter Roeck <linux@roeck-us.net>
Date: 2021-03-08 00:31:51
Also in: linux-hwmon, linux-i2c, lkml

On 3/7/21 2:52 PM, Chris Packham wrote:
> Hi,
>
> I've got a system using a PowerPC T2080 SoC and among other things has
> an LM81 hwmon chip.
>
> Under a high CPU load we see errant readings from the LM81 as well as
> actual failures. It's the errant readings that cause the most concern
> since we can easily ignore the read errors in our monitoring application
> (although it would be better if they weren't there at all).
>
> I'm able to reproduce this with a test application[0] that artificially
> creates a high CPU load then by repeatedly checking for the all-1s
> values from the LM81 datasheet[1](page 17). The all-1s readings stick
> out as they are obviously higher than the voltage rails that are
> connected and disagree with measurements taken with a multimeter.
>
> Here's the output from my device
>
> [root@linuxbox ~]# cpuload 90&
> [root@linuxbox ~]# (while true; do cat /sys/class/hwmon/hwmon0/in*_input | grep '3320\|4383\|6641\|15930\|3586'; sleep 1; done)&
> 3586
> 3586
> cat: read error: No such device or address
> cat: read error: No such device or address
> 3320
> 3320
> 3586
> 3586
> 6641
> 6641
> 4383
> 4383
>
> Fundamentally I think this is a problem with the fact that the LM81 is
> an SMBus device but the T2080 (and other Freescale SoCs) uses i2c and we
> emulate SMBus. I suspect the errant readings are when we don't get round
> to completing the read within the timeout specified by the SMBus
> specification. Depending on when that happens we either fail the
> transfer or interpret the result as all-1s.

That is quite unlikely. Many sensor chips are SMBus chips connected to
i2c busses. It is much more likely that there is a bug in the T2080 i2c driver,
that the chip doesn't like the bulk read command issued through regmap, that
the chip has problems with the i2c bus speed, or that the i2c bus is noisy.

In this context, the "No such device or address" responses are very suspicious.
Those are reported by the i2c driver, not by the hwmon driver, and suggest
that the chip did not respond to a read request. It may help to enable
debugging in the i2c driver to see if it reports anything useful. Even
better might be to connect an i2c bus analyzer to the i2c bus and check
what is going on.
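
To line up the failures with i2c driver debug output or an analyzer trace,
it may help to log read errors and all-1s readings separately, with a
timestamp. The following is only a rough sketch, not a tested tool: the
function names classify and monitor are made up for illustration, the list
of all-1s values is copied from the grep pattern in the report above, and an
empty read is assumed to mean that cat failed.

```shell
#!/bin/sh
# All-1s readings, copied from the grep pattern in the original report.
# Assumption: matching on the value alone is good enough here; a stricter
# check would compare each in*_input file against its own all-1s value.
ALL_ONES="3320 4383 6641 15930 3586"

# classify VALUE
# Prints "error" for an empty value (assumed to be a failed read),
# "all-ones" for one of the known all-1s readings, "ok" otherwise.
classify() {
    [ -n "$1" ] || { echo error; return; }
    for v in $ALL_ONES; do
        [ "$1" = "$v" ] && { echo all-ones; return; }
    done
    echo ok
}

# monitor DIR [INTERVAL]
# Polls DIR/in*_input forever and prints one line per suspicious sample:
# epoch seconds, file name, classification, and the raw value (if any).
monitor() {
    dir=$1
    interval=${2:-1}
    while true; do
        for f in "$dir"/in*_input; do
            val=$(cat "$f" 2>/dev/null)
            kind=$(classify "$val")
            [ "$kind" = ok ] || echo "$(date +%s) $f $kind $val"
        done
        sleep "$interval"
    done
}
```

With the functions loaded, something like "monitor /sys/class/hwmon/hwmon0 1"
under the same CPU load should show whether the read errors and the all-1s
readings cluster together in time.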

Guenter