Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory
From: David Brownell <hidden>
Date: 2010-05-26 00:09:16
Also in:
linux-omap, lkml
Possibly related (same subject, not in this thread)
- 2010-05-25 · Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory · Dmitry Torokhov <dmitry.torokhov@gmail.com>
- 2010-05-25 · Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory · Kevin Hilman <hidden>
- 2010-05-25 · Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory · Dmitry Torokhov <dmitry.torokhov@gmail.com>
- 2010-05-25 · Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory · Kevin Hilman <hidden>
- 2010-05-19 · Re: [PATCH] touchscreen: ads7846: please don't touch free'd memory · Dmitry Torokhov <dmitry.torokhov@gmail.com>
--- On Tue, 5/25/10, Kevin Hilman <khilman@deeprootsystems.com> wrote:quoted
quoted
quoted
This is a load of crap. If probe() fails thatmeans that driver does notquoted
quoted
quoted
manage the device
And thus, dev->driver shouldn't be assigned ... That probe() looks very odd, lately, yes. It seems to have changed a lot over the past few years, and grown a few fault paths that don't behave right (that is, they don't clean up properly on failure, or don't report their faults)...
quoted
quoted
quoted
and thus ads7846_suspend()and ads7846_resume() shouldquoted
quoted
quoted
not be called _at all_.
Those being called is an indication that the probe() succeeded so the driver is bound to that device... If SPI core manages
to call random methods fromquoted
quoted
quoted
the drivers that failed to bind to a devicethat should be fixed in SPIquoted
quoted
quoted
core.
Not random methods. The appropriate ones ... for the case where probe() succeeded and the device has been properly set up. I suspect a bug was added when the driver binding got rewritten, whereby dev->driver was wrongly assigned on fault paths (where it should have been cleared). Of course that goes through slightly convoluted bits of driver model code... maybe there's no such bug, and everything is explained by probe() misbehaving.
quoted
quoted
Agreed, this is indeed a bug in the SPI drivercore.
That's not at all clear to me.
quoted
quoted
However, by fixing the SPI core to unregister thedriver on failurequoted
quoted
(patch below),
... as you noted, not a good patch "below". The issue seems to be on probe() paths, NOT as part of driver registration.
we prevent the suspend & resume methods from beingquoted
quoted
called, but the driver's ->remove() method willstill be called due toquoted
quoted
the driver_unregister(). So at least theremove() method needs fixingquoted
quoted
to prevent accessing free'd memory.
That doesn't seem right; if the issue is probe() misbehavior, fixing that makes that access issue go away. Don't change remove() etc. I see two issues. One is flakey ads7846 probe() behavior. Another is the response of the current SPI core code to that flakiness ... the report here seems to indicate that the driver is bound to the device even though it's not been properly set up ... unclear whether probe() is reporting a clean failure, I suspect not. (If it were reporting a clean failure, the device should end up with no driver bound.) It's very possible the probe() errors explain all the problems.
quoted
quoted
Unless there are objections, I'll post the belowalong with an updatedquoted
quoted
version of ads7846 patch. Kevindiff --git a/drivers/spi/spi.cb/drivers/spi/spi.cquoted
quoted
index b3a1f92..42d4d26 100644--- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c@@ -175,6 +175,8 @@ static voidspi_drv_shutdown(struct device *dev)quoted
quoted
*/ int spi_register_driver(struct spi_driver*sdrv)quoted
quoted
{ + int ret; + sdrv->driver.bus =&spi_bus_type;quoted
quoted
if (sdrv->probe)sdrv->driver.probe = spi_drv_probe;quoted
quoted
@@ -182,7 +184,12 @@ intspi_register_driver(struct spi_driver *sdrv)quoted
quoted
sdrv->driver.remove = spi_drv_remove;quoted
quoted
if (sdrv->shutdown)sdrv->driver.shutdown = spi_drv_shutdown;quoted
quoted
- returndriver_register(&sdrv->driver);quoted
quoted
+ + ret =driver_register(&sdrv->driver);quoted
quoted
+ if (!ret) +driver_unregister(&sdrv->driver);quoted
This is still wrong. driver_register() should properlyclean up afterquoted
itself and not require calls to driver_unregister()(and I believe itquoted
does).
Right ... driver registration should always be safe. The tricky bit would be a side effect: probing any devices which match that driver. Don't unregister drivers on error; just make sure they don't get wrongly bound to devices if probe() misbehaves.
quoted
Besides, I do not see how this patch changes anythingwith regard toquoted
binding devices and drivers. Even if driver_register()succeeds, bindingquoted
may still fail and we should not be calling neither->remove(), norquoted
->suspend()/->resume() for the devices thatfailed to be bound. Hmm, good point. After digging into the driver core and realizing that it seemed to have sane error handling itself,
Sane but somewhat convoluted. It can be tricky to figure out.
I took a closer look at ads7846_probe() and discovered it doesn't actually return an error code for certain failure cases! That was the root cause.
My conclusion too...
The patch below fixes the problem.
Did you do much of an audit, or just work to fix this specific problem? I thought the code was looking fairly goofy. Regulator additions were pretty recent, but it wasn't clear to me that it'd be the *only* source of such problems...
Thanks, Kevin From 3588494cf6e51754f7089bb8089b99abd765c493 Mon Sep 17 00:00:00 2001 From: Kevin Hilman <redacted> Date: Tue, 25 May 2010 14:38:04 -0700 Subject: [PATCH] input: touchscreen: ads7846: return error on probe failure In probe(), if regulator_get() failed, an error code was not being returned causing the driver to be successfully bound, even though probe failed. This in turn caused the suspend, resume and remove methods to be registered and accessed via the SPI core.
Or more simply: this was a nasty violation of the driver model, which lead to breakage. (The SPI core is just inheriting driver model/core behavior.) A driver that fails to properly bind to the device must report a probe() failure, ensuring that the driver is not recorded as being bound to the device...
quoted hunk ↗ jump to hunk
Since these functions all access private driver data using pointers that had been freed during the failed probe, this would lead to unpredictable behavior. This patch ensures that probe() returns an error code in this failure case so the driver is not bound. Found using lockdep and noticing the lock used in the suspend/resum path pointed to a bogus lock due to the freed memory. Signed-off-by: Kevin Hilman <redacted> --- drivers/input/touchscreen/ads7846.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)diff --git a/drivers/input/touchscreen/ads7846.cb/drivers/input/touchscreen/ads7846.c index 532279c..634f6f6 100644--- a/drivers/input/touchscreen/ads7846.c +++ b/drivers/input/touchscreen/ads7846.c@@ -1163,8 +1163,8 @@ static int __devinitads7846_probe(struct spi_device *spi) ts->reg = regulator_get(&spi->dev, "vcc"); if (IS_ERR(ts->reg)) { - dev_err(&spi->dev, "unable to get regulator: %ld\n",
Adding a requirement for a Vcc regulator seems likely to have broken lots of boards, FWIW ...
- PTR_ERR(ts->reg)); + err = PTR_ERR(ts->reg);
That error should be returned, yes. (one line not two; maybe the patch got wrapped somewhere.) I didn't check to make sure that was sufficient cleanup though. - Dave
+ dev_err(&spi->dev, "unable to get regulator: %ld\n", err); goto err_free_gpio; }