Thread: 14 messages, 3 authors, 2010-02-19

Re: [PATCH] input: polldev can cause crash in case of polling disabled

From: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: 2010-02-16 21:48:42

On Tue, Feb 16, 2010 at 07:37:49PM +0100, samu.p.onkalo@nokia.com wrote:
-----Original Message-----
From: ext Dmitry Torokhov [mailto:dmitry.torokhov@gmail.com]
Sent: 16 February, 2010 19:51
To: Onkalo Samu.P (Nokia-D/Tampere)
Cc: linux-input@vger.kernel.org
Subject: Re: [PATCH] input: polldev can cause crash in case of polling
disabled

Hi Samu,

On Tue, Feb 16, 2010 at 04:44:41PM +0200, Samu Onkalo wrote:
If polling is set to the disabled value and the polled input device
is opened and closed several times, the address of the workqueue will probably
change at some point. Since nothing is queued (due to the polling-disabled
state), the work struct still contains a pointer to the old,
non-existent workqueue.

This I do not quite understand. The work struct as far as I can see does
not reference workqueue at all. There is a list entry but if we do not
poll the device that entry should be always detached from any lists. We
properly initialize WQ entry when we create the device and it should
remain valid until the device is destroyed.

The 'data' entry contains a pointer to the per-cpu workqueue, which in turn contains
the workqueue pointer. This 'data' entry is not OK in case of failure. I can
collect more data about this.

When the device is closed again, cancel_delayed_work_sync
goes crazy due to the pointer to the non-existent workqueue.

What kind of failure do you see? Is there a stack trace or something?

Kernel panic while in workqueue handling (page fault at some garbage address).

In case of disabled polling, init the work struct to its initial value to
clean up the old values.

Also, why would we not see the same issue with enabled polling? The
workqueue is being created and destroyed in this case as well.

queue_delayed_work updates the work struct. The workqueue itself is OK.

I think that the sequence goes about this way (no other polled devices open):
1. Polled device is opened with polling enabled
2. It first creates the workqueue and then queues the first poll. The kernel's
workqueue functions record the current workqueue in the work struct
3. polled device is closed
4. workqueue is destroyed

5. polling interval is set to 0
6. device is reopened
7. New workqueue is created
8. polled device is closed without queueing any work
9. work struct for the polled device still contains a pointer to the old wq (created in step 2)
10. cancel_workqueue... can access unallocated memory, causing a crash.
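The sequence above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `model_wq`, `model_work`, `model_open` and `model_close` are hypothetical stand-ins, not kernel structures or functions, and an `alive` flag replaces real allocation so that the stale pointer can be inspected safely instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real structures. */
struct model_wq {
	bool alive;             /* cleared when the workqueue is destroyed */
};

struct model_work {
	struct model_wq *data;  /* workqueue recorded by the last queueing */
};

/* Open: create the workqueue, queue the first poll only if polling is on. */
static void model_open(struct model_wq *wq, struct model_work *work,
		       unsigned int poll_interval)
{
	assert(wq != NULL && work != NULL);
	wq->alive = true;
	if (poll_interval > 0)
		work->data = wq;        /* queueing updates the work struct */
}

/*
 * Close: cancel first, then destroy the workqueue. Returns true if the
 * cancel step would have followed a pointer to a destroyed workqueue.
 */
static bool model_close(struct model_wq *wq, struct model_work *work)
{
	bool stale;

	assert(wq != NULL && work != NULL);
	stale = work->data != NULL && !work->data->alive;
	wq->alive = false;
	return stale;
}

/* Open and close twice; report whether the second close hits a stale pointer. */
bool model_second_close_is_stale(unsigned int first_interval,
				 unsigned int second_interval)
{
	struct model_wq first_wq = { false }, second_wq = { false };
	struct model_work work = { NULL };

	model_open(&first_wq, &work, first_interval);   /* steps 1-2 */
	(void)model_close(&first_wq, &work);            /* steps 3-4 */

	model_open(&second_wq, &work, second_interval); /* steps 5-7 */
	return model_close(&second_wq, &work);          /* steps 8-10 */
}
```

In this model, only the case with polling enabled on the first open and disabled on the second leaves the work struct pointing at a destroyed workqueue. With polling enabled both times, the second queueing overwrites the pointer, which matches the explanation of why the problem does not show up with polling enabled.
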

Ah, I see. In this case I think it should be fixed in the workqueue code by
clearing the work data so it does not point to the [potentially] non-existent
workqueue when we cancel or complete work.

Oleg, do you agree?
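To illustrate the idea of clearing the work data on cancel, here is a minimal userspace sketch. It is not the workqueue code and not a proposed patch: `sketch_wq`, `sketch_work` and `sketch_cancel_and_clear` are hypothetical names, and the `alive` flag is an assumption used in place of real allocation and freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real structures. */
struct sketch_wq {
	bool alive;              /* cleared when the workqueue is destroyed */
};

struct sketch_work {
	struct sketch_wq *data;  /* workqueue recorded by the last queueing */
};

/*
 * Cancel, then forget the workqueue. Returns true if the pointer that
 * was followed referred to a destroyed workqueue.
 */
static bool sketch_cancel_and_clear(struct sketch_work *work)
{
	bool stale;

	assert(work != NULL);
	stale = work->data != NULL && !work->data->alive;
	work->data = NULL;       /* the clearing step being illustrated */
	return stale;
}

/* Replays open/close, reopen with polling disabled, close again. */
bool sketch_second_close_is_stale(void)
{
	struct sketch_wq first_wq = { true };
	struct sketch_work work = { &first_wq };  /* first poll queued here */

	(void)sketch_cancel_and_clear(&work);     /* first close: cancel ... */
	first_wq.alive = false;                   /* ... then destroy the wq */

	/* Reopen with polling disabled: a new wq exists, nothing is queued. */
	return sketch_cancel_and_clear(&work);    /* second close */
}
```

With the pointer cleared at the first cancel, the second close in this model has nothing stale left to follow, so `sketch_second_close_is_stale()` returns false.
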

-- 
Dmitry