Re: [bug-report] task info hung problem in fb_deferred_io_work()
From: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Date: 2024-04-30 11:45:37
Also in:
dri-devel, lkml
On Fri, Apr 19, 2024 at 5:34 PM Nam Cao [off-list ref] wrote:
> On 2024-04-19 Patrik Jakobsson wrote:
>> Neither cancel_delayed_work_sync() or flush_delayed_work() prevent new work from being scheduled after they return.
> flush_delayed_work() is called during device closing. And because no writes are performed after the device has been closed, no new work should be queued after flush_delayed_work().
Yes, nothing should write after the device is closed but the events are asynchronous so in theory the order is not guaranteed. I also find it unlikely but I have no other theory at this point.
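To make the ordering concern concrete, here is a minimal user-space model in C. It is only a sketch and not kernel code: struct model_dwork and the model_*() helpers are hypothetical stand-ins for struct delayed_work, schedule_delayed_work(), flush_delayed_work() and cancel_delayed_work_sync(), and the "late write" is an assumed event, not something observed in the traces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item (not the kernel struct). */
struct model_dwork {
	bool pending;	/* queued but not yet run */
	int runs;	/* number of times the handler has executed */
};

/* Models schedule_delayed_work(): queue the item unless already pending. */
static bool model_schedule(struct model_dwork *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models flush_delayed_work(): run the pending instance and wait for it. */
static void model_flush(struct model_dwork *w)
{
	if (w->pending) {
		w->pending = false;
		w->runs++;
	}
}

/* Models cancel_delayed_work_sync(): drop the pending instance unrun. */
static bool model_cancel_sync(struct model_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/*
 * Assumed sequence: a write queues work, the close path flushes or
 * cancels it, then a late asynchronous write queues work again.
 * Returns true if work is pending after the close path has finished.
 */
static bool late_write_requeues(bool use_cancel)
{
	struct model_dwork w = { false, 0 };

	model_schedule(&w);		/* write before close */
	if (use_cancel)
		model_cancel_sync(&w);	/* close path, cancel variant */
	else
		model_flush(&w);	/* close path, flush variant */
	model_schedule(&w);		/* late write after close */
	return w.pending;
}
```

In this model late_write_requeues() returns true for both variants, since neither call blocks a later schedule. The variants differ only in whether the work queued before close gets to run.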
>> But cancel_delayed_work_sync() at least makes sure the queue is empty so the problem becomes less apparent. Could this explain what we're seeing?
> I suspect that cancel_delayed_work_sync() is only treating the symptoms by preventing the deferred work from running. The real bug is "someone" giving fb_deferred_io_work() invalid pages to work with. But that's just a blind guess.
Trying to figure out when the page goes away in relation to when the work is triggered might be a good place to start.
> Best regards, Nam