Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)
From: Andrew Morton <akpm@linux-foundation.org>
Date: 2007-08-16 18:46:14
On Thu, 16 Aug 2007 22:20:06 +0400 Alex Tomas [off-list ref] wrote:
> Andrew Morton wrote:
>>>> But under this proposal, t_sync_datalist just gets removed: the new
>>>> ordered-data mode _only_ need to do the sb->inode->page walk. So if
>>>> I'm understanding you, the way in which we'd handle any such race is
>>>> to make kjournald's writeback of the dirty pages block in lock_page().
>>>> Once it gets the page lock it can look to see if some other thread has
>>>> mapped the page to disk.
>>>
>>> if I'm right holding number of pages locked, then they won't be locked,
>>> but writeback. of course kjournald can block on writeback as well, but
>>> how does it find pages with *newly allocated* blocks only?
>>
>> I don't think we'd want kjournald to do that. Even if a page was dirtied
>> by an overwrite, we'd want to write it back during commit, just from a
>> quality-of-implementation point of view. If we were to leave these pages
>> unwritten during commit then a post-recovery file could have a mix of
>> up-to-five-second-old data and up-to-30-seconds-old data.
>
> trying to implement this I've got to think that there is one significant
> difference between t_sync_datalist and sb->inode->page walk:
> t_sync_datalist is per-transaction. IOW, it doesn't change once
> transaction is closed. in contrast, nothing (currently) would prevent
> others to modify pages while commit is in progress.
That can happen at present - there's nothing to stop a process from modifying a page which is undergoing ordered-data commit-time writeout.
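
To make the distinction concrete, here is a rough toy model in Python. It is not JBD code and every name in it (Page, ToyInode, close_transaction, commit_from_list, commit_by_walk, demo) is made up for illustration; the real t_sync_datalist tracks buffers, not these objects. It only shows the two properties being discussed: a per-transaction list has fixed membership once the transaction closes, yet the contents of a listed page can still change before writeout, while a commit-time walk also picks up pages dirtied after the close.

```python
from dataclasses import dataclass


@dataclass
class Page:
    index: int
    data: str = "old"
    dirty: bool = False


class ToyInode:
    """Hypothetical stand-in for an inode's page cache."""

    def __init__(self, npages):
        self.pages = [Page(i) for i in range(npages)]

    def write(self, index, data):
        page = self.pages[index]
        page.data = data
        page.dirty = True


def close_transaction(inode):
    # Models a per-transaction list: membership is frozen at close time.
    return [p for p in inode.pages if p.dirty]


def commit_from_list(sync_list):
    # Writes out only the pages captured at close, with their *current* data.
    written = {}
    for page in sync_list:
        written[page.index] = page.data
        page.dirty = False
    return written


def commit_by_walk(inode):
    # Models the inode->page walk: sees whatever is dirty at commit time.
    written = {}
    for page in inode.pages:
        if page.dirty:
            written[page.index] = page.data
            page.dirty = False
    return written


def demo():
    results = []
    for commit_style in ("list", "walk"):
        inode = ToyInode(3)
        inode.write(0, "A")
        sync_list = close_transaction(inode)
        inode.write(0, "A2")  # listed page modified after the close
        inode.write(1, "B")   # new page dirtied after the close
        if commit_style == "list":
            results.append(commit_from_list(sync_list))
        else:
            results.append(commit_by_walk(inode))
    return tuple(results)
```

In this model both commit styles write the later contents of page 0, which is the "that can happen at present" case. Only the walk also writes page 1, which is the difference Alex is pointing at.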