Asynchronous read
From: Da Zheng <hidden>
Date: 2011-08-02 05:33:27
On 7/31/11 7:45 PM, Adam Cozzette wrote:
On Sun, Jul 31, 2011 at 03:58:55PM -0700, Da Zheng wrote:
Hello, I'm trying to understand the read operation in VFS, and I am confused by the asynchronous and synchronous operations.
At the beginning, do_sync_read() invokes aio_read, which is generic_file_aio_read for ext4. generic_file_aio_read should be an asynchronous read. But what really confuses me is do_generic_file_read, which is called by generic_file_aio_read. It seems to me that do_generic_file_read implements a synchronous read, as it is the only function I can find that copies data to user space by invoking the actor callback function. If do_generic_file_read is synchronous, how can generic_file_aio_read be asynchronous?
In do_generic_file_read, if the data to be read isn't in the cache, normally page_cache_sync_readahead should be called. As far as I understand, when page_cache_sync_readahead returns, the pages will have been added to the cache, but the corresponding data on the disk hasn't necessarily been copied into the pages yet (because it eventually only invokes submit_bio to submit the IO requests to the block layer), so PageUptodate of the requested page might still return false, and then do_generic_file_read tries to invoke readpage to read the page again instead of waiting. Since the disk is always very slow, doesn't that just waste CPU time? Or am I missing something?
I have found the answer to this question. When a page isn't up to date, do_generic_file_read invokes lock_page_killable, which waits until the page is unlocked, i.e. until the pending read has completed, before it returns.
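To make that concrete, here is a small user-space sketch of the pattern, not the kernel code itself. All the toy_* names are made up for illustration, and since there is no scheduler here, "sleeping" in toy_lock_page is modelled by simply letting the pending I/O complete. It only shows why the reader ends up waiting instead of spinning: the page stays locked from submission until the completion handler fills it and unlocks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Toy stand-in for struct page: only the two flags that matter here. */
struct toy_page {
    bool locked;    /* set while a read is in flight */
    bool uptodate;  /* set once the data has arrived */
    char data[TOY_PAGE_SIZE];
};

/* Toy "block layer" (hypothetical): at most one queued read. */
struct toy_disk {
    const char *contents;     /* must hold at least TOY_PAGE_SIZE bytes */
    struct toy_page *pending; /* page whose read has been submitted */
};

/* Plays the role of readpage/submit_bio: queue the read, return at once. */
static void toy_submit_read(struct toy_disk *disk, struct toy_page *page)
{
    assert(disk->pending == NULL);
    page->locked = true;
    disk->pending = page;
}

/* Plays the role of the I/O completion handler: fill, mark, unlock. */
static void toy_complete_io(struct toy_disk *disk)
{
    struct toy_page *page = disk->pending;

    if (page == NULL)
        return;
    memcpy(page->data, disk->contents, TOY_PAGE_SIZE);
    page->uptodate = true;
    page->locked = false;
    disk->pending = NULL;
}

/*
 * Plays the role of lock_page_killable(). Assumption: a locked page
 * always has its read pending on the disk, so this loop terminates.
 */
static void toy_lock_page(struct toy_disk *disk, struct toy_page *page)
{
    while (page->locked)
        toy_complete_io(disk); /* models sleeping until the I/O is done */
    page->locked = true;
}

static void toy_unlock_page(struct toy_page *page)
{
    page->locked = false;
}

/* Reader path: returns the number of bytes copied, or -1 on failure. */
static int toy_read(struct toy_disk *disk, struct toy_page *page,
                    char *buf, size_t len)
{
    if (len > TOY_PAGE_SIZE)
        len = TOY_PAGE_SIZE;

    if (!page->uptodate) {
        toy_submit_read(disk, page); /* returns before the data arrives */
        toy_lock_page(disk, page);   /* waits for the completion to unlock */
        if (!page->uptodate) {
            toy_unlock_page(page);
            return -1;
        }
        toy_unlock_page(page);
    }
    memcpy(buf, page->data, len);
    return (int)len;
}
```

Calling toy_read() on a fresh page returns only after the data is in the page, even though toy_submit_read() itself returned immediately.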
This is a bit puzzling. I haven't figured it out, but here are some things I came across as I was trying to solve the problem.
First of all, this article might shed some light on the problem: http://lwn.net/Articles/170954/
Essentially, a few years ago there was a simplification of the API, and aio_read and aio_write gained the ability to do vectored operations, making it possible to eliminate readv and writev. This even made it possible for drivers and filesystems to avoid implementing read() and write(), since the aio versions could take care of that.
So my point is that I suspect aio_read and aio_write are now often used in cases where they're not actually expected to be asynchronous, just because it simplifies the API to be able to reuse those functions for synchronous operations. In fact the LWN article says:
"Note that this change does not imply that asynchronous operations themselves must be supported - it is entirely permissible (if suboptimal) for aio_read() and aio_write() to operate synchronously at all times."
So perhaps generic_file_aio_read is not actually asynchronous? My only other guess is that whatever it does happens fast enough to count as asynchronous.
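The idea of reusing an aio-style entry point for synchronous reads can be sketched in user space roughly as follows. This is a toy model under my own assumptions, not the kernel's do_sync_read: the toy_* names, the struct layout and the TOY_EIOCBQUEUED value are all hypothetical. The wrapper calls the aio-style function and waits only if the backend reports that it merely queued the request; a backend that does all its work before returning is handled by the same wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker meaning "request queued, result comes later". */
#define TOY_EIOCBQUEUED (-529L)

/* Toy request descriptor (hypothetical, loosely modelled on a kiocb). */
struct toy_kiocb {
    const char *src; /* backing data to read from */
    char *dst;       /* caller's buffer */
    size_t len;      /* bytes requested */
    int queued;      /* nonzero while the request is outstanding */
    long result;     /* filled in when the request completes */
};

typedef long (*toy_aio_read_fn)(struct toy_kiocb *req);

/* Backend A: "aio" in name only, it finishes the read before returning. */
static long toy_aio_read_inline(struct toy_kiocb *req)
{
    memcpy(req->dst, req->src, req->len);
    return (long)req->len;
}

/* Backend B: really asynchronous, it only queues the request. */
static long toy_aio_read_queued(struct toy_kiocb *req)
{
    req->queued = 1;
    return TOY_EIOCBQUEUED;
}

/*
 * Stand-in for waiting on a queued request. With no scheduler in this
 * toy, the completion work is simply performed here.
 */
static long toy_wait_on_kiocb(struct toy_kiocb *req)
{
    assert(req->queued);
    memcpy(req->dst, req->src, req->len);
    req->result = (long)req->len;
    req->queued = 0;
    return req->result;
}

/* Synchronous read built on top of whichever aio-style backend is given. */
static long toy_sync_read(toy_aio_read_fn aio_read, const char *src,
                          char *dst, size_t len)
{
    struct toy_kiocb req = { src, dst, len, 0, 0 };
    long ret = aio_read(&req);

    if (ret == TOY_EIOCBQUEUED)
        ret = toy_wait_on_kiocb(&req); /* block until it has completed */
    return ret;
}
```

From the caller's side both backends look identical: toy_sync_read() returns the byte count with the buffer filled, whether or not the backend was asynchronous underneath.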
Thanks. It makes sense. Da