Thread (7 messages) 7 messages, 3 authors, 2023-03-23

Re: /proc/PID/io/read_bytes accounting regression?

From: David Wysochanski <hidden>
Date: 2023-02-17 14:09:58

On Fri, Feb 17, 2023 at 6:56 AM Daire Byrne [off-list ref] wrote:
Hi,

Maybe someone here can quickly point me in the right direction for
this oddity that we noticed.

On newer kernels, it looks like the task io accounting is not
incrementing the read_bytes when reading from a NFS mount? This was
definitely working on v5.16 downwards, but has not been working since
v5.18 up to v6.2 (I haven't tested v5.17 yet).

If I read from a local filesystem, then the read_bytes for that PID is
incremented as expected.

If I read over NFS using directIO, then the read_bytes is also
correctly incremented for that PID. It's just when reading normally
without directIO that it is not.

The write_bytes and rchar are also still both correct in all situations.

I have checked the kernel config and I'm fairly sure I have all the
right things enabled:

CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_TASKSTATS=y

Unless there was some extra config introduced specific to the nfs
client in later kernels that I missed?

Cheers,

Daire
Daire,
Thanks for the report.

Willy,
Question for you at the bottom of this.

First, here's what looks to be the candidate changes between these versions:

$ git log --oneline v5.16..v5.18 fs/nfs/read.c
89c2be8a9516 NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS
fc1c5abfca7e NFS: Rename fscache read and write pages functions
8786fde8421c Convert NFS from readpages to readahead
16f2f4e679cf nfs: Implement cache I/O by accessing the cache directly

I would be this  is due to this patch, which went into 5.18:
8786fde8421c Convert NFS from readpages to readahead


And the hunks that now call readahead_page vs read_cache_pages:
@@ -397,14 +396,16 @@ int nfs_readpage(struct file *file, struct page *page)
        return ret;
 }

-int nfs_readpages(struct file *file, struct address_space *mapping,
-               struct list_head *pages, unsigned nr_pages)
+void nfs_readahead(struct readahead_control *ractl)
 {
+       unsigned int nr_pages = readahead_count(ractl);
+       struct file *file = ractl->file;
        struct nfs_readdesc desc;
-       struct inode *inode = mapping->host;
+       struct inode *inode = ractl->mapping->host;
+       struct page *page;
        int ret;

-       trace_nfs_aop_readahead(inode, lru_to_page(pages), nr_pages);
+       trace_nfs_aop_readahead(inode, readahead_pos(ractl), nr_pages);
        nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);

        ret = -ESTALE;
@@ -422,14 +423,18 @@ int nfs_readpages(struct file *file, struct
address_space *mapping,
        nfs_pageio_init_read(&desc.pgio, inode, false,
                             &nfs_async_read_completion_ops);

-       ret = read_cache_pages(mapping, pages, readpage_async_filler, &desc);
+       while ((page = readahead_page(ractl)) != NULL) {
+               ret = readpage_async_filler(&desc, page);
+               put_page(page);
+               if (ret)
+                       break;
+       }

        nfs_pageio_complete_read(&desc.pgio);

        put_nfs_open_context(desc.ctx);
 out:
        trace_nfs_aop_readahead_done(inode, nr_pages, ret);
-       return ret;
 }

and this hunk:
@@ -397,14 +396,16 @@ int nfs_readpage(struct file *file, struct page *page)
        return ret;
 }

-int nfs_readpages(struct file *file, struct address_space *mapping,
-               struct list_head *pages, unsigned nr_pages)
+void nfs_readahead(struct readahead_control *ractl)
 {
+       unsigned int nr_pages = readahead_count(ractl);
+       struct file *file = ractl->file;
        struct nfs_readdesc desc;
-       struct inode *inode = mapping->host;
+       struct inode *inode = ractl->mapping->host;
+       struct page *page;
        int ret;

-       trace_nfs_aop_readahead(inode, lru_to_page(pages), nr_pages);
+       trace_nfs_aop_readahead(inode, readahead_pos(ractl), nr_pages);
        nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);

        ret = -ESTALE;
@@ -422,14 +423,18 @@ int nfs_readpages(struct file *file, struct
address_space *mapping,
        nfs_pageio_init_read(&desc.pgio, inode, false,
                             &nfs_async_read_completion_ops);

-       ret = read_cache_pages(mapping, pages, readpage_async_filler, &desc);
+       while ((page = readahead_page(ractl)) != NULL) {
+               ret = readpage_async_filler(&desc, page);
+               put_page(page);
+               if (ret)
+                       break;
+       }

        nfs_pageio_complete_read(&desc.pgio);

        put_nfs_open_context(desc.ctx);
 out:
        trace_nfs_aop_readahead_done(inode, nr_pages, ret);
-       return ret;
 }

 int __init nfs_init_readpagecache(void)



In v5.16 we had this call to task_io_account_read(PAGE_SIZE); on line 109
of read_cache_pages();
76 /**
 77  * read_cache_pages - populate an address space with some pages &
start reads against them
 78  * @mapping: the address_space
 79  * @pages: The address of a list_head which contains the target
pages.  These
 80  *   pages have their ->index populated and are otherwise uninitialised.
 81  * @filler: callback routine for filling a single page.
 82  * @data: private data for the callback routine.
 83  *
 84  * Hides the details of the LRU cache etc from the filesystems.
 85  *
 86  * Returns: %0 on success, error return by @filler otherwise
 87  */
 88 int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 89                         int (*filler)(void *, struct page *), void *data)
 90 {
 91         struct page *page;
 92         int ret = 0;
 93
 94         while (!list_empty(pages)) {
 95                 page = lru_to_page(pages);
 96                 list_del(&page->lru);
 97                 if (add_to_page_cache_lru(page, mapping, page->index,
 98                                 readahead_gfp_mask(mapping))) {
 99                         read_cache_pages_invalidate_page(mapping, page);
100                         continue;
101                 }
102                 put_page(page);
103
104                 ret = filler(data, page);
105                 if (unlikely(ret)) {
106                         read_cache_pages_invalidate_pages(mapping, pages);
107                         break;
108                 }
109                 task_io_account_read(PAGE_SIZE);

But there's no call to task_io_account_read() anymore in the new
readahead code paths that I could tell,
but maybe I'm missing something.

Willy,
Does each caller of readahead_page() now need to call
task_io_account_read() or should we add that into
readahead_page() or maybe inside read_pages()?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help