Thread: 23 messages, 6 authors, 2020-12-11

Re: [PATCH] block: add bio_iov_iter_nvecs for figuring out nr_vecs

From: Johannes Weiner <hannes@cmpxchg.org>
Date: 2020-12-03 22:38:57
Also in: linux-fsdevel

On Tue, Dec 01, 2020 at 01:32:26PM +0000, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 01:17:49PM +0000, Pavel Begunkov wrote:
> > I was thinking about memcpy bvec instead of iterating as a first step,
> > and then try to reuse passed in bvec.
> >
> > A thing that doesn't play nice with that is setting BIO_WORKINGSET in
> > __bio_add_page(), which requires to iterate all pages anyway. I have no
> > clue what it is, so rather to ask if we can optimise it out somehow?
> > Apart from pre-computing for specific cases...
> >
> > E.g. can pages of a single bvec segment be both in and out of a working
> > set? (i.e. PageWorkingset(page)).
>
> Adding Johannes for the PageWorkingset logic, which keeps confusing me
> everytime I look at it.  I think it is intended to deal with pages
> being swapped out and in, and doesn't make much sense to look at in
> any form for direct I/O, but as said I'm rather confused by this code.

Correct, it's only interesting for pages under LRU management - page
cache and swap pages. It should not matter for direct IO.

The VM uses the page flag to distinguish cold faults (e.g. startup
with an empty cache) from thrashing, i.e. pages that are read back
shortly after they were reclaimed. This influences reclaim behavior,
but can also indicate a general lack of memory.

The BIO_WORKINGSET flag is for the latter. To calculate the time
wasted by a lack of memory (memory pressure), we measure the total
time processes wait for thrashing pages. Usually that time is
dominated by waiting for in-flight IO to complete and pages to become
uptodate. These waits are annotated on the page cache side.

However, in some cases, the IO submission path itself can block for
extended periods - if the device is congested or submissions are
throttled due to cgroup policy. To capture those waits, the bio is
flagged when it's for thrashing pages, and then submit_bio() will
report submission time of that bio as a thrashing-related delay.
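
As a rough illustration of the accounting described above, here is a
user-space toy model, not the kernel code. The names sim_bio,
sim_stall_stats and sim_submit_bio() are made up for this sketch, and
the time spent blocked in submission is passed in as a parameter
rather than measured:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio; only the flag matters here. */
struct sim_bio {
	bool workingset;	/* models the BIO_WORKINGSET flag */
};

/* Hypothetical accumulator for time attributed to thrashing. */
struct sim_stall_stats {
	uint64_t thrash_delay_ns;
};

/*
 * Models the submission path: blocked_ns is how long the submission
 * stalled (device congestion, cgroup throttling). The stall is
 * charged as a thrashing-related delay only if the bio is flagged.
 */
static void sim_submit_bio(const struct sim_bio *bio, uint64_t blocked_ns,
			   struct sim_stall_stats *stats)
{
	assert(bio && stats);

	if (bio->workingset)
		stats->thrash_delay_ns += blocked_ns;
}
```

Submitting an unflagged bio leaves thrash_delay_ns untouched no matter
how long it blocks; a flagged one adds its full submission time.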

[ Obviously, in theory bios could have a mix of thrashing and
  non-thrashing pages, and the submission stall could have occurred
  even without the thrashing pages. But in practice we have locality,
  where groups of pages tend to be accessed/reclaimed/refaulted
  together. The assumption that the whole bio is due to thrashing when
  we see the first thrashing page is a workable simplification. ]
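
The simplification in the bracketed note can be sketched the same way.
Again this is a user-space toy with hypothetical names (flag_page,
flag_bio, flag_bio_add_page()), only loosely modelled on what was
described for __bio_add_page(): the flag is sticky, so the first
thrashing page marks the whole bio and later pages cannot clear it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; models PageWorkingset(page). */
struct flag_page {
	bool workingset;
};

/* Hypothetical stand-in for a bio being assembled. */
struct flag_bio {
	bool workingset;	/* models the BIO_WORKINGSET flag */
	size_t nr_pages;
};

/*
 * Add one page to the bio. Once any page is a thrashing page, the
 * bio stays flagged; the page state is not consulted again after
 * that, since the check short-circuits on the bio flag.
 */
static void flag_bio_add_page(struct flag_bio *bio,
			      const struct flag_page *page)
{
	assert(bio && page);

	if (!bio->workingset && page->workingset)
		bio->workingset = true;
	bio->nr_pages++;
}
```

A bio built from cold, thrashing, cold pages ends up flagged as a
whole, which is exactly the mixed case the note accepts as good enough.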

HTH