[PATCH v3] proc_pid_io.5: dewafflify
From: наб <hidden>
Date: 2024-03-17 13:31:23
Subsystem: the rest · Maintainer: Linus Torvalds
This page copies verbatim the contents of
Documentation/filesystems/proc.rst, added wholesale in
commit f9c99463b0cd05603d125c915e2886d55a686b82 ("[PATCH] Documentation
for io-accounting / reporting via procfs") in 2007.
As such, it mirrors the sensibilities of the time ‒
writing "successful read returns" as "data pulled from storage. actually
just the data the process gave to read(). this also means from non-regular
files! whether the data was pulled from storage doesn't matter actually
(obligatory cache mention)"
for the modern reader this is just a lot of waffling
(note also that processes give no data to read()!)
‒ and sensibilities of the sheepish implementer in kernel documentation ‒
"an attempt" for a well-defined kernel behaviour, mentioning the
"current implementation", consistent mentions of specific kernel-internal
caching mechanisms, "the big inaccuracy here".
Re-write to be more useful and less misleading as documentation;
the syscall enumeration is accurate for kernel v6.8, but the sysc? stats
are also bumped by kernel_{read,write}(), which are sometimes used by too
many syscalls in too many scenarios to usefully enumerate.
Signed-off-by: Ahelenia Ziemiańska <redacted>
---
Hi!
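Not part of the patch: a minimal sketch for checking the rchar/syscr wording against a running kernel. It assumes Linux with /proc/<pid>/io available (task I/O accounting enabled); parse_proc_io() and read_self_io() are hypothetical helper names used only for illustration.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not 'key: value'
            fields[key.strip()] = int(value)
    return fields


def read_self_io():
    """Snapshot the I/O counters of the calling process (assumes Linux procfs)."""
    with open("/proc/self/io") as f:
        return parse_proc_io(f.read())


if __name__ == "__main__":
    before = read_self_io()

    # One unbuffered read(2) of 4096 bytes from a non-regular file:
    # no storage is involved, but rchar and syscr should still move.
    fd = os.open("/dev/zero", os.O_RDONLY)
    n = len(os.read(fd, 4096))
    os.close(fd)

    after = read_self_io()

    # The deltas also include the reads of /proc/self/io itself,
    # so they are lower bounds rather than exact values.
    print("read returned:", n)
    print("rchar delta:  ", after["rchar"] - before["rchar"])
    print("syscr delta:  ", after["syscr"] - before["syscr"])
    print("read_bytes delta:", after["read_bytes"] - before["read_bytes"])
```

Since taking the snapshots is itself done with read(2), the rchar delta should come out as at least 4096 and the syscr delta as at least 1, not exactly those values.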
On Sun, Mar 17, 2024 at 01:15:18PM +0100, Alejandro Colomar wrote:
> On Sun, Mar 17, 2024 at 12:01:41PM +0100, наб wrote:
> > -The number of bytes
> > -which this task and its waited-for children
> > -have caused to be read from storage.
> > -This is simply the sum of bytes which this process passed to
> > +The number of bytes returned by successful
> In this case, I think I prefer to break before "returned".
> What would you do?
this is a "meh" moment imo; in running text sure, maybe,
but this is broken up by the .BR so it starts to devolve into
3-word-line territory which is worse

> > -Attempt to count the number of read I/O operations\[em]that is,
> > -system calls such as
> > +The number of "I/O read" system calls\[em]those from the
> From I/O, read only accounts for the I. :)
> Should we just say "read"?
Yeah, "I/O" is overloaded here; "file read" works better.

> >  .BR read (2)
> > +family
> > +(including when invoked used by the kernel as part of other syscalls),
> This parenthesis being there seems to imply that if the kernel calls
> sendfile internally for $reasons (even if it doesn't at the moment),
> it won't be counted. I think it makes more sense at the end of the
> list, no?
Well, as-is it doesn't, and I reduced this to the narrowest definition
I can prove, but I guess so, yes. Also just noticed "invoked used".

 man5/proc_pid_io.5 | 67 +++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 37 deletions(-)

diff --git a/man5/proc_pid_io.5 b/man5/proc_pid_io.5
index dc75a91de..7f840f3bb 100644
--- a/man5/proc_pid_io.5
+++ b/man5/proc_pid_io.5
@@ -33,63 +33,56 @@ .SH DESCRIPTION
 .TP
 .IR rchar ": characters read"
 The number of bytes
-which this task and its waited-for children
-have caused to be read from storage.
-This is simply the sum of bytes which this process passed to
+returned by successful
 .BR read (2)
 and similar system calls.
-It includes things such as terminal I/O and
-is unaffected by whether or not actual
-physical disk I/O was required (the read might have been satisfied from
-pagecache).
 .TP
 .IR wchar ": characters written"
 The number of bytes
-which this task and its waited-for children
-have caused, or shall cause to be written to disk.
-Similar caveats apply here as with
-.IR rchar .
+returned by successful
+.BR write (2)
+and similar system calls.
 .TP
 .IR syscr ": read syscalls"
-Attempt to count the number of read I/O operations\[em]that is,
-system calls such as
+The number of "file read" system calls\[em]those from the
 .BR read (2)
+family
+.BR sendfile (2),
+.BR copy_file_range (2),
 and
-.BR pread (2).
+.BR ioctl (2)
+.BR BTRFS_IOC_ENCODED_READ [ _32 ]
+(including when invoked by the kernel as part of other syscalls),
 .TP
 .IR syscw ": write syscalls"
-Attempt to count the number of write I/O operations\[em]that is,
-system calls such as
+The number of "file write" system calls\[em]those from the
 .BR write (2)
+family
+.BR sendfile (2),
+.BR copy_file_range (2),
 and
-.BR pwrite (2).
+.BR ioctl (2)
+.BR BTRFS_IOC_ENCODED_WRITE [ _32 ]
+(including when invoked by the kernel as part of other syscalls),
 .TP
 .IR read_bytes ": bytes read"
-Attempt to count the number of bytes
-which this process and its waited-for children
-really did cause to be fetched from the storage layer.
+The number of bytes really fetched from the storage layer.
 This is accurate for block-backed filesystems.
 .TP
 .IR write_bytes ": bytes written"
-Attempt to count the number of bytes
-which this process and its waited-for children
-caused to be sent to the storage layer.
+The number of bytes really sent to the storage layer.
 .TP
 .IR cancelled_write_bytes :
-The big inaccuracy here is truncate.
-If a process writes 1 MB to a file and then deletes the file,
-it will in fact perform no writeout.
-But it will have been accounted as having caused 1 MB of write.
-In other words:
-this field represents the number of bytes
-which this process and its waited-for children
-caused to not happen, by truncating pagecache.
-A task can cause "negative" I/O too.
-If this task truncates some dirty pagecache,
-some I/O which another task has been accounted for
-(in its
-.IR write_bytes )
-will not be happening.
+The above statistics fail to account for truncation:
+if a process writes 1 MB to a regular file and then removes it,
+said 1 MB will not be written, but
+.I will
+have nevertheless been accounted as a 1 MB write.
+This field represents the number of bytes "saved" from I/O writeback.
+This can yield to having done negative I/O
+if caches dirtied by another process are truncated.
+This figure applies to I/O already accounted-for by
+.IR write_bytes .
 .RE
 .IP
 .IR Note :
--
2.39.2