Thread (4 messages) 4 messages, 4 authors, 2021-06-14

Re: [PATCH v2] btrfs: sysfs: export dev stats in devinfo directory

From: David Sterba <hidden>
Date: 2021-06-14 17:22:34

On Mon, Jun 14, 2021 at 09:00:50PM +0800, Anand Jain wrote:
quoted
The stats are all in one file as it's the snapshot of all available
stats. The 'one value per file' is not very suitable here. The stats
should be valid right after the stats item is read from disk, shortly
after initializing the device.

In case the stats are not yet valid, print just 'invalid' as the file
contents.

Signed-off-by: David Sterba <dsterba@suse.com>
---

v2:
- print 'invalid' separtely and not among the values
- rename file name to 'error_stats' to leave 'stats' available for any
   other kind of stats we'd like in the future

  fs/btrfs/sysfs.c | 29 +++++++++++++++++++++++++++++
  1 file changed, 29 insertions(+)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 4b508938e728..ebde1d09e686 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1495,7 +1495,36 @@ static ssize_t btrfs_devinfo_writeable_show(struct kobject *kobj,
  }
  BTRFS_ATTR(devid, writeable, btrfs_devinfo_writeable_show);
  
+static ssize_t btrfs_devinfo_error_stats_show(struct kobject *kobj,
+		struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	if (!device->dev_stats_valid)
+		return scnprintf(buf, PAGE_SIZE, "invalid\n");
and

Nit:
Instead of invalid, IMO, not yet valid is closer to what this error 
implies. And matches with the ioctl Warning message.

7743         btrfs_warn(fs_info, "get dev_stats failed, not yet valid");
Yeah I was not sure about that, 'invalid' does not have the same
meaning.  I'd like something that's short and easily parseable so no
long sentences or multi-word value like 'not yet valid'.

It's trivial to change if somebody has a better idea or convince me with
some variation on not-yet-valid.
quoted
+
+	/*
+	 * Print all at once so we get a snapshot of all values from the same
+	 * time. Keep them in sync and in order of definition of
+	 * btrfs_dev_stat_values.
+	 */
+	return scnprintf(buf, PAGE_SIZE,
+		"write_errs %d\n"
+		"read_errs %d\n"
+		"flush_errs %d\n"
+		"corruption_errs %d\n"
+		"generation_errs %d\n",
+		btrfs_dev_stat_read(device, BTRFS_DEV_STAT_WRITE_ERRS),
+		btrfs_dev_stat_read(device, BTRFS_DEV_STAT_READ_ERRS),
+		btrfs_dev_stat_read(device, BTRFS_DEV_STAT_FLUSH_ERRS),
+		btrfs_dev_stat_read(device, BTRFS_DEV_STAT_CORRUPTION_ERRS),
+		btrfs_dev_stat_read(device, BTRFS_DEV_STAT_GENERATION_ERRS));
+}
+BTRFS_ATTR(devid, error_stats, btrfs_devinfo_error_stats_show);
+
  static struct attribute *devid_attrs[] = {
+	BTRFS_ATTR_PTR(devid, error_stats),
  	BTRFS_ATTR_PTR(devid, in_fs_metadata),
  	BTRFS_ATTR_PTR(devid, missing),
  	BTRFS_ATTR_PTR(devid, replace_target),

write_errs 0
read_errs 0
flush_errs 0
corruption_errs 1
generation_errs 0


Another option was to print all errors in a single line. A single line
will help continuous monitoring of the error_stats. But it is not a big
deal.
I think that's just a matter of transforming the data to the convenient
form. Line based stats values is more common and grep-friendly. For a
monitoring tool/scripts that want it in one line it's perhaps a simple

  cat devinfo/error_sats | tr '\n' ' '

Converting from one line to multiple lines is not that trivial.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help