Re: [PATCH v3 1/4] block: Add concurrent positioning ranges support
From: Christoph Hellwig <hch@infradead.org>
Date: 2021-08-10 08:24:06
Also in:
linux-block, linux-scsi
On Mon, Jul 26, 2021 at 10:38:03AM +0900, Damien Le Moal wrote:
quoted hunk ↗ jump to hunk
The Concurrent Positioning Ranges VPD page (for SCSI) and Log (for ATA) contain parameters describing the number of sets of contiguous LBAs that can be served independently by a single LUN multi-actuator disk. This patch provides the blk_queue_set_cranges() function allowing a device driver to signal to the block layer that a disk has multiple actuators, each one serving a contiguous range of sectors. To describe the set of sector ranges representing the different actuators of a device, the data type struct blk_cranges is introduced. For a device with multiple actuators, a struct blk_cranges is attached to the device request queue by the blk_queue_set_cranges() function. The function blk_alloc_cranges() is provided for drivers to allocate this structure. The blk_cranges structure contains kobjects (struct kobject) to register with sysfs the set of sector ranges defined by a device. On initial device scan, this registration is done from blk_register_queue() using the block layer internal function blk_register_cranges(). If a driver calls blk_queue_set_cranges() for a registered queue, e.g. when a device is revalidated, blk_queue_set_cranges() will execute blk_register_cranges() to update the queue sysfs attribute files. The sysfs file structure created starts from the cranges sub-directory and contains the start sector and number of sectors served by an actuator, with the information for each actuator grouped in one directory per actuator. E.g. for a dual actuator drive, we have: $ tree /sys/block/sdk/queue/cranges/ /sys/block/sdk/queue/cranges/ |-- 0 | |-- nr_sectors | `-- sector `-- 1 |-- nr_sectors `-- sector For a regular single actuator device, the cranges directory does not exist. Device revalidation may lead to changes to this structure and to the attribute values. When manipulated, the queue sysfs_lock and sysfs_dir_lock are held for atomicity, similarly to how the blk-mq and elevator sysfs queue sub-directories are protected. The code related to the management of cranges is added in the new file block/blk-cranges.c. Signed-off-by: Damien Le Moal <redacted> --- block/Makefile | 2 +- block/blk-cranges.c | 295 +++++++++++++++++++++++++++++++++++++++++ block/blk-sysfs.c | 13 ++ block/blk.h | 3 + include/linux/blkdev.h | 29 ++++ 5 files changed, 341 insertions(+), 1 deletion(-) create mode 100644 block/blk-cranges.cdiff --git a/block/Makefile b/block/Makefile index bfbe4e13ca1e..e477e6ca9ea6 100644 --- a/block/Makefile +++ b/block/Makefile@@ -9,7 +9,7 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-sysfs.o \ blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \ blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \ genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \ - disk-events.o + disk-events.o blk-cranges.o obj-$(CONFIG_BOUNCE) += bounce.o obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.odiff --git a/block/blk-cranges.c b/block/blk-cranges.c new file mode 100644 index 000000000000..deaa09e564f7 --- /dev/null +++ b/block/blk-cranges.c@@ -0,0 +1,295 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Block device concurrent positioning ranges. + * + * Copyright (C) 2021 Western Digital Corporation or its Affiliates. + */ +#include <linux/kernel.h> +#include <linux/blkdev.h> +#include <linux/slab.h> +#include <linux/init.h> + +#include "blk.h" + +static ssize_t blk_crange_sector_show(struct blk_crange *cr, char *page) +{ + return sprintf(page, "%llu\n", cr->sector); +} + +static ssize_t blk_crange_nr_sectors_show(struct blk_crange *cr, char *page) +{ + return sprintf(page, "%llu\n", cr->nr_sectors); +} + +struct blk_crange_sysfs_entry { + struct attribute attr; + ssize_t (*show)(struct blk_crange *cr, char *page); +}; + +static struct blk_crange_sysfs_entry blk_crange_sector_entry = { + .attr = { .name = "sector", .mode = 0444 }, + .show = blk_crange_sector_show, +}; + +static struct blk_crange_sysfs_entry blk_crange_nr_sectors_entry = { + .attr = { .name = "nr_sectors", .mode = 0444 }, + .show = blk_crange_nr_sectors_show, +}; + +static struct attribute *blk_crange_attrs[] = { + &blk_crange_sector_entry.attr, + &blk_crange_nr_sectors_entry.attr, + NULL, +}; +ATTRIBUTE_GROUPS(blk_crange); + +static ssize_t blk_crange_sysfs_show(struct kobject *kobj, + struct attribute *attr, char *page) +{ + struct blk_crange_sysfs_entry *entry; + struct blk_crange *cr; + struct request_queue *q; + ssize_t ret; + + entry = container_of(attr, struct blk_crange_sysfs_entry, attr); + cr = container_of(kobj, struct blk_crange, kobj); + q = cr->queue;
Nit: I'd find this a little eaier to read if the variables were initialized at declaration time: struct blk_crange_sysfs_entry *entry = container_of(attr, struct blk_crange_sysfs_entry, attr); struct blk_crange *cr = container_of(kobj, struct blk_crange, kobj); struct request_queue *q = cr->queue; (or maybe even don't bother with the q local variable).
+/*
+ * Dummy release function to make kobj happy.
+ */
+static void blk_cranges_sysfs_nop_release(struct kobject *kobj)
+{
+}How do we ensure the kobj is still alive while it is accessed?
+void blk_queue_set_cranges(struct gendisk *disk, struct blk_cranges *cr)
s/blk_queue/disk/
mutex_lock(&q->sysfs_lock);
+
+ ret = blk_register_cranges(disk, NULL);
+ if (ret) {
+ mutex_unlock(&q->sysfs_lock);
+ mutex_unlock(&q->sysfs_dir_lock);
+ kobject_del(&q->kobj);
+ blk_trace_remove_sysfs(dev);
+ kobject_put(&dev->kobj);
+ return ret;
+ }
+
if (q->elevator) {
ret = elv_register_queue(q, false);
if (ret) {
+ blk_unregister_cranges(disk);
mutex_unlock(&q->sysfs_lock);
mutex_unlock(&q->sysfs_dir_lock);
kobject_del(&q->kobj);Please use a goto instead of duplicating the code.