Thread (28 messages) 28 messages, 7 authors, 2018-12-05

Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support

From: Stefan Hajnoczi <stefanha@redhat.com>
Date: 2018-06-12 16:05:43

On Mon, Jun 11, 2018 at 03:37:00AM +0000, Liu, Changpeng wrote:
quoted
-----Original Message-----
From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
Sent: Friday, June 8, 2018 6:20 PM
To: Liu, Changpeng <redacted>
Cc: virtualization@lists.linux-foundation.org; cavery@redhat.com;
jasowang@redhat.com; pbonzini@redhat.com; Wang, Wei W
[off-list ref]
Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands
support

On Thu, Jun 07, 2018 at 11:07:06PM +0000, Liu, Changpeng wrote:
quoted
quoted
-----Original Message-----
From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
Sent: Thursday, June 7, 2018 9:10 PM
To: Liu, Changpeng <redacted>
Cc: virtualization@lists.linux-foundation.org; cavery@redhat.com;
jasowang@redhat.com; pbonzini@redhat.com; Wang, Wei W
[off-list ref]
Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands
support

On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote:
quoted
Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands
support, this will impact the performance when using SSD backend over
file systems.

Commit 88c85538 "virtio-blk: add discard and write zeroes features to
specification"(see https://github.com/oasis-tcs/virtio-spec) extended
existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES
commands support.

While here, using 16 bytes descriptor to describe one segment of DISCARD
or WRITE ZEROES commands, each command may contain one or more
decriptors.
quoted
The following data structure shows the definition of one descriptor:

struct virtio_blk_discard_write_zeroes {
        le64 sector;
        le32 num_sectors;
        le32 unmap;
};

Field 'sector' means the start sector for DISCARD and WRITE ZEROES,
filed 'num_sectors' means the number of sectors for DISCARD and WRITE
ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap'
maybe used for WRITE ZEROES command with DISCARD enabled.

We also extended the virtio-blk configuration space to let backend
device put DISCARD and WRITE ZEROES configuration parameters.

struct virtio_blk_config {
        [...]

        le32 max_discard_sectors;
        le32 max_discard_seg;
        le32 discard_sector_alignment;
        le32 max_write_zeroes_sectors;
        le32 max_write_zeroes_seg;
        u8 write_zeroes_may_unmap;
}

New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard
command, maximum discard sectors size in field 'max_discard_sectors' and
maximum discard segment number in field 'max_discard_seg'.

New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write
zeroes command, maximum write zeroes sectors size in field
'max_write_zeroes_sectors' and maximum write zeroes segment number in
field 'max_write_zeroes_seg'.

The parameters in the configuration space of the device field
'max_discard_sectors' and field 'discard_sector_alignment' are expressed in
512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The
field 'max_write_zeroes_sectors' is expressed in 512-byte units if the
VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated.

Signed-off-by: Changpeng Liu <redacted>
---
CHANGELOG:
v6: don't set T_OUT bit to discard and write zeroes commands.
I don't see this in the patch...
Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUT
again.
quoted
quoted
quoted
@@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(struct
blk_mq_hw_ctx *hctx,
quoted
 	int qid = hctx->queue_num;
 	int err;
 	bool notify = false;
+	bool unmap = false;
 	u32 type;

 	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
@@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(struct
blk_mq_hw_ctx *hctx,
quoted
 	case REQ_OP_FLUSH:
 		type = VIRTIO_BLK_T_FLUSH;
 		break;
+	case REQ_OP_DISCARD:
+		type = VIRTIO_BLK_T_DISCARD;
+		break;
+	case REQ_OP_WRITE_ZEROES:
+		type = VIRTIO_BLK_T_WRITE_ZEROES;
+		unmap = !(req->cmd_flags & REQ_NOUNMAP);
+		break;
 	case REQ_OP_SCSI_IN:
 	case REQ_OP_SCSI_OUT:
 		type = VIRTIO_BLK_T_SCSI_CMD;
@@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(struct
blk_mq_hw_ctx *hctx,
quoted
 	blk_mq_start_request(req);

+	if (type == VIRTIO_BLK_T_DISCARD || type ==
VIRTIO_BLK_T_WRITE_ZEROES) {
quoted
+		err = virtblk_setup_discard_write_zeroes(req, unmap);
+		if (err)
+			return BLK_STS_RESOURCE;
+	}
+
 	num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
 	if (num) {
 		if (rq_data_dir(req) == WRITE)
...since we still do blk_rq_map_sg() here and num should be != 0.
No, while here, we should keep the original logic for READ/WRITE commands.
My question is: why does the changelog say "don't set T_OUT" but the
code *will* set it because blk_rq_map_sg() returns != 0 and
rq_data_dir(req) == WRITE?
Since the last bit of DISCARD/WRITE ZEROES commands are already 1, so I said we don't need to set
T_OUT bit to DISCARD/WRITE ZEROES commands again. But the original logic for WRITE, T_OUT is still
needed, so just keep the original code here is fine.
Okay, I understand what you meant now.  Thanks!

Stefan
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help