Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: 2018-06-12 16:05:43
On Mon, Jun 11, 2018 at 03:37:00AM +0000, Liu, Changpeng wrote:
quoted
-----Original Message----- From: Stefan Hajnoczi [mailto:stefanha@redhat.com] Sent: Friday, June 8, 2018 6:20 PM To: Liu, Changpeng <redacted> Cc: virtualization@lists.linux-foundation.org; cavery@redhat.com; jasowang@redhat.com; pbonzini@redhat.com; Wang, Wei W [off-list ref] Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support On Thu, Jun 07, 2018 at 11:07:06PM +0000, Liu, Changpeng wrote:quoted
quoted
-----Original Message----- From: Stefan Hajnoczi [mailto:stefanha@redhat.com] Sent: Thursday, June 7, 2018 9:10 PM To: Liu, Changpeng <redacted> Cc: virtualization@lists.linux-foundation.org; cavery@redhat.com; jasowang@redhat.com; pbonzini@redhat.com; Wang, Wei W [off-list ref] Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote:quoted
Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands support, this will impact the performance when using SSD backend over file systems. Commit 88c85538 "virtio-blk: add discard and write zeroes features to specification"(see https://github.com/oasis-tcs/virtio-spec) extended existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES commands support. While here, using 16 bytes descriptor to describe one segment of DISCARD or WRITE ZEROES commands, each command may contain one or moredecriptors.quoted
The following data structure shows the definition of one descriptor: struct virtio_blk_discard_write_zeroes { le64 sector; le32 num_sectors; le32 unmap; }; Field 'sector' means the start sector for DISCARD and WRITE ZEROES, filed 'num_sectors' means the number of sectors for DISCARD and WRITE ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap' maybe used for WRITE ZEROES command with DISCARD enabled. We also extended the virtio-blk configuration space to let backend device put DISCARD and WRITE ZEROES configuration parameters. struct virtio_blk_config { [...] le32 max_discard_sectors; le32 max_discard_seg; le32 discard_sector_alignment; le32 max_write_zeroes_sectors; le32 max_write_zeroes_seg; u8 write_zeroes_may_unmap; } New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard command, maximum discard sectors size in field 'max_discard_sectors' and maximum discard segment number in field 'max_discard_seg'. New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write zeroes command, maximum write zeroes sectors size in field 'max_write_zeroes_sectors' and maximum write zeroes segment number in field 'max_write_zeroes_seg'. The parameters in the configuration space of the device field 'max_discard_sectors' and field 'discard_sector_alignment' are expressed in 512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The field 'max_write_zeroes_sectors' is expressed in 512-byte units if the VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated. Signed-off-by: Changpeng Liu <redacted> --- CHANGELOG: v6: don't set T_OUT bit to discard and write zeroes commands.I don't see this in the patch...Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUTagain.quoted
quoted
quoted
@@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(structblk_mq_hw_ctx *hctx,quoted
int qid = hctx->queue_num; int err; bool notify = false; + bool unmap = false; u32 type; BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);@@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(structblk_mq_hw_ctx *hctx,quoted
case REQ_OP_FLUSH: type = VIRTIO_BLK_T_FLUSH; break; + case REQ_OP_DISCARD: + type = VIRTIO_BLK_T_DISCARD; + break; + case REQ_OP_WRITE_ZEROES: + type = VIRTIO_BLK_T_WRITE_ZEROES; + unmap = !(req->cmd_flags & REQ_NOUNMAP); + break; case REQ_OP_SCSI_IN: case REQ_OP_SCSI_OUT: type = VIRTIO_BLK_T_SCSI_CMD;@@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(structblk_mq_hw_ctx *hctx,quoted
blk_mq_start_request(req); + if (type == VIRTIO_BLK_T_DISCARD || type ==VIRTIO_BLK_T_WRITE_ZEROES) {quoted
+ err = virtblk_setup_discard_write_zeroes(req, unmap); + if (err) + return BLK_STS_RESOURCE; + } + num = blk_rq_map_sg(hctx->queue, req, vbr->sg); if (num) { if (rq_data_dir(req) == WRITE)...since we still do blk_rq_map_sg() here and num should be != 0.No, while here, we should keep the original logic for READ/WRITE commands.My question is: why does the changelog say "don't set T_OUT" but the code *will* set it because blk_rq_map_sg() returns != 0 and rq_data_dir(req) == WRITE?Since the last bit of DISCARD/WRITE ZEROES commands are already 1, so I said we don't need to set T_OUT bit to DISCARD/WRITE ZEROES commands again. But the original logic for WRITE, T_OUT is still needed, so just keep the original code here is fine.
Okay, I understand what you meant now. Thanks! Stefan