RE: [PATCH V3 13/13] HV/Storvsc: Add Isolation VM support for storvsc driver
From: Michael Kelley <hidden>
Date: 2021-08-20 16:10:04
Also in:
linux-arch, linux-iommu, linux-scsi, lkml, netdev, xen-devel
From: Tianyu Lan <redacted> Sent: Friday, August 20, 2021 8:20 AM
On 8/20/2021 2:17 AM, Michael Kelley wrote:quoted
From: Tianyu Lan <redacted> Sent: Monday, August 9, 2021 10:56 AM I'm not clear on why payload->range.offset needs to be set again. Even after the dma mapping is done, doesn't the offset in the first page have to be the same? If it wasn't the same, Hyper-V wouldn't be able to process the PFN list correctly. In fact, couldn't the above code just always set offset_in_hvpg = 0?The offset will be changed. The swiotlb bounce buffer is allocated with IO_TLB_SIZE(2K) as unit. So the offset here may be changed.
We need to prevent the offset from changing. The storvsc driver passes just a PFN list to Hyper-V, plus an overall starting offset and length. Unlike the netvsc driver, each entry in the PFN list does *not* have its own offset and length. Hyper-V assumes that the list is "dense" and that there are no holes (i.e., unused memory areas). For example, consider an original buffer passed into storvsc_queuecommand() of 8 Kbytes, but aligned with 1 Kbytes at the end of the first page, then 4 Kbytes in the second page, and 3 Kbytes in the beginning of the third page. The offset of that first 1 Kbytes has to remain as 3 Kbytes. If bounce buffering moves it to a different offset, there's no way to tell Hyper-V to ignore the remaining bytes in the first page (at least not without using a different method to communicate with Hyper-V). In such a case, the wrong data will get transferred. Presumably the easier solution is to set the min_align_mask field as Christop suggested.
quoted
quoted
}The whole approach here is to do dma remapping on each individual page of the I/O buffer. But wouldn't it be possible to use dma_map_sg() to map each scatterlist entry as a unit? Each scatterlist entry describes a range of physically contiguous memory. After dma_map_sg(), the resulting dma address must also refer to a physically contiguous range in the swiotlb bounce buffer memory. So at the top of the "for" loop over the scatterlist entries, do dma_map_sg() if we're in an isolated VM. Then compute the hvpfn value based on the dma address instead of sg_page(). But everything else is the same, and the inner loop for populating the pfn_arry is unmodified. Furthermore, the dma_range array that you've added is not needed, since scatterlist entries already have a dma_address field for saving the mapped address, and dma_unmap_sg() uses that field.I don't use dma_map_sg() here in order to avoid introducing one more loop(e,g dma_map_sg()). We already have a loop to populate cmd_request->dma_range[] and so do the dma map in the same loop.
I'm not seeing where the additional loop comes from. Storvsc already has a loop through the sgl entries. Retain that loop and call dma_map_sg() with nents set to 1. Then the sequence is dma_map_sg() --> dma_map_sg_attrs() --> dma_direct_map_sg() -> dma_direct_map_page(). The latter function will call swiotlb_map() to map all pages of the sgl entry as a single operation. Michael