Thread (3 messages), 2 authors, 2022-07-08

Re: [PATCH V3] swiotlb: Split up single swiotlb lock

From: Christoph Hellwig <hch@infradead.org>
Date: 2022-07-07 17:07:18
Also in: linux-doc, lkml
Subsystem: dma mapping helpers, the rest · Maintainers: Marek Szyprowski, Linus Torvalds

On Thu, Jul 07, 2022 at 04:24:36AM -0400, Tianyu Lan wrote:
From: Tianyu Lan <redacted>

Traditionally swiotlb was not performance critical because it was only
used for slow devices. But in some setups, like TDX/SEV confidential
guests, all IO has to go through swiotlb. Currently swiotlb only has a
single lock. Under high IO load with multiple CPUs this can lead to
significant lock contention on the swiotlb lock.

This patch splits the swiotlb bounce buffer pool into individual areas
which have their own lock. Each CPU tries to allocate in its own area
first. Only if that fails does it search other areas. On freeing, the
allocation is returned to the area the memory was originally allocated
from.
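
A rough user-space sketch of the scheme described above may help when
reviewing: each area has its own lock, an allocation starts in the
area selected by the CPU number and walks the other areas only when
that one is full, and a free goes back to the area that served the
allocation. Everything here (struct area, model_alloc(), model_free(),
NAREAS, SLOTS_PER_AREA, the atomic_flag spinlock) is a hypothetical
stand-in for illustration, not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define NAREAS		4	/* assumed to be a power of two */
#define SLOTS_PER_AREA	2	/* tiny on purpose, to force fallback */

/* Hypothetical model of one area: its own lock and its own usage count. */
struct area {
	atomic_flag lock;
	unsigned int used;
};

static struct area areas[NAREAS];

static void model_init(void)
{
	for (unsigned int i = 0; i < NAREAS; i++) {
		atomic_flag_clear(&areas[i].lock);
		areas[i].used = 0;
	}
}

static void area_lock(struct area *a)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
		;
}

static void area_unlock(struct area *a)
{
	atomic_flag_clear_explicit(&a->lock, memory_order_release);
}

/* Take one slot from a single area; returns 0 on success, -1 if full. */
static int area_try_alloc(struct area *a)
{
	int ret = -1;

	area_lock(a);
	if (a->used < SLOTS_PER_AREA) {
		a->used++;
		ret = 0;
	}
	area_unlock(a);
	return ret;
}

/* Returns the index of the area that served the request, or -1. */
static int model_alloc(unsigned int cpu)
{
	unsigned int start = cpu & (NAREAS - 1);
	unsigned int i = start;

	do {
		if (area_try_alloc(&areas[i]) == 0)
			return (int)i;
		i = (i + 1) & (NAREAS - 1);	/* wrap around to the next area */
	} while (i != start);

	return -1;	/* every area is full */
}

/* Return a slot to the area it came from, taking only that area's lock. */
static void model_free(int area_index)
{
	struct area *a;

	assert(area_index >= 0 && area_index < NAREAS);
	a = &areas[area_index];
	area_lock(a);
	assert(a->used > 0);
	a->used--;
	area_unlock(a);
}
```

In this model two CPUs that map to different areas never touch the
same lock unless one of them has exhausted its own area, which is the
contention reduction the description is after.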

The number of areas can be set via the swiotlb kernel parameter and
defaults to the number of possible CPUs. If the number of possible
CPUs is not a power of 2, the number of areas is rounded up to the
next power of 2.
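
As a small illustration of the rounding rule above, the following
user-space helper computes the area count for a given number of
possible CPUs. round_up_pow2() and nareas_for_cpus() are hypothetical
names; the kernel itself uses roundup_pow_of_two(), as the diff in
this thread shows. The overflow guard and the handling of 0 are
assumptions of this sketch only.

```c
#include <limits.h>

/*
 * User-space stand-in for rounding up to a power of two.
 * Returns 0 if the result would not fit in an unsigned long.
 * An input of 0 yields 1 in this sketch.
 */
static unsigned long round_up_pow2(unsigned long n)
{
	unsigned long p = 1;

	if (n > (ULONG_MAX >> 1) + 1)
		return 0;
	while (p < n)
		p <<= 1;
	return p;
}

/* Default area count for a machine with the given number of possible CPUs. */
static unsigned long nareas_for_cpus(unsigned int possible_cpus)
{
	return round_up_pow2(possible_cpus);
}
```

For example, a system with 6 possible CPUs would get 8 areas under
this rule, while one with 8 possible CPUs would keep 8.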

This idea comes from Andi Kleen's patch
(https://github.com/intel/tdx/commit/4529b5784c141782c72ec9bd9a92df2b68cb7d45).

Thanks, this looks much better.  I think there is a small problem
with how default_nareas is set - we need to use 0 as the default
so that an explicit command line value of 1 works.  Also, have you
checked the interaction with swiotlb_adjust_size in detail?

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 5536d2cd69d30..85b1c29dd0eb8 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -70,7 +70,7 @@ struct io_tlb_mem io_tlb_default_mem;
 phys_addr_t swiotlb_unencrypted_base;
 
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
-static unsigned long default_nareas = 1;
+static unsigned long default_nareas;
 
 /**
  * struct io_tlb_area - IO TLB memory area descriptor
@@ -90,7 +90,10 @@ struct io_tlb_area {
 
 static void swiotlb_adjust_nareas(unsigned int nareas)
 {
-	if (!is_power_of_2(nareas))
+	if (default_nareas)
+		return;
+
+	if (nareas > 1 && !is_power_of_2(nareas))
 		nareas = roundup_pow_of_two(nareas);
 
 	default_nareas = nareas;
@@ -338,8 +341,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	if (default_nareas == 1)
-		swiotlb_adjust_nareas(num_possible_cpus());
+	swiotlb_adjust_nareas(num_possible_cpus());
 
 	mem->areas = memblock_alloc(sizeof(struct io_tlb_area) *
 		default_nareas, SMP_CACHE_BYTES);
@@ -410,8 +412,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
 			(PAGE_SIZE << order) >> 20);
 	}
 
-	if (default_nareas == 1)
-		swiotlb_adjust_nareas(num_possible_cpus());
+	swiotlb_adjust_nareas(num_possible_cpus());
 
 	area_order = get_order(array_size(sizeof(*mem->areas),
 		default_nareas));
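
The effect of using 0 as the "not set" value can be checked with a
small user-space model of the logic in the diff above. The names
model_default_nareas, model_adjust_nareas(), is_pow2() and
model_roundup_pow2() are hypothetical, and the sketch assumes that the
command line parser has already stored any explicit value in the
variable before the init path runs.

```c
#include <stdbool.h>

/* 0 means "nothing was given on the command line". */
static unsigned long model_default_nareas;

static bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Simplified rounding helper; assumes n is small enough not to overflow. */
static unsigned long model_roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Same shape as swiotlb_adjust_nareas() after the change in the diff. */
static void model_adjust_nareas(unsigned long nareas)
{
	/* An explicit setting, including 1, is left alone. */
	if (model_default_nareas)
		return;

	if (nareas > 1 && !is_pow2(nareas))
		nareas = model_roundup_pow2(nareas);

	model_default_nareas = nareas;
}
```

With the old default of 1, an explicit value of 1 was indistinguishable
from "unset" and was overwritten by the CPU-based default. In this
model, starting from 1 and calling model_adjust_nareas(6) leaves the
value at 1, while starting from 0 yields 8.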