Re: [PATCH 2/2] hugepage: Allow parallelization of the hugepage fault path
From: Eric B Munson <hidden>
Date: 2011-01-25 19:44:23
On Tue, 25 Jan 2011, Anton Blanchard wrote:
From: David Gibson <redacted>

At present, the page fault path for hugepages is serialized by a
single mutex. This is used to avoid spurious out-of-memory conditions
when the hugepage pool is fully utilized (two processes or threads can
race to instantiate the same mapping with the last hugepage from the
pool, the race loser returning VM_FAULT_OOM). This problem is
specific to hugepages, because it is normal to want to use every
single hugepage in the system - with normal pages we simply assume
there will always be a few spare pages which can be used temporarily
until the race is resolved.

Unfortunately this serialization also means that clearing of hugepages
cannot be parallelized across multiple CPUs, which can lead to very
long process startup times when using large numbers of hugepages.

This patch improves the situation by replacing the single mutex with a
table of mutexes, selected based on a hash of the address_space and
file offset being faulted (or mm and virtual address for MAP_PRIVATE
mappings).

From: Anton Blanchard <redacted>

Forward ported and made a few changes:

- Use the Jenkins hash to scatter the hash, better than using just the
  low bits.

- Always round num_fault_mutexes to a power of two to avoid an
  expensive modulus in the hash calculation.

I also tested this patch on a 64 thread POWER6 box using a simple
parallel fault testcase:

http://ozlabs.org/~anton/junkcode/parallel_fault.c

Command line options:

parallel_fault <nr_threads> <size in kB> <skip in kB>

First the time taken to fault 48GB of 16MB hugepages:

# time hugectl --heap ./parallel_fault 1 50331648 16384
11.1 seconds

Now the same test with 64 concurrent threads:

# time hugectl --heap ./parallel_fault 64 50331648 16384
8.8 seconds

Hardly any speedup. Finally the 64 concurrent threads test with
this patch applied:

# time hugectl --heap ./parallel_fault 64 50331648 16384
0.7 seconds

We go from 8.8 seconds to 0.7 seconds, an improvement of 12.6x.
Signed-off-by: David Gibson <redacted>
Signed-off-by: Anton Blanchard <redacted>
Reviewed-by: Eric B Munson <redacted>
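
For readers following along without the diff, the mutex-table idea in the
description above can be illustrated with a small userspace sketch. This is
not the kernel code from the patch: the names fault_mutex_table,
fault_mutex_table_init, fault_mutex_hash, round_up_pow2 and
MAX_FAULT_MUTEXES are hypothetical, pthread mutexes stand in for kernel
mutexes, and a generic 64-bit mixing function stands in for the Jenkins hash
that the patch uses. It only shows the two points from the changelog: the
key (object pointer plus index) is scattered by a hash, and the table size
is rounded to a power of two so the slot is picked with a mask instead of a
modulus.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound for this sketch; must itself be a power of two. */
#define MAX_FAULT_MUTEXES 256

static pthread_mutex_t fault_mutex_table[MAX_FAULT_MUTEXES];
static unsigned int num_fault_mutexes;

/* Round n up to the next power of two (n >= 1). */
static unsigned int round_up_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Size the table from a CPU count, always ending up with a power of two. */
static int fault_mutex_table_init(unsigned int nr_cpus)
{
	unsigned int i, n = round_up_pow2(nr_cpus ? nr_cpus : 1);

	if (n > MAX_FAULT_MUTEXES)
		n = MAX_FAULT_MUTEXES;
	for (i = 0; i < n; i++)
		if (pthread_mutex_init(&fault_mutex_table[i], NULL) != 0)
			return -1;
	num_fault_mutexes = n;
	return 0;
}

/*
 * Stand-in 64-bit mixer (the MurmurHash3 finalizer constants), used here
 * only so the sketch is self-contained. The patch itself uses the kernel's
 * Jenkins hash.
 */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	x *= 0xc4ceb9fe1a85ec53ULL;
	x ^= x >> 33;
	return x;
}

/*
 * Pick a table slot for a fault. key_obj plays the role of the
 * address_space (or the mm for MAP_PRIVATE), key_idx the file offset
 * (or virtual address). The mask works because the size is a power of two.
 */
static unsigned int fault_mutex_hash(const void *key_obj, uint64_t key_idx)
{
	uint64_t h = mix64((uint64_t)(uintptr_t)key_obj ^ mix64(key_idx));

	return (unsigned int)(h & (num_fault_mutexes - 1));
}
```

A fault handler built on this would lock fault_mutex_table[fault_mutex_hash(obj, idx)]
around the instantiation and unlock it afterwards. Two faults on the same
page always pick the same mutex, so the last-hugepage race stays serialized,
while faults on different pages usually land on different mutexes and can
clear their pages in parallel.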