Re: [PATCH 0/5] x86: Implement support for unaccepted memory
From: Kirill A. Shutemov <hidden>
Date: 2021-08-10 17:31:56
Also in:
linux-mm, lkml
On Tue, Aug 10, 2021 at 08:51:01AM -0700, Dave Hansen wrote:
> In other words, I buy the boot speed argument. But, I don't buy the
> "this saves memory long term" argument at all.

Okay, that's fair enough. I guess there are *some* workloads that may have
their memory footprint reduced, but I agree it's a minority.

> > > I had expected this series, but I also expected it to be connected to
> > > CONFIG_DEFERRED_STRUCT_PAGE_INIT somehow. Could you explain a bit how
> > > this problem is different and demands a totally orthogonal solution?
> > >
> > > For instance, what prevents us from declaring: "Memory is accepted at
> > > the time that its 'struct page' is initialized" ? Then, we use all
> > > the infrastructure we already have for DEFERRED_STRUCT_PAGE_INIT.
> >
> > That was my first thought too and I tried it just to realize that it is
> > not what we want. If we would accept page on page struct init it means
> > we would make host allocate all memory assigned to the guest on boot
> > even if guest actually use small portion of it.
> >
> > Also deferred page init only allows to scale memory accept across
> > multiple CPUs, but doesn't allow to get to userspace before we done
> > with it. See wait_for_completion(&pgdat_init_all_done_comp).
>
> That's good information. It's a refinement of the "I want to boot
> faster" requirement. What you want is not just going _faster_, but
> being able to run userspace before full acceptance has completed.
>
> Would you be able to quantify how fast TDX page acceptance is? Are we
> talking about MB/s, GB/s, TB/s? This series is rather bereft of numbers
> for a feature which making a performance claim.
>
> Let's say we have a 128GB VM. How much does faster does this approach
> reach userspace than if all memory was accepted up front? How much
> memory _could_ have been accepted at the point userspace starts running?

Acceptance code is not optimized yet: we accept memory in 4k chunks, which
is very slow because hypercall overhead dominates the picture.

As of now, kernel boot of a 1 VCPU, 64TiB VM with upfront memory accept is
>20 times slower than with this lazy memory accept approach. The difference
is going to be substantially lower once we get it optimized properly.

-- 
Kirill A. Shutemov
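
For illustration only, the difference between upfront and lazy accept
discussed above can be sketched as a toy userspace model. This is not the
code from the series: the 2 MiB tracking unit, the 64-unit guest size, and
every name below (accept_unit, accept_all_upfront, accept_range_lazy,
boot_reset) are hypothetical, and accept_unit() only counts calls instead
of issuing a real accept request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNIT_SHIFT 21	/* hypothetical 2 MiB tracking unit */
#define NR_UNITS   64	/* toy guest: 64 units = 128 MiB */

static uint8_t unaccepted[NR_UNITS];	/* 1 = unit not accepted yet */
static unsigned long accept_calls;	/* number of simulated accepts */

/* Stand-in for the expensive accept operation on one unit. */
static void accept_unit(size_t unit)
{
	assert(unit < NR_UNITS);
	unaccepted[unit] = 0;
	accept_calls++;
}

/* Model the state at boot: nothing has been accepted. */
static void boot_reset(void)
{
	memset(unaccepted, 1, sizeof(unaccepted));
	accept_calls = 0;
}

/* Upfront: accept every unit before anything else may run. */
static void accept_all_upfront(void)
{
	for (size_t i = 0; i < NR_UNITS; i++)
		if (unaccepted[i])
			accept_unit(i);
}

/* Lazy: accept only the units backing [start, start + len) on first use. */
static void accept_range_lazy(uint64_t start, uint64_t len)
{
	if (len == 0)
		return;

	size_t first = (size_t)(start >> UNIT_SHIFT);
	size_t last = (size_t)((start + len - 1) >> UNIT_SHIFT);

	for (size_t i = first; i <= last; i++)
		if (unaccepted[i])
			accept_unit(i);
}
```

In this model the upfront path pays for all 64 units before userspace could
start, while a 3 MiB allocation on the lazy path pays for only the two
units it touches, and repeated allocations from the same units cost nothing
extra.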