Thread (12 messages) 12 messages, 2 authors, 2021-04-19

Re: [PATCH 3/3] powerpc/mm/hash: Avoid multiple HPT resize-downs on memory hotunplug

From: Leonardo Bras <hidden>
Date: 2021-04-09 03:31:35
Also in: lkml

Hello David, thanks for commenting.

On Tue, 2021-03-23 at 10:45 +1100, David Gibson wrote:
quoted
@@ -805,6 +808,10 @@ static int resize_hpt_for_hotplug(unsigned long new_mem_size, bool shrinking)
 	if (shrinking) {

+		/* When batch removing entries, only resizes HPT at the end. */
+		if (atomic_read_acquire(&hpt_resize_disable))
+			return 0;
+
I'm not quite convinced by this locking.  Couldn't hpt_resize_disable
be set after this point, but while you're still inside
resize_hpt_for_hotplug()?  Probably better to use an explicit mutex
(and mutex_trylock()) to make the critical sections clearer.
Sure, I can do that for v2.
Except... do we even need the fancy mechanics to suppress the resizes
in one place to do them elswhere.  Couldn't we just replace the
existing resize calls with the batched ones?
How do you think of having batched resizes-down in HPT? 
Other than the current approach, I could only think of a way that would
touch a lot of generic code, and/or duplicate some functions, as
dlpar_add_lmb() does a lot of other stuff.
quoted
+void hash_memory_batch_shrink_end(void)
+{
+	unsigned long newsize;
+
+	/* Re-enables HPT resize-down after hot-unplug */
+	atomic_set_release(&hpt_resize_disable, 0);
+
+	newsize = memblock_phys_mem_size();
+	/* Resize to smallest SHIFT possible */
+	while (resize_hpt_for_hotplug(newsize, true) == -ENOSPC) {
+		newsize *= 2;
As noted earlier, doing this without an explicit cap on the new hpt
size (of the existing size) this makes me nervous. 
I can add a stop in v2.
 Less so, but doing
the calculations on memory size, rather than explictly on HPT size /
HPT order also seems kinda clunky.
Agree, but at this point, it would seem kind of a waste to find the
shift from newsize, then calculate (1 << shift) for each retry of
resize_hpt_for_hotplug() only to point that we are retrying the order
value.

But sure, if you think it looks better, I can change that. 
quoted
+void memory_batch_shrink_begin(void)
+{
+	if (!radix_enabled())
+		hash_memory_batch_shrink_begin();
+}
+
+void memory_batch_shrink_end(void)
+{
+	if (!radix_enabled())
+		hash_memory_batch_shrink_end();
+}
Again, these wrappers don't seem particularly useful to me.
Options would be add 'if (!radix_enabled())' to hotplug-memory.c
functions or to hash* functions, which look kind of wrong.
quoted
+	memory_batch_shrink_end();
remove_by_index only removes a single LMB, so there's no real point to
batching here.
Sure, will be fixed for v2.
quoted
@@ -700,6 +712,7 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add)
 	if (lmbs_added != lmbs_to_add) {
 		pr_err("Memory hot-add failed, removing any added LMBs\n");

+		memory_batch_shrink_begin();

The effect of these on the memory grow path is far from clear.
On hotplug, HPT is resized-up before adding LMBs.
On hotunplug, HPT is resized-down after removing LMBs.
And each one has it's own mechanism to batch HPT resizes...

I can't understand exactly how using it on hotplug fail path can be any
different than using it on hotunplug.
Can you please help me understanding this?

Best regards,
Leonardo Bras
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help