Thread (49 messages) 49 messages, 6 authors, 2021-08-19

Re: [PATCH 2/5] efi/x86: Implement support for unaccepted memory

From: Dave Hansen <hidden>
Date: 2021-08-10 18:01:36
Also in: linux-mm, lkml

...
+void mark_unaccepted(struct boot_params *params, u64 start, u64 num)
+{
Some of these interfaces like accept_memory() take a start/end physical
address.  Having this take a "num pages" is bound to cause confusion.
Could you make these all consistently take start/end physical addresses?
+	u64 end = start + num * PAGE_SIZE;
+	unsigned int npages;

Could you comment those, please?

	/*
	 * The accepted memory bitmap only works at PMD_SIZE
	 * granularity.  If a request comes in to mark memory
	 * as unaccepted which is not PMD_SIZE-aligned, simply
	 * accept the memory now since it can not be *marked* as
	 * unaccepted.
	 */

Then go on to comment the three cases:

	/* Check for ranges which do not span a whole PMD_SIZE area: */
+	if ((start & PMD_MASK) == (end & PMD_MASK)) {
+		npages = (end - start) / PAGE_SIZE;
+		__accept_memory(start, start + npages * PAGE_SIZE);
+		return;
+	}
Hmm, is it possible to have this case hit, but neither of the two below
cases?  This seems to be looking for a case where the range is somehow
entirely contained in one PMD_SIZE area, but where it doesn't consume a
whole area.

Wouldn't that mean that 'start' or 'end' must be unaligned?

+	if (start & ~PMD_MASK) {
+		npages = (round_up(start, PMD_SIZE) - start) / PAGE_SIZE;
+		__accept_memory(start, start + npages * PAGE_SIZE);
+		start = round_up(start, PMD_SIZE);
+	}
+
+	if (end & ~PMD_MASK) {
+		npages = (end - round_down(end, PMD_SIZE)) / PAGE_SIZE;
+		end = round_down(end, PMD_SIZE);
+		__accept_memory(end, end + npages * PAGE_SIZE);
+	}
+	npages = (end - start) / PMD_SIZE;
+	bitmap_set((unsigned long *)params->unaccepted_memory,
+		   start / PMD_SIZE, npages);
+}
Even though it's changed right there, it's a bit cruel to change the
units of 'npages' right in the middle of a function.  It's just asking
for bugs.

It would only take a single extra variable declaration to make this
unambiguous:

	u64 nr_unaccepted_bits;

or something, then you can do:

	nr_unaccepted_bits = (end - start) / PMD_SIZE;
	bitmap_set((unsigned long *)params->unaccepted_memory,
		   start / PMD_SIZE, nr_unaccepted_bits);

...
 static efi_status_t allocate_e820(struct boot_params *params,
+				  struct efi_boot_memmap *map,
 				  struct setup_data **e820ext,
 				  u32 *e820ext_size)
 {
-	unsigned long map_size, desc_size, map_key;
 	efi_status_t status;
-	__u32 nr_desc, desc_version;
-
-	/* Only need the size of the mem map and size of each mem descriptor */
-	map_size = 0;
-	status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key,
-			     &desc_size, &desc_version);
-	if (status != EFI_BUFFER_TOO_SMALL)
-		return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED;
I noticed that there's no reference to EFI_BUFFER_TOO_SMALL in the hunks
you added back.  That makes me a bit nervous that this is going to
unintentionally change behavior.

It might be worth having a preparatory reorganization patch for
allocate_e820() before this new feature is added to make this more clear.
+	__u32 nr_desc;
+	bool unaccepted_memory_present = false;
+	u64 max_addr = 0;
+	int i;
 
-	nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS;
+	status = efi_get_memory_map(map);
+	if (status != EFI_SUCCESS)
+		return status;
 
-	if (nr_desc > ARRAY_SIZE(params->e820_table)) {
-		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
+	nr_desc = *map->map_size / *map->desc_size;
+	if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) {
+		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) -
+			EFI_MMAP_NR_SLACK_SLOTS;
 
 		status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
 		if (status != EFI_SUCCESS)
 			return status;
 	}
 
+	if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+		return EFI_SUCCESS;
+
+	/* Check if there's any unaccepted memory and find the max address */
+	for (i = 0; i < nr_desc; i++) {
+		efi_memory_desc_t *d;
+
+		d = efi_early_memdesc_ptr(*map->map, *map->desc_size, i);
+		if (d->type == EFI_UNACCEPTED_MEMORY)
+			unaccepted_memory_present = true;
+		if (d->phys_addr + d->num_pages * PAGE_SIZE > max_addr)
+			max_addr = d->phys_addr + d->num_pages * PAGE_SIZE;
+	}
This 'max_addr' variable looks a bit funky.

It *seems* like it's related only to EFI_UNACCEPTED_MEMORY, but it's not
underneath the EFI_UNACCEPTED_MEMORY check.  Is this somehow assuming
that once unaccepted memory as been found that *all* memory found in
later descriptors at higher addresses is also going to be unaccepted?
+	/*
+	 * If unaccepted memory present allocate a bitmap to track what memory
+	 * has to be accepted before access.
+	 *
+	 * One bit in the bitmap represents 2MiB in the address space: one 4k
+	 * page is enough to track 64GiB or physical address space.
+	 *
+	 * In the worst case scenario -- a huge hole in the middle of the
+	 * address space -- we would need 256MiB to handle 4PiB of the address
+	 * space.
+	 *
+	 * TODO: handle situation if params->unaccepted_memory has already set.
+	 * It's required to deal with kexec.
+	 */
+	if (unaccepted_memory_present) {
+		unsigned long *unaccepted_memory = NULL;
+		u64 size = DIV_ROUND_UP(max_addr, PMD_SIZE * BITS_PER_BYTE);
Oh, so the bitmap has to be present for *all* memory, not just
unaccepted memory.  So, we really do need to know the 'max_addr' so that
we can allocate the bitmap for so that can be marked in the bitmap has
having been accepted.
+		status = efi_allocate_pages(size,
+					    (unsigned long *)&unaccepted_memory,
+					    ULONG_MAX);
+		if (status != EFI_SUCCESS)
+			return status;
+		memset(unaccepted_memory, 0, size);
+		params->unaccepted_memory = (u64)unaccepted_memory;
+	}
It might be nice to refer to setup_e820() here to mention that it is the
thing that actually fills out the bitmap.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help