Re: [PATCH v2] kernel/resource: Fix locking in request_free_mem_region
From: Alistair Popple <apopple@nvidia.com>
Date: 2021-03-31 06:22:01
Also in: lkml
On Tuesday, 30 March 2021 8:13:32 PM AEDT David Hildenbrand wrote:
> On 29.03.21 03:37, Alistair Popple wrote:
>> On Friday, 26 March 2021 7:57:51 PM AEDT David Hildenbrand wrote:
>>> On 26.03.21 02:20, Alistair Popple wrote:
>>>> request_free_mem_region() is used to find an empty range of physical
>>>> addresses for hotplugging ZONE_DEVICE memory. It does this by iterating
>>>> over the range of possible addresses using region_intersects() to see if
>>>> the range is free.
>>>
>>> Just a high-level question: how does this interact with memory
>>> hot(un)plug? IOW, who defines and manages the "range of possible
>>> addresses"?
>>
>> Both the driver and the maximum physical address bits available define
>> the range of possible addresses for device private memory. From
>> __request_free_mem_region():
>>
>>         end = min_t(unsigned long, base->end, (1UL << MAX_PHYSMEM_BITS) - 1);
>>         addr = end - size + 1UL;
>>
>> There is no lower address range bound here so it is effectively zero. The
>> code will try to allocate the highest possible physical address first and
>> continue searching down for a free block. Does that answer your question?
>
> Oh, sorry, the first time I had a look I got it wrong - I thought
> (1UL << MAX_PHYSMEM_BITS) would be the lower address limit.
>
> That looks indeed problematic to me. You might end up reserving an iomem
> region that could be used e.g., by memory hotplug code later. If someone
> plugs a DIMM or adds memory via different approaches (virtio-mem), memory
> hotplug (via add_memory()) would fail.
>
> You never should be touching physical memory area reserved for memory
> hotplug, i.e., via SRAT.
>
> What is the expectation here?

Most drivers call request_free_mem_region() with iomem_resource as the base.
So zone device private pages currently tend to get allocated from the top of
that.

By definition ZONE_DEVICE private pages are unaddressable from the CPU. So in
terms of expectation I think all that is really required for ZONE_DEVICE
private pages (at least for Nouveau) is a valid range of physical addresses
that allow page_to_pfn() and pfn_to_page() to work correctly. To make this
work drivers add the pages via memremap_pages() -> pagemap_range() ->
add_pages().

 - Alistair

> --
> Thanks,
>
> David / dhildenb
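
For illustration, the top-down search discussed in this thread can be sketched in userspace C. This is only a simplified model under stated assumptions, not the code in kernel/resource.c: struct busy_range, range_is_busy() and find_free_top_down() are hypothetical names, a flat array stands in for the iomem resource tree, and range_is_busy() plays the role that region_intersects() plays in the kernel. The clamp to (1 << physmem_bits) - 1 and the starting address end - size + 1 follow the two lines quoted from __request_free_mem_region() above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one busy entry in the iomem resource tree. */
struct busy_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Hypothetical stand-in for region_intersects(): returns nonzero if
 * [addr, addr + size - 1] overlaps any busy range.
 */
static int range_is_busy(const struct busy_range *busy, size_t n,
			 uint64_t addr, uint64_t size)
{
	uint64_t last = addr + size - 1;
	size_t i;

	for (i = 0; i < n; i++)
		if (addr <= busy[i].end && last >= busy[i].start)
			return 1;
	return 0;
}

/*
 * Search downwards from the highest candidate address in steps of 'size'.
 * There is no lower bound other than address zero.
 * Returns 0 and stores the start address in *out, or -1 if nothing is free.
 */
static int find_free_top_down(const struct busy_range *busy, size_t n,
			      uint64_t base_end, unsigned int physmem_bits,
			      uint64_t size, uint64_t *out)
{
	uint64_t limit, end, addr;

	/* Assumption of this sketch: keep the shift below well defined. */
	assert(physmem_bits > 0 && physmem_bits < 64);

	limit = (UINT64_C(1) << physmem_bits) - 1;
	end = base_end < limit ? base_end : limit;

	if (size == 0 || size - 1 > end)
		return -1;

	addr = end - size + 1;
	for (;;) {
		if (!range_is_busy(busy, n, addr, size)) {
			*out = addr;
			return 0;
		}
		if (addr < size)
			return -1;	/* the next step would wrap below zero */
		addr -= size;
	}
}
```

With a 16-bit address limit, a 0x1000-byte request and the top 0x1000 bytes already busy, the sketch returns the next block down. Nothing in the search knows whether that block was meant for later memory hotplug, which is the concern raised above.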