Thread (26 messages) 26 messages, 6 authors, 2021-03-31

RE: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock

From: Song Bao Hua (Barry Song) <hidden>
Date: 2021-03-30 08:17:50
Also in: lkml

-----Original Message-----
From: Muchun Song [mailto:songmuchun@bytedance.com]
Sent: Tuesday, March 30, 2021 9:09 PM
To: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <redacted>; Linux Memory Management List
[off-list ref]; LKML [off-list ref]; Roman Gushchin
[off-list ref]; Shakeel Butt [off-list ref]; Oscar Salvador
[off-list ref]; David Hildenbrand [off-list ref]; David Rientjes
[off-list ref]; linmiaohe [off-list ref]; Peter Zijlstra
[off-list ref]; Matthew Wilcox [off-list ref]; HORIGUCHI NAOYA
[off-list ref]; Aneesh Kumar K . V [off-list ref];
Waiman Long [off-list ref]; Peter Xu [off-list ref]; Mina Almasry
[off-list ref]; Hillf Danton [off-list ref]; Joonsoo Kim
[off-list ref]; Song Bao Hua (Barry Song)
[off-list ref]; Will Deacon [off-list ref]; Andrew Morton
[off-list ref]
Subject: Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe
spinlock

On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko [off-list ref] wrote:
quoted
On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
quoted
Ideally, cma_release could be called from any context.  However,
that is not possible because a mutex is used to protect the per-area bitmap.
Change the bitmap to an irq safe spinlock.
I would phrase the changelog slightly differerent "
cma_release is currently a sleepable operatation because the bitmap
manipulation is protected by cma->lock mutex. Hugetlb code which
relies on cma_release for CMA backed (giga) hugetlb pages, however,
needs to be irq safe.

The lock doesn't protect any sleepable operation so it can be changed
to a (irq aware) spin lock. The bitmap processing should be quite fast
in typical case but if cma sizes grow to TB then we will likely need
to replace the lock by a more optimized bitmap implementation.
"

it seems that you are overusing irqsave variants even from context
which are never called from the IRQ context so they do not need storing flags.

[...]
quoted
@@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma)
      unsigned long start = 0;
      unsigned long nr_part, nr_total = 0;
      unsigned long nbits = cma_bitmap_maxno(cma);
+     unsigned long flags;

-     mutex_lock(&cma->lock);
+     spin_lock_irqsave(&cma->lock, flags);
spin_lock_irq should be sufficient. This is only called from the
allocation context and that is never called from IRQ context.
This makes me think more. I think that spin_lock should be sufficient. Right?
It seems Mike's point is that cma_release might be called from both
irq context and process context.

If it is running in process context, we need the irq-disable to lock
the irq context which might jump to call cma_release at the same time.

We have never seen cma_release has been really called in irq context
by now, anyway.
quoted
quoted
      pr_info("number of available pages: ");
      for (;;) {
              next_zero_bit = find_next_zero_bit(cma->bitmap, nbits,
start); @@ -407,7 +408,7 @@ static void cma_debug_show_areas(struct cma
*cma)
quoted
quoted
              start = next_zero_bit + nr_zero;
      }
      pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
-     mutex_unlock(&cma->lock);
+     spin_unlock_irqrestore(&cma->lock, flags);
 }
 #else
 static inline void cma_debug_show_areas(struct cma *cma) { } @@
-430,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, size_t count,
unsigned int align,
quoted
quoted
      unsigned long pfn = -1;
      unsigned long start = 0;
      unsigned long bitmap_maxno, bitmap_no, bitmap_count;
+     unsigned long flags;
      size_t i;
      struct page *page = NULL;
      int ret = -ENOMEM;
@@ -454,12 +456,12 @@ struct page *cma_alloc(struct cma *cma, size_t count,
unsigned int align,
quoted
quoted
              goto out;

      for (;;) {
-             mutex_lock(&cma->lock);
+             spin_lock_irqsave(&cma->lock, flags);
              bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
                              bitmap_maxno, start, bitmap_count, mask,
                              offset);
              if (bitmap_no >= bitmap_maxno) {
-                     mutex_unlock(&cma->lock);
+                     spin_unlock_irqrestore(&cma->lock, flags);
                      break;
              }
              bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
same here.
quoted
@@ -468,7 +470,7 @@ struct page *cma_alloc(struct cma *cma, size_t count,
unsigned int align,
quoted
quoted
               * our exclusive use. If the migration fails we will take the
               * lock again and unmark it.
               */
-             mutex_unlock(&cma->lock);
+             spin_unlock_irqrestore(&cma->lock, flags);

              pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
              ret = alloc_contig_range(pfn, pfn + count,
MIGRATE_CMA, diff --git a/mm/cma.h b/mm/cma.h index
68ffad4e430d..2c775877eae2 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -15,7 +15,7 @@ struct cma {
      unsigned long   count;
      unsigned long   *bitmap;
      unsigned int order_per_bit; /* Order of pages represented by one bit
*/
quoted
quoted
-     struct mutex    lock;
+     spinlock_t      lock;
 #ifdef CONFIG_CMA_DEBUGFS
      struct hlist_head mem_head;
      spinlock_t mem_head_lock;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c index
d5bf8aa34fdc..6379cfbfd568 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -35,11 +35,12 @@ static int cma_used_get(void *data, u64 *val)  {
      struct cma *cma = data;
      unsigned long used;
+     unsigned long flags;

-     mutex_lock(&cma->lock);
+     spin_lock_irqsave(&cma->lock, flags);
      /* pages counter is smaller than sizeof(int) */
      used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma));
-     mutex_unlock(&cma->lock);
+     spin_unlock_irqrestore(&cma->lock, flags);
      *val = (u64)used << cma->order_per_bit;
same here
quoted
      return 0;
@@ -52,8 +53,9 @@ static int cma_maxchunk_get(void *data, u64 *val)
      unsigned long maxchunk = 0;
      unsigned long start, end = 0;
      unsigned long bitmap_maxno = cma_bitmap_maxno(cma);
+     unsigned long flags;

-     mutex_lock(&cma->lock);
+     spin_lock_irqsave(&cma->lock, flags);
      for (;;) {
              start = find_next_zero_bit(cma->bitmap, bitmap_maxno, end);
              if (start >= bitmap_maxno)
@@ -61,7 +63,7 @@ static int cma_maxchunk_get(void *data, u64 *val)
              end = find_next_bit(cma->bitmap, bitmap_maxno, start);
              maxchunk = max(end - start, maxchunk);
      }
-     mutex_unlock(&cma->lock);
+     spin_unlock_irqrestore(&cma->lock, flags);
      *val = (u64)maxchunk << cma->order_per_bit;

      return 0;
and here.
--
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help