Thread (1 message) 1 message, 1 author, 2017-10-11

Re: [PATCH v16 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

From: Wei Wang <hidden>
Date: 2017-10-11 03:16:55

On 10/11/2017 10:26 AM, Tetsuo Handa wrote:
Wei Wang wrote:
quoted
On 10/10/2017 09:09 PM, Tetsuo Handa wrote:
quoted
Wei Wang wrote:
quoted
quoted
And even if we could remove balloon_lock, you still cannot use
__GFP_DIRECT_RECLAIM at xb_set_page(). I think you will need to use
"whether it is safe to wait" flag from
"[PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()" .
Without the lock being held, why couldn't we use __GFP_DIRECT_RECLAIM at
xb_set_page()?
Because of dependency shown below.

leak_balloon()
   xb_set_page()
     xb_preload(GFP_KERNEL)
       kmalloc(GFP_KERNEL)
         __alloc_pages_may_oom()
           Takes oom_lock
           out_of_memory()
             blocking_notifier_call_chain()
               leak_balloon()
                 xb_set_page()
                   xb_preload(GFP_KERNEL)
                     kmalloc(GFP_KERNEL)
                       __alloc_pages_may_oom()
                         Fails to take oom_lock and loop forever
__alloc_pages_may_oom() uses mutex_trylock(&oom_lock).
Yes. But this mutex_trylock(&oom_lock) is semantically mutex_lock(&oom_lock)
because __alloc_pages_slowpath() will continue looping until
mutex_trylock(&oom_lock) succeeds (or somebody releases memory).
quoted
I think the second __alloc_pages_may_oom() will not continue since the
first one is in progress.
The second __alloc_pages_may_oom() will be called repeatedly because
__alloc_pages_slowpath() will continue looping (unless somebody releases
memory).
OK, I see, thanks. So, the point is that the OOM code path should not
have memory allocation, and the
old leak_balloon (without the F_SG feature) don't need xb_preload(). I
think one solution would be to let
the OOM uses the old leak_balloon() code path, and we can add one more
parameter to leak_balloon
to control that:

leak_balloon(struct virtio_balloon *vb, size_t num, bool oom)


quoted
quoted
By the way, is xb_set_page() safe?
Sleeping in the kernel with preemption disabled is a bug, isn't it?
__radix_tree_preload() returns 0 with preemption disabled upon success.
xb_preload() disables preemption if __radix_tree_preload() fails.
Then, kmalloc() is called with preemption disabled, isn't it?
But xb_set_page() calls xb_preload(GFP_KERNEL) which might sleep with
preemption disabled.
Yes, I think that should not be expected, thanks.

I plan to change it like this:

bool xb_preload(gfp_t gfp)
{
        if (!this_cpu_read(ida_bitmap)) {
                struct ida_bitmap *bitmap = kmalloc(sizeof(*bitmap), gfp);

                if (!bitmap)
                        return false;
                bitmap = this_cpu_cmpxchg(ida_bitmap, NULL, bitmap);
                kfree(bitmap);
        }
Excuse me, but you are allocating per-CPU memory when running CPU might
change at this line? What happens if running CPU has changed at this line?
Will it work even with new CPU's ida_bitmap == NULL ?

Yes, it will be detected in xb_set_bit(): when ida_bitmap = NULL on the
new CPU, xb_set_bit() will
return -EAGAIN to the caller, and the caller should restart from
xb_preload().

Best,
Wei
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help