Thread (50 messages), 5 authors, 2012-12-18

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

From: Gleb Natapov <hidden>
Date: 2012-11-27 15:45:02
Also in: lkml

On Tue, Nov 27, 2012 at 03:38:14PM +0100, Frederic Weisbecker wrote:
> 2012/11/27 Li Zhong [off-list ref]:
> > I noticed some warnings complaining about dynticks_nesting value, like
> >
> > [  267.545032] ------------[ cut here ]------------
> > [  267.545032] WARNING: at kernel/rcutree.c:382 rcu_eqs_enter+0xab/0xc0()
> > [  267.545032] Hardware name: Bochs
> > [  267.545032] Modules linked in:
> > [  267.545032] Pid: 0, comm: swapper/2 Not tainted 3.7.0-rc5-next-20121115 #8
> > [  267.545032] Call Trace:
> > [  267.545032]  [<ffffffff8104714f>] warn_slowpath_common+0x7f/0xc0
> > [  267.545032]  [<ffffffff810471aa>] warn_slowpath_null+0x1a/0x20
> > [  267.545032]  [<ffffffff810e607b>] rcu_eqs_enter+0xab/0xc0
> > [  267.545032]  [<ffffffff810e60bb>] rcu_idle_enter+0x2b/0x70
> > [  267.545032]  [<ffffffff8100d44f>] cpu_idle+0x6f/0x100
> > [  267.545032]  [<ffffffff814bf055>] start_secondary+0x205/0x20c
> > [  267.545032] ---[ end trace 924ae80da035028d ]---
> >
> > After enabling rcu-dyntick tracing, I got following abnormal
> > dynticks_nesting values (13fffffffffffff, ff00000000000001,etc):
> >                         ...
> >  1      <idle>-0     [002] dN.2 18739.518567: rcu_dyntick: End 0 140000000000000                rcu_idle_exit
> >  2        sshd-696   [002] d..1 18739.518675: rcu_dyntick: ++= 140000000000000 140000000000001  rcu_irq_enter   - apf (not present)
>
> How did that happen? When I look at do_async_page_fault(),
> KVM_PV_REASON_PAGE_NOT_PRESENT doesn't do rcu_irq_enter().
>
> >  3      <idle>-0     [002] d..2 18739.518705: rcu_dyntick: Start 140000000000001 0              rcu_idle_enter
> >  4      <idle>-0     [002] d..2 18739.521252: rcu_dyntick: End 0 1                              rcu_irq_enter   - apf (page ready)
> >  5      <idle>-0     [002] dN.2 18739.521261: rcu_dyntick: Start 1 0                            rcu_irq_exit    - apf (page ready)
> >  6      <idle>-0     [002] dN.2 18739.521263: rcu_dyntick: End 0 140000000000000                rcu_idle_exit
> >
> >  7        sshd-696   [002] d..1 18739.521299: rcu_dyntick: --= 140000000000000 13fffffffffffff  rcu_irq_exit    - apf (not present)
>
> I'm confused for the same reason here.
>
> >  8        sshd-696   [002] d..1 18739.521302: rcu_dyntick: Start 13fffffffffffff 0              rcu_user_enter
> >  9        sshd-696   [002] d..1 18739.521330: rcu_dyntick: End 0 1                              rcu_irq_enter   - apf (not present)
>
> Same. Now we certainly need to add some rcu_user_exit() on
> do_async_page_fault(). Although I'm not quite sure when this function
> is called. Is it an exception or an irq?

For KVM_PV_REASON_PAGE_NOT_PRESENT it behaves like an exception.

--
			Gleb.
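
Editorial illustration, not part of Gleb's message: the arithmetic behind
trace lines 1-7 can be replayed with a toy model of the nesting counter.
This is only a sketch under stated assumptions. The model_*() helpers and
TASK_EXIT_IDLE are hypothetical names, not the kernel's functions; the
constant is simply the value the trace shows after rcu_idle_exit, and the
idle helpers are reduced to plain assignments, which is far simpler than
what kernel/rcutree.c really does.

```c
#include <stdio.h>

/* Value the trace reports after rcu_idle_exit ("End 0 140000000000000"). */
#define TASK_EXIT_IDLE 0x140000000000000LL

/* Toy stand-in for the per-CPU dynticks_nesting counter. */
static long long nesting;

/* Hypothetical, heavily simplified models of the traced transitions. */
static void model_idle_enter(void) { nesting = 0; }
static void model_idle_exit(void)  { nesting = TASK_EXIT_IDLE; }
static void model_irq_enter(void)  { nesting++; }
static void model_irq_exit(void)   { nesting--; }

/* Print one step in roughly the same "old new" shape as the trace. */
static void step(const char *what, void (*fn)(void))
{
    long long old = nesting;

    fn();
    printf("%-12s %llx -> %llx\n", what,
           (unsigned long long)old, (unsigned long long)nesting);
}

/* Replay trace lines 1-7 in order and return the final counter value. */
static long long replay_trace(void)
{
    nesting = 0;
    step("idle_exit",  model_idle_exit);   /* 1: idle                    */
    step("irq_enter",  model_irq_enter);   /* 2: sshd, apf (not present) */
    step("idle_enter", model_idle_enter);  /* 3: idle, counter reset     */
    step("irq_enter",  model_irq_enter);   /* 4: idle, apf (page ready)  */
    step("irq_exit",   model_irq_exit);    /* 5: idle, apf (page ready)  */
    step("idle_exit",  model_idle_exit);   /* 6: idle                    */
    step("irq_exit",   model_irq_exit);    /* 7: sshd, apf (not present) */
    return nesting;
}
```

Calling replay_trace() from a small main() shows the pattern in the trace:
the increment from step 2 is discarded when step 3 resets the counter, so
the matching decrement in step 7 lands on a fresh 140000000000000 and
leaves 13fffffffffffff.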