Thread: 101 messages, 12 authors, 2012-11-28

Re: [PATCH] x86/mm: Don't flush the TLB on #WP pmd fixups

From: Mel Gorman <mgorman@suse.de>
Date: 2012-11-21 11:47:35
Also in: lkml

On Tue, Nov 20, 2012 at 01:31:56PM +0100, Ingo Molnar wrote:
* Ingo Molnar [off-list ref] wrote:
* Ingo Molnar [off-list ref] wrote:
numa/core profile:

    95.66%  perf-1201.map     [.] 0x00007fe4ad1c8fc7                 
     1.70%  libjvm.so         [.] 0x0000000000381581                 
     0.59%  [vdso]            [.] 0x0000000000000607                 
     0.19%  [kernel]          [k] do_raw_spin_lock                   
     0.11%  [kernel]          [k] generic_smp_call_function_interrupt
     0.11%  [kernel]          [k] timekeeping_get_ns.constprop.7     
     0.08%  [kernel]          [k] ktime_get                          
     0.06%  [kernel]          [k] get_cycles                         
     0.05%  [kernel]          [k] __native_flush_tlb                 
     0.05%  [kernel]          [k] rep_nop                            
     0.04%  perf              [.] add_hist_entry.isra.9              
     0.04%  [kernel]          [k] rcu_check_callbacks                
     0.04%  [kernel]          [k] ktime_get_update_offsets           
     0.04%  libc-2.15.so      [.] __strcmp_sse2                      

No page fault overhead (see the page fault rate further below) 
- the NUMA scanning overhead shows up only through some mild 
TLB flush activity (which I'll fix btw).
The patch attached below should get rid of that mild TLB 
flushing activity as well.
This has further increased SPECjbb from 203k/sec to 207k/sec, 
i.e. it's now 5% faster than mainline - THP enabled.

The profile is now totally flat even during a full 32-WH SPECjbb 
run, with the highest overhead entries left all related to timer 
IRQ processing or profiling. That is on a system that should be 
very close to yours.

This is a stab in the dark, but are you always running with profiling enabled?

I have not checked this with perf but a number of years ago I found that
oprofile could distort results really badly (7-30% depending on the workload
at the time) when I was evaluating hugetlbfs and THP. In some cases I would
find that profiling would show that a patch series improved performance
when the same series showed regressions if profiling was disabled. The
sampling rate had to be reduced quite a bit to avoid this effect.
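
To put that concern in numbers, here is a minimal sketch (not anything
that was actually run for this thread) comparing the SPECjbb gain quoted
above with the 7-30% oprofile distortion range. The function names are
hypothetical, and it assumes the oprofile range is a fair proxy for what
perf might do, which has not been checked.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def gain_is_below_distortion(gain_pct: float, distortion_floor_pct: float) -> bool:
    """True if a measured gain is smaller than the smallest distortion seen."""
    return abs(gain_pct) < distortion_floor_pct


# SPECjbb throughput quoted in the thread, in thousands of ops/sec.
before_patch = 203.0
after_patch = 207.0
patch_gain = percent_change(before_patch, after_patch)

# Distortion range reported above for oprofile, in percent.
# Assumption: used here only as a rough yardstick for perf.
distortion_low, distortion_high = 7.0, 30.0

print(f"patch gain: {patch_gain:.2f}%")
print("below smallest observed distortion:",
      gain_is_below_distortion(patch_gain, distortion_low))
```

Both the roughly 2% gain from the patch and the quoted 5% lead over
mainline sit below the low end of that range, which is why repeating the
comparison with profiling disabled would be worthwhile.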

-- 
Mel Gorman
SUSE Labs
