Re: Prezeroing V2 [0/3]: Why and When it works

From: Andrew Morton <hidden>
Date: 2004-12-23 21:33:33
Also in: lkml

Paul Mackerras wrote:

> Christoph Lameter writes:
>
> > The most expensive operation in the page fault handler is (apart from
> > SMP locking overhead) the zeroing of the page.
>
> Re-reading this I see that you mean the zeroing of the page that is
> mapped into the process address space, not the page table pages.  So
> ignore my previous reply.
>
> Do you have any statistics on how often a page fault needs to supply a
> page of zeroes versus supplying a copy of an existing page, for real
> applications?

When the workload is a gcc run, the pagefault handler dominates the system
time.  That's the page zeroing.

> In any case, unless you have magic page-zeroing hardware, I am still
> inclined to think that zeroing the page at the time of the fault is
> the most efficient, since that means the page will be hot in the cache
> for the process to use.  If you zero it earlier using CPU stores, it
> can only cause more overall memory traffic, as far as I can see.

x86's non-temporal store instructions (movnti, movntq, movntdq) provide a
way of initialising memory without trashing the caches, and they have
pretty good bandwidth, I believe.  We should wire that up to these patches
and see if it speeds things up.
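
For illustration only, here is a minimal user-space sketch of zeroing one
page with non-temporal stores (the movnt* family, here movntdq via the SSE2
intrinsic _mm_stream_si128).  It is not the code from these patches: the
function name zero_page_nontemporal and the SKETCH_PAGE_SIZE constant are
made up for this example, a 4kB page is assumed, and a real kernel
clear_page() variant would be per-arch assembly and would have to deal with
FPU/SSE state, which this sketch ignores.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_setzero_si128, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* hypothetical name; assumes a 4kB page */

/* Zero one page using non-temporal (cache-bypassing) 16-byte stores. */
static void zero_page_nontemporal(void *page)
{
    /* movntdq faults on a destination that is not 16-byte aligned. */
    assert(((uintptr_t)page & 15u) == 0);

    __m128i zero = _mm_setzero_si128();
    __m128i *dst = (__m128i *)page;

    for (size_t i = 0; i < SKETCH_PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(&dst[i], zero);

    /* Non-temporal stores are weakly ordered: fence before anyone else
     * is allowed to look at the page. */
    _mm_sfence();
}
```

A caller would hand it a page-aligned buffer.  Whether this beats ordinary
stores depends on whether the page is touched again soon: bypassing the
cache helps when zeroing ahead of time, and works against the "hot in the
cache" argument when zeroing at fault time.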

> I did some measurements once on my G5 powermac (running a ppc64 linux
> kernel) of how long clear_page takes, and it only takes 96ns for a 4kB
> page.

40GB/s.  Is that straight into L1 or does the measurement include writeback?
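
The 40GB/s figure is a rounding of the numbers quoted above: 4096 bytes in
96ns is roughly 42.7 bytes per nanosecond, i.e. about 42.7GB/s with
1GB = 10^9 bytes.  A small sketch of that arithmetic, with a helper name
(clear_bandwidth_gb_per_s) invented for this example:

```c
#include <assert.h>

/* Bandwidth in GB/s, taking 1 GB = 1e9 bytes.  Bytes per nanosecond is
 * numerically the same quantity, so no scaling factor is needed. */
static double clear_bandwidth_gb_per_s(double bytes, double nanoseconds)
{
    assert(nanoseconds > 0.0);
    return bytes / nanoseconds;
}
```

Calling clear_bandwidth_gb_per_s(4096.0, 96.0) gives approximately 42.67.
That is well above what main memory on a machine of that era could plausibly
sustain, which is why the question about L1 versus writeback matters.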

