Thread: 76 messages, 16 authors, 2005-02-09

Re: Prezeroing V2 [0/3]: Why and When it works

From: Andrew Morton <hidden>
Date: 2004-12-23 21:33:33
Also in: lkml

Paul Mackerras [off-list ref] wrote:
> Christoph Lameter writes:
>
> > The most expensive operation in the page fault handler is (apart from SMP
> > locking overhead) the zeroing of the page.
>
> Re-reading this I see that you mean the zeroing of the page that is
> mapped into the process address space, not the page table pages.  So
> ignore my previous reply.
>
> Do you have any statistics on how often a page fault needs to supply a
> page of zeroes versus supplying a copy of an existing page, for real
> applications?

When the workload is a gcc run, the pagefault handler dominates the system
time.  That's the page zeroing.

> In any case, unless you have magic page-zeroing hardware, I am still
> inclined to think that zeroing the page at the time of the fault is
> the most efficient, since that means the page will be hot in the cache
> for the process to use.  If you zero it earlier using CPU stores, it
> can only cause more overall memory traffic, as far as I can see.

x86's movnta instructions provide a way of initialising memory without
trashing the caches and it has pretty good bandwidth, I believe.  We should
wire that up to these patches and see if it speeds things up.
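
For illustration only, here is a minimal user-space sketch of zeroing a
page with cache-bypassing stores.  It assumes that "movnta" refers to the
x86 non-temporal store family (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS) and uses
the SSE2 intrinsic _mm_stream_si128, which compiles to MOVNTDQ.  The name
clear_page_nocache and the SKETCH_PAGE_SIZE constant are hypothetical and
are not taken from the prezeroing patches; on builds without SSE2 the
sketch falls back to a plain memset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_setzero_si128, _mm_stream_si128, _mm_sfence */
#endif

#define SKETCH_PAGE_SIZE 4096  /* assumed page size, hypothetical name */

/*
 * Zero SKETCH_PAGE_SIZE bytes starting at `page`.
 * `page` must be 16-byte aligned; a page-aligned address always is.
 * Hypothetical helper, not the kernel's clear_page().
 */
void clear_page_nocache(void *page)
{
#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)page;
    size_t i;

    /* Each streaming store writes 16 bytes without allocating a cache line. */
    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(p + i, zero);

    /* Order the weakly-ordered streaming stores before later stores. */
    _mm_sfence();
#else
    /* No SSE2 available: ordinary cached stores. */
    memset(page, 0, SKETCH_PAGE_SIZE);
#endif
}
```

Whether this wins anything depends on how soon the page is touched after
it is zeroed; the sketch only shows the store pattern, not a measurement.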

> I did some measurements once on my G5 powermac (running a ppc64 linux
> kernel) of how long clear_page takes, and it only takes 96ns for a 4kB
> page.

40GB/s.  Is that straight into L1 or does the measurement include writeback?
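
The 40GB/s figure follows from the quoted numbers: 4096 bytes in 96ns is
about 42.7 bytes per nanosecond, and one byte per nanosecond is 10^9
bytes per second.  A small sketch of that arithmetic, with hypothetical
function names, assuming a 4kB page means 4096 bytes:

```c
#include <stdio.h>

/*
 * One byte per nanosecond equals 10^9 bytes per second, so
 * bytes / nanoseconds is already the rate in decimal GB/s.
 */
double bandwidth_gb_per_s(double bytes, double nanoseconds)
{
    return bytes / nanoseconds;
}

/* Print the rate for the figures quoted above: 4096 bytes in 96 ns. */
void print_clear_page_rate(void)
{
    printf("%.1f GB/s\n", bandwidth_gb_per_s(4096.0, 96.0));
}
```

The calculation says nothing about where the stores land, which is why
the L1-versus-writeback question still matters.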
