Thread: 41 messages, 10 authors, 2005-09-17

Re: clear_user_highpage()

From: Linus Torvalds <torvalds@osdl.org>
Date: 2004-08-12 02:18:27


On Wed, 11 Aug 2004, William Lee Irwin III wrote:
> Results from prototype prezeroing patches (ca. 2001) showed that
> dedicating a cpu on a 16x machine to prezeroing userspace pages (doing
> no other work on that cpu) improved kernel compile (insert sound of
> projectile vomiting here) "benchmarks". This suggests cache pollution
> and scheduling latency can be circumvented under some circumstances.

Heh.

And at what point does it become a problem? Caches are growing, and at some
point it is going to be a loss to zero memory on another CPU.

I really do believe (but can't back it up with any real numbers) that we 
want to try to keep pages in cache as long as possible. That means keeping 
the pages close to the last CPU that used them, btw.

It would be interesting to see if we could make the buddy allocator more
"per-cpu" friendly, for example - I suspect that would make much _more_ of
a difference than pre-zeroing pages. 

As it is, the pages we allocate have _no_ CPU affinity (unlike 
kmalloc/slab), and as a result they aren't even very likely to be in the 
cache even if you have tons of cache on the CPU. 

And my whole argument against pre-zeroing really falls totally flat if the 
pages aren't in the cache. 

So I'd personally be a whole lot more interested in seeing whether we 
could have per-CPU pages than in pre-zeroing. 

Fragmentation of memory is the _big_ problem, of course. It comes up
almost for _any_ page allocation issue. But it might be interesting to see 
if we could have a special per-cpu "page pool" for some usage. Sized 
fairly small - on the order of a few times the CPU cache size - and used 
for anonymous pages that we think might be short-lived.

		Linus
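
[Illustration, not part of the original message.] The per-CPU "page pool"
idea above can be sketched in user-space C roughly as follows. This is a
minimal sketch under stated assumptions, not kernel code: every name in it
(cpu_page_pool, pool_page_alloc, pool_page_free, the SKETCH_* constants) is
hypothetical, aligned_alloc()/free() stand in for the global buddy allocator,
and the pool capacity is an arbitrary small number rather than "a few times
the CPU cache size". Locking and preemption are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE  4096u  /* assumed page size */
#define SKETCH_NR_CPUS    4      /* assumed CPU count */
#define SKETCH_POOL_PAGES 8      /* hypothetical per-CPU pool capacity */

/* One small LIFO stack of free pages per CPU (hypothetical structure). */
struct cpu_page_pool {
    void  *pages[SKETCH_POOL_PAGES];
    size_t count;
};

static struct cpu_page_pool pools[SKETCH_NR_CPUS];

/* Stand-ins for the global allocator; a real kernel would use the buddy
 * allocator here. */
static void *global_page_alloc(void)
{
    return aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
}

static void global_page_free(void *page)
{
    free(page);
}

/* Allocate a page for `cpu`, preferring the page this CPU freed most
 * recently, since that one is the most likely to still be in its cache. */
void *pool_page_alloc(unsigned cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    struct cpu_page_pool *pool = &pools[cpu];

    if (pool->count > 0)
        return pool->pages[--pool->count];
    return global_page_alloc();   /* pool empty: fall back */
}

/* Free a page on `cpu`: keep it in the local pool if there is room,
 * otherwise hand it back to the global allocator. */
void pool_page_free(unsigned cpu, void *page)
{
    assert(cpu < SKETCH_NR_CPUS);
    struct cpu_page_pool *pool = &pools[cpu];

    if (pool->count < SKETCH_POOL_PAGES) {
        pool->pages[pool->count++] = page;
        return;
    }
    global_page_free(page);
}
```

The LIFO order is the point of the sketch: a short-lived anonymous page that
is freed and then reallocated on the same CPU comes straight back from that
CPU's pool. Keeping the pool small bounds how much memory each CPU can hold
back from the global allocator, which is one way to limit the fragmentation
concern raised above.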