Thread: 19 messages, 8 authors, 2008-06-27

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

From: Gunnar von Boehn <hidden>
Date: 2008-06-27 13:30:58

Hi Paul,

> In my experience, dcbz slows down the hot-cache case because it adds a
> few cycles to the execution time of the inner loop, and on most 64-bit
> PowerPC implementations, it doesn't actually help even in the
> cold-cache case because the store queue does enough write combining
> that the cache doesn't end up reading the line from memory.

I agree with you that on POWER the dcbz is probably not helping.

On PowerPC my experience is different.
From what I have seen, DCBZ helps enormously on the 970, PA-Semi and Cell.

Cheers
Gunnar



                                                                           
From: Paul Mackerras <paulus@samba.org>
Date: 24/06/2008 01:49
To: Gunnar von Boehn/Germany/Contr/IBM@IBMDE
cc: sanjay3000@yahoo.com, Mark Nelson [off-list ref],
    linuxppc-dev@ozlabs.org, Michael Ellerman [off-list ref],
    cbe-oss-dev@ozlabs.org, Arnd Bergmann [off-list ref]
Subject: Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

Gunnar von Boehn writes:
> Interesting points.
> Can you help me to understand where the negative effect of DCBZ does come
> from?

In my experience, dcbz slows down the hot-cache case because it adds a
few cycles to the execution time of the inner loop, and on most 64-bit
PowerPC implementations, it doesn't actually help even in the
cold-cache case because the store queue does enough write combining
that the cache doesn't end up reading the line from memory.  I don't
know whether the Cell PPE can do that, but I could believe that it
can't.

Paul.
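
For readers following the thread, the trade-off discussed above can be
sketched roughly as below. This is only an illustration, not the code from
the patch: the names copy_lines, zero_line_hint and LINE_SIZE are
hypothetical, and the dcbz block size is assumed to be 128 bytes (it varies
by implementation and would have to be queried on a real system). The dcbz
is one extra instruction per line in the inner loop, which is the hot-cache
cost Paul describes; whether it pays off in the cold-cache case depends on
whether the store queue already avoids reading the destination line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed dcbz block size in bytes; implementation-specific in reality. */
#define LINE_SIZE 128

/*
 * Establish a zeroed destination line in the cache without reading it
 * from memory. On non-PowerPC builds this is a no-op, so the sketch
 * still compiles and behaves like a plain copy.
 */
static inline void zero_line_hint(void *p)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("dcbz 0,%0" : : "r"(p) : "memory");
#else
    (void)p;
#endif
}

/*
 * Copy n bytes one line at a time. dst must be LINE_SIZE-aligned, n must
 * be a multiple of LINE_SIZE, and src must not overlap dst (dcbz would
 * destroy source data that has not been read yet). dst must also be
 * ordinary cacheable memory.
 */
static void copy_lines(void *dst, const void *src, size_t n, int use_dcbz)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert((uintptr_t)dst % LINE_SIZE == 0);
    assert(n % LINE_SIZE == 0);

    for (size_t off = 0; off < n; off += LINE_SIZE) {
        if (use_dcbz)
            zero_line_hint(d + off);   /* the extra per-line work */
        memcpy(d + off, s + off, LINE_SIZE);
    }
}
```

Timing copy_lines with use_dcbz set to 0 and 1, on buffers that fit in
cache and on buffers much larger than the cache, would be one way to
compare the hot-cache and cold-cache behaviour on a given CPU.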