Thread (8 messages), 3 authors, 2015-07-29

Re: [SLOF PATCH 1/2] fbuffer: Improve invert-region helper

From: Segher Boessenkool <hidden>
Date: 2015-07-28 17:04:16

On Tue, Jul 28, 2015 at 12:19:54PM +0200, Thomas Huth wrote:
 : invert-region ( addr len -- )
-   0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
-;
-
-: invert-region-x ( addr len -- )
-   /x / 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP drop
+   2dup or 7 and CASE
+      0 OF 3 rshift 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP ENDOF
+      2 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
+      4 OF 2 rshift 0 ?DO dup dup rl@ -1 xor swap rl! la1+ LOOP ENDOF
+      6 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
+      dup OF 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP ENDOF
+   ENDCASE
+   drop
 ;

Can you access device memory with 64-bit loads and stores on all supported devices?

You can get a bigger speedup by writing some of the core blitting
functions in C, btw.

A small simplification:

   2dup or 7 and CASE
      0 OF 3 rshift 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP ENDOF
      4 OF 2 rshift 0 ?DO dup dup rl@ -1 xor swap rl! la1+ LOOP ENDOF
      3 and
      2 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
      dup OF 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP ENDOF
   ENDCASE
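
For readers less at home in Forth, the selection logic of the CASE above can be restated in C. This is only a sketch: `access_width` is a hypothetical name, not anything in SLOF, and it models just the choice of access size from the address and length, not the device accesses themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SLOF: returns the access size in bytes
 * that the simplified CASE statement would pick for a given address and
 * length. */
static unsigned access_width(uintptr_t addr, size_t len)
{
    unsigned sel = (unsigned)((addr | len) & 7);   /* 2dup or 7 and */

    if (sel == 0)
        return 8;   /* both multiples of 8: rx@ / rx! */
    if (sel == 4)
        return 4;   /* both multiples of 4: rl@ / rl! */

    sel &= 3;       /* 3 and: folds the old "6" case onto "2" */
    if (sel == 2)
        return 2;   /* both multiples of 2: rw@ / rw! */

    return 1;       /* anything odd: rb@ / rb! */
}
```

Because address and length are OR-ed together before masking, a single odd address or odd length forces the whole region down to byte accesses.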


If this code is often called with an unaligned address or length, it
probably makes more sense to special-case the beginning and the end.
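
A rough C sketch of that idea, which also hints at what a C blitting helper might look like: invert single bytes until the address is aligned, use 64-bit words for the bulk, then finish the remainder bytewise. The name `invert_region` and the use of plain pointers are assumptions for illustration. It operates on ordinary RAM, whereas a real framebuffer would need the platform's device-access primitives, and it presumes that 64-bit accesses are acceptable for the device.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not SLOF code: invert len bytes starting at buf.
 * Assumes ordinary memory rather than device registers. */
static void invert_region(unsigned char *buf, size_t len)
{
    /* Head: single bytes until the pointer is 8-byte aligned. */
    while (len > 0 && ((uintptr_t)buf & 7) != 0) {
        *buf = (unsigned char)~*buf;
        buf++;
        len--;
    }

    /* Body: aligned 64-bit words; memcpy sidesteps aliasing issues. */
    while (len >= 8) {
        uint64_t w;
        memcpy(&w, buf, sizeof w);
        w = ~w;
        memcpy(buf, &w, sizeof w);
        buf += 8;
        len -= 8;
    }

    /* Tail: the remaining bytes, fewer than 8. */
    while (len > 0) {
        *buf = (unsigned char)~*buf;
        buf++;
        len--;
    }
}
```

With this shape, an unaligned call pays for at most 7 byte accesses at each end instead of dropping the whole region to byte accesses.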


Segher