Thread: 18 messages, 5 authors, 2009-01-22

Re: [RFC 2.6.28 1/2] fbdev: add ability to set damage

From: "Magnus Damm" <magnus.damm@gmail.com>
Date: 2009-01-16 03:09:20

On Thu, Jan 15, 2009 at 8:08 PM, Jaya Kumar [off-list ref] wrote:
On Thu, Jan 15, 2009 at 5:29 AM, Magnus Damm [off-list ref] wrote:
quoted
I agree with Tomi about the memory allocation.
Yes, I agree with that too. :-) I proposed pushing that decision into
the driver so that it could decide for itself whether it wants to
allocate a buffer or retain a fixed structure.
Sure, letting the driver decide things depending on the type of
hardware sounds like a good plan.
quoted
I wonder how fine grained control that is needed. It's not an exact
science, right? If a slightly larger area is updated than what is
Agreed that it is not a one-approach fits all scenario.
quoted
needed then we will take a performance hit, but things should work as
expected apart from that right?
I'm not sure I understood this. Why do you say "If a large area is
updated, then we will take a performance hit."? I think that statement
depends on the device, right? I agree that if a lot of pixels are
updated, then there is a lot of data to transfer, but beyond that it
is very much dependent on the device, whether it uses DMA, what kind
of update latency it has, what kind of partial update capability it
has, all of which affect how much of a performance hit is taken and
what the optimal case would be.
Sorry for my poor selection of words. I agree that it's device
dependent, but what I was trying to say is that a lossy conversion to
a larger area is ok if I've understood things correctly.

We will have correct behavior but performance degradation if the user
space program asks to update a small rectangle in the middle of the
screen but the driver or some layer in between decides to update say
the entire screen instead. Do you agree with me?
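
To make the "lossy conversion to a larger area" concrete, here is a minimal sketch of rounding a damage rectangle outward to a tile grid. The names damage_rect and damage_expand are hypothetical and are not taken from the RFC patch; the only assumption is that the rectangle lies inside the visible screen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical damage rectangle, not a structure from the RFC patch. */
struct damage_rect {
	uint32_t x, y, w, h;
};

/*
 * Round a rectangle outward to multiples of `tile` pixels and clip it
 * to the screen. The result always covers the input, so updating it
 * instead of the input is correct, only potentially slower.
 */
static struct damage_rect damage_expand(struct damage_rect r, uint32_t tile,
					uint32_t xres, uint32_t yres)
{
	struct damage_rect out;
	uint32_t x2, y2;

	assert(tile > 0 && r.w > 0 && r.h > 0);
	assert(r.x + r.w <= xres && r.y + r.h <= yres);

	/* Top-left corner rounds down, bottom-right corner rounds up. */
	out.x = (r.x / tile) * tile;
	out.y = (r.y / tile) * tile;
	x2 = ((r.x + r.w + tile - 1) / tile) * tile;
	y2 = ((r.y + r.h + tile - 1) / tile) * tile;

	/* Clip in case the screen size is not a multiple of the tile size. */
	if (x2 > xres)
		x2 = xres;
	if (y2 > yres)
		y2 = yres;

	out.w = x2 - out.x;
	out.h = y2 - out.y;
	return out;
}
```

With 32-pixel tiles on a 1024x768 screen, a 10x10 rectangle at (500, 380) becomes a 32x64 region at (480, 352): 2048 pixels transferred instead of 100, but nothing is missed. Using a "tile" as large as the screen degenerates into the full-screen update case.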
quoted
I'm a big fan of simple things like bitmaps. I wonder if it's a good
idea to divide the entire frame buffer into equally sized X*Y tiles
and have a bitmap of dirty bits. A "1" in the bitmap means tile is
dirty and needs update and a "0" means no need to update. The best
tile size is application specific. The size of the bitmap varies of
course with the tile size.

For a 1024x768 display using 32x32 tiles we need 24 32-bit words.
That's pretty small and simple, no?
Just trying to pitch my idea a bit harder: The above example would
need a 96-byte bitmap which will fit in just a few cache lines. This
arrangement of the data gives you good performance compared to
multiple allocations scattered all over the place.

Also, using a bitmap makes it at least half-easy to do a lossy OR
operation of all damage rectangles. Who is taking care of overlapping
updates otherwise - some user space library?
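
A rough sketch of what such a tile bitmap could look like, using the 1024x768 / 32x32 numbers from above. Everything here (dirty_map, dirty_add_rect and friends, the fixed resolution macros) is made up for illustration and is not an existing fbdev interface; a real driver would take the geometry from its mode and would need locking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative fixed geometry: 1024x768 with 32x32 tiles. */
#define XRES		1024
#define YRES		768
#define TILE		32
#define TILES_X		(XRES / TILE)				/* 32 */
#define TILES_Y		(YRES / TILE)				/* 24 */
#define BITMAP_WORDS	((TILES_X * TILES_Y + 31) / 32)		/* 24 */

/* One bit per tile: 768 bits = 24 32-bit words = 96 bytes. */
struct dirty_map {
	uint32_t bits[BITMAP_WORDS];
};

static void dirty_clear(struct dirty_map *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

/*
 * OR a damage rectangle into the bitmap. Every tile the rectangle
 * touches is marked dirty, so overlapping rectangles merge for free
 * and the storage never grows, at the cost of tile granularity.
 */
static void dirty_add_rect(struct dirty_map *m, uint32_t x, uint32_t y,
			   uint32_t w, uint32_t h)
{
	uint32_t tx, ty;

	assert(w > 0 && h > 0 && x + w <= XRES && y + h <= YRES);

	for (ty = y / TILE; ty <= (y + h - 1) / TILE; ty++) {
		for (tx = x / TILE; tx <= (x + w - 1) / TILE; tx++) {
			uint32_t idx = ty * TILES_X + tx;

			m->bits[idx / 32] |= UINT32_C(1) << (idx % 32);
		}
	}
}

/* Return nonzero if tile (tx, ty) needs an update. */
static int dirty_test(const struct dirty_map *m, uint32_t tx, uint32_t ty)
{
	uint32_t idx = ty * TILES_X + tx;

	return (m->bits[idx / 32] >> (idx % 32)) & 1;
}

/* Count dirty tiles, e.g. to decide between partial and full update. */
static unsigned int dirty_count(const struct dirty_map *m)
{
	unsigned int i, n = 0;
	uint32_t w;

	for (i = 0; i < BITMAP_WORDS; i++)
		for (w = m->bits[i]; w; w &= w - 1)
			n++;
	return n;
}
```

Adding a 10x10 rectangle at (500, 380) and then an overlapping 20x20 rectangle at (505, 385) leaves three dirty tiles, with the shared tile counted once. No allocation happens no matter how many rectangles arrive before the next flush.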

I'd say we would benefit from managing the OR operation within the
kernel since deferred io may collect a lot of overlapping areas over
time. Actually, we sort of do that already by touching the pages in
the deferred io mmap handling code. If we don't do any OR operation
within the kernel for deferred io, then how are we supposed to handle
long deferred io delays? Just keep on kmallocing rectangles? Or
expanding the rectangles?

Or maybe we are discussing apples and oranges? Is your damage API
meant to force a screen update so there is no need for in-kernel OR
operation? We have a need for in-kernel OR operation with deferred io
already I think, so there is some overlap in my opinion.
Okay, I just realized that I neglected to mention the XDamage
extension which had a big influence on me. I think the following page:
http://www.freedesktop.org/wiki/Software/XDamage
and:
http://www.opensource.apple.com/darwinsource/Current/X11proto-15.1/damageproto/damageproto-1.1.0/damageproto.txt
explain a lot of thinking that has gone into solving similar issues.

I think the fact that Xfbdev and Xorg utilize that rectangle and
rectangle count based infrastructure would push us towards retaining
the same concepts. In my mind, Xfbdev/Xorg would be the prime
candidate for this API.
Thanks for the pointers. I'm not saying that using rectangles is a bad
thing, I just wonder if there are better data structures available for
backing the dirty screen area.

I'd say that a combination of rectangle based user space damage API
_and_ (maybe tile based) in-kernel dirty area OR operation is the best
approach. This is because XDamage is rectangle based and the deferred io
delay (i.e. the amount of time to collect dirty areas) is a kernel driver
property.

Cheers,

/ magnus
