Re: [RFC 2.6.28 1/2] fbdev: add ability to set damage
From: "Magnus Damm" <magnus.damm@gmail.com>
Date: 2009-01-16 03:09:20
On Thu, Jan 15, 2009 at 8:08 PM, Jaya Kumar [off-list ref] wrote:
> On Thu, Jan 15, 2009 at 5:29 AM, Magnus Damm [off-list ref] wrote:
>> I agree with Tomi about the memory allocation.
> Yes, I agree with that too. :-) I proposed pushing that decision into the driver so that it could decide for itself whether it wants to allocate a buffer or retain a fixed structure.
Sure, letting the driver decide things depending on the type of hardware sounds like a good plan.
>> I wonder how fine grained control that is needed. It's not an exact science, right? If a slightly larger area is updated than what is
> Agreed that it is not a one-approach fits all scenario.
>> needed then we will take a performance hit, but things should work as expected apart from that right?
> I'm not sure I understood this. Why do you say "If a large area is updated, then we will take a performance hit."? I think that statement depends on the device, right? I agree that if a lot of pixels are updated, then there is a lot of data to transfer, but beyond that it is very much dependent on the device, whether it uses DMA, what kind of update latency it has, what kind of partial update capability it has, all of which affect how much of a performance hit is taken and what the optimal case would be.
Sorry for my poor choice of words. I agree that it's device dependent, but what I was trying to say is that a lossy conversion to a larger area is ok, if I've understood things correctly. We will have correct behavior but degraded performance if the user space program asks to update a small rectangle in the middle of the screen but the driver, or some layer in between, decides to update, say, the entire screen instead. Do you agree with me?
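To make the "lossy but correct" point concrete, here is a minimal user-space sketch. The struct damage_rect type and both helpers are hypothetical names made up for illustration, not anything from the RFC patch. Merging two rectangles into their bounding box always covers the original areas, so the only cost is extra pixels pushed to the device:

```c
#include <stdint.h>

/* Hypothetical rectangle type for illustration; not from the RFC patch. */
struct damage_rect {
	uint32_t x, y;	/* top-left corner, in pixels */
	uint32_t w, h;	/* width and height, in pixels */
};

/*
 * Smallest rectangle covering both a and b. Assumes both are non-empty.
 * The result may cover pixels that neither a nor b touched; that is the
 * "lossy" part.
 */
static struct damage_rect damage_rect_union(struct damage_rect a,
					    struct damage_rect b)
{
	uint32_t x0 = a.x < b.x ? a.x : b.x;
	uint32_t y0 = a.y < b.y ? a.y : b.y;
	uint32_t x1 = a.x + a.w > b.x + b.w ? a.x + a.w : b.x + b.w;
	uint32_t y1 = a.y + a.h > b.y + b.h ? a.y + a.h : b.y + b.h;
	struct damage_rect r = { x0, y0, x1 - x0, y1 - y0 };

	return r;
}

/* Returns 1 if inner lies completely within outer, 0 otherwise. */
static int damage_rect_contains(struct damage_rect outer,
				struct damage_rect inner)
{
	return inner.x >= outer.x && inner.y >= outer.y &&
	       inner.x + inner.w <= outer.x + outer.w &&
	       inner.y + inner.h <= outer.y + outer.h;
}
```

Whatever gets merged, damage_rect_contains() holds for each input against the union, which is the correctness property I mean. How much the extra area costs is the device dependent part.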
>> I'm a big fan of simple things like bitmaps. I wonder if it's a good idea to divide the entire frame buffer into equally sized X*Y tiles and have a bitmap of dirty bits. A "1" in the bitmap means tile is dirty and needs update and a "0" means no need to update. The best tile size is application specific. The size of the bitmap varies of course with the tile size. For a 1024x768 display using 32x32 tiles we need 24 32-bit words. That's pretty small and simple, no?
Just trying to pitch my idea a bit harder: The above example would need a 96-byte bitmap, which will fit in just a few cache lines. This arrangement of the data gives you good performance compared to multiple allocations scattered all over the place. Also, using a bitmap makes it at least half-easy to do a lossy OR operation of all damage rectangles. Who is taking care of overlapping updates otherwise - some user space library?
I'd say we would benefit from managing the OR operation within the kernel, since deferred io may collect a lot of overlapping areas over time. Actually, we sort of do that already by touching the pages in the deferred io mmap handling code. If we don't do any OR operation within the kernel for deferred io, then how are we supposed to handle long deferred io delays? Just keep on kmallocing rectangles? Or expanding the rectangles?
Or maybe we are discussing apples and oranges? Is your damage API meant to force a screen update, so that there is no need for an in-kernel OR operation? We already have a need for an in-kernel OR operation with deferred io, I think, so there is some overlap in my opinion.
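Roughly what I have in mind for the bitmap, as a user-space sketch rather than kernel code. All names here (struct dirty_map, dirty_map_add_rect() and friends) are hypothetical, and the one-word-per-tile-row layout is just an assumption that happens to fit 1024x768 with 32x32 tiles:

```c
#include <stdint.h>
#include <string.h>

#define FB_WIDTH	1024
#define FB_HEIGHT	768
#define TILE_SIZE	32
#define TILES_X		(FB_WIDTH / TILE_SIZE)	/* 32 tiles per row */
#define TILES_Y		(FB_HEIGHT / TILE_SIZE)	/* 24 tile rows */

/* Assumed layout: one 32-bit word per tile row, 24 words = 96 bytes. */
struct dirty_map {
	uint32_t row[TILES_Y];
};

static void dirty_map_clear(struct dirty_map *m)
{
	memset(m, 0, sizeof(*m));
}

/* Mark every tile touched by the rectangle. OR-ing makes overlap free. */
static void dirty_map_add_rect(struct dirty_map *m, uint32_t x, uint32_t y,
			       uint32_t w, uint32_t h)
{
	uint32_t tx0, tx1, ty0, ty1, ty, mask;

	if (w == 0 || h == 0 || x >= FB_WIDTH || y >= FB_HEIGHT)
		return;
	/* Clamp the rectangle to the frame buffer. */
	if (w > FB_WIDTH - x)
		w = FB_WIDTH - x;
	if (h > FB_HEIGHT - y)
		h = FB_HEIGHT - y;

	tx0 = x / TILE_SIZE;
	tx1 = (x + w - 1) / TILE_SIZE;
	ty0 = y / TILE_SIZE;
	ty1 = (y + h - 1) / TILE_SIZE;

	/* Bits tx0..tx1 inclusive; 64-bit shift avoids overflow at 32 bits. */
	mask = (uint32_t)(((uint64_t)1 << (tx1 - tx0 + 1)) - 1) << tx0;
	for (ty = ty0; ty <= ty1; ty++)
		m->row[ty] |= mask;
}

/* Returns 1 if tile (tx, ty) is dirty, 0 otherwise. */
static int dirty_map_test_tile(const struct dirty_map *m,
			       uint32_t tx, uint32_t ty)
{
	return (m->row[ty] >> tx) & 1;
}

/* Number of dirty tiles in the whole map. */
static unsigned int dirty_map_count(const struct dirty_map *m)
{
	unsigned int n = 0;
	uint32_t ty, bits;

	for (ty = 0; ty < TILES_Y; ty++)
		for (bits = m->row[ty]; bits; bits &= bits - 1)
			n++;
	return n;
}
```

Adding a second rectangle that overlaps the first one just sets bits that are already set, so nothing grows no matter how long the deferred io delay is. The price is that updates are rounded up to whole tiles.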
> Okay, I just realized that I neglected to mention the XDamage extension which had a big influence on me. I think the following page: http://www.freedesktop.org/wiki/Software/XDamage and: http://www.opensource.apple.com/darwinsource/Current/X11proto-15.1/damageproto/damageproto-1.1.0/damageproto.txt explain a lot of thinking that has gone into solving similar issues. I think the fact that Xfbdev and Xorg utilize that rectangle and rectangle count based infrastructure would push us towards retaining the same concepts. In my mind, Xfbdev/Xorg would be the prime candidate for this API.
Thanks for the pointers. I'm not saying that using rectangles is a bad thing, I just wonder if there are better data structures available for backing the dirty screen area. I'd say that a combination of a rectangle based user space damage API _and_ a (maybe tile based) in-kernel dirty area OR operation is the best approach. This is because XDamage is rectangle based, and the deferred io delay (i.e. the amount of time to collect dirty areas) is a kernel driver property.
Cheers,
/ magnus