Re: fbdev: Garbage collect fbdev scrolling acceleration
From: Sven Schnelle <hidden>
Date: 2022-01-19 16:34:05
Also in:
dri-devel
Hi Daniel,

Daniel Vetter [off-list ref] writes:

> On Wed, Jan 19, 2022 at 05:15:44PM +0100, Sven Schnelle wrote:
>> Hi Daniel,
>>
>> Daniel Vetter [off-list ref] writes:
>>
>>> On Thu, Jan 13, 2022 at 10:46:03PM +0100, Sven Schnelle wrote:
>>>> Helge Deller [off-list ref] writes:
>>>>
>>>>> Maybe on fast new x86 boxes the performance difference isn't
>>>>> huge, but for all old systems, or when emulated in qemu, this
>>>>> makes a big difference.
>>>>>
>>>>> Helge
>>>>
>>>> I second that. For most people, the framebuffer isn't important
>>>> as they're mostly interested in getting to X11/wayland as fast as
>>>> possible. But for systems like servers without X11 it's nice to
>>>> have a fast console.
>>>
>>> Fast console howto:
>>> - shadow buffer in cached memory
>>> - timer based upload of changed areas to the real framebuffer
>>>
>>> This one is actually fast, instead of trying to use hw bltcopy and
>>> having the most terrible fallback path if that's gone. Yes, the drm
>>> fbdev helpers have this (but it's not enabled on most drivers
>>> because very, very few people care).
>>
>> Hmm.... Take my laptop with a 4k (3840x2160) screen as an example:
>> Let's say on average half of every line is filled with text. So
>> 3840/2*2160 pixels that change = 4,147,200 pixels. Every pixel takes
>> 4 bytes = 16,588,800 bytes per timer interrupt. In another mail,
>> updating on vsync was mentioned, so multiply that by 60 and get
>> ~927MB. And even if you only update the screen 4 times per second,
>> that would be ~64MB of data. I'm likely missing something here.
>
> Since you say 4k it's a modern box, so you have on the order of 10GB/s
> of write bandwidth. And around 100MB/s of read bandwidth. Both from
> the cpu. It all adds up. It's that uncached read which kills you and
> means dmesg takes seconds to display.
>
> Also, since this is 4k, looking at sales volume we're talking
> integrated, so whether it's the gpu or the cpu that's doing the
> memcpy, it's the same memory bw budget you're burning down.
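As a rough check of the figures traded in the quoted exchange, the arithmetic can be reproduced in a few lines. This is only a back-of-envelope sketch: the function names are made up for illustration, and the 10 GB/s write and 100 MB/s uncached-read bandwidths are the order-of-magnitude numbers from the quoted mail, not measurements.

```python
# Back-of-envelope estimate of shadow-buffer upload traffic.
# All constants are taken from the thread; the bandwidths are
# order-of-magnitude guesses quoted there, not measured values.
WIDTH, HEIGHT = 3840, 2160      # 4k panel
BYTES_PER_PIXEL = 4             # 32 bpp
DIRTY_FRACTION = 0.5            # "half of every line is filled with text"

WRITE_BW = 10e9                 # ~10 GB/s CPU writes (assumed, from the thread)
UNCACHED_READ_BW = 100e6        # ~100 MB/s uncached CPU reads (assumed, from the thread)


def bytes_per_update(width=WIDTH, height=HEIGHT,
                     bpp=BYTES_PER_PIXEL, dirty=DIRTY_FRACTION):
    """Bytes uploaded when the given fraction of every line has changed."""
    return int(width * dirty) * height * bpp


def bytes_per_second(updates_per_second):
    """Upload traffic if every timer tick uploads one such update."""
    return bytes_per_update() * updates_per_second


def seconds_to_move(nbytes, bandwidth):
    """Time to move nbytes at the given bandwidth in bytes per second."""
    return nbytes / bandwidth


if __name__ == "__main__":
    per_update = bytes_per_update()
    print("per update:       ", per_update, "bytes")
    print("at 60 updates/s:  ", bytes_per_second(60), "bytes/s")
    print("at 4 updates/s:   ", bytes_per_second(4), "bytes/s")
    print("write one update: ", seconds_to_move(per_update, WRITE_BW), "s")
    print("read it uncached: ", seconds_to_move(per_update, UNCACHED_READ_BW), "s")
```

Under these assumptions one update is 16,588,800 bytes, which is about 995 MB/s at 60 updates per second and about 66 MB/s at 4 updates per second. Writing one update at 10 GB/s takes under 2 ms, while reading the same amount back at 100 MB/s takes roughly 0.17 s, which is the asymmetry the quoted reply is pointing at.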
That might be true for integrated graphics; as said, I don't know the
architecture. But being good on one architecture doesn't mean it's good
for everyone. If you have an external GPU, then the memory/system bus
bandwidth would differ depending on whether it's memcpy or the GPU doing
the scrolling. And whether it's internal or external graphics, the CPU
could do other stuff while the GPU scrolls.

Quite a lot of discussion for a revert of a patch that was already in
the kernel for more than 20(?) years.

/Sven
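For readers unfamiliar with the "shadow buffer plus timer based upload" scheme discussed in this thread, here is a toy model of the idea. It is a sketch under loud assumptions: `ShadowConsole` and all of its methods are hypothetical names, rows of text stand in for pixel data, and the `flush()` call stands in for the timer callback. It is not the drm fbdev helper implementation.

```python
class ShadowConsole:
    """Toy model: draw into a cached shadow copy, upload only dirty rows."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        blank = b" " * cols
        # Shadow copy in ordinary (cached) memory; all drawing happens here.
        self.shadow = [bytearray(blank) for _ in range(rows)]
        # Stand-in for the real framebuffer; it is only ever written, never read.
        self.framebuffer = [bytearray(blank) for _ in range(rows)]
        self.dirty = set()          # rows changed since the last upload
        self.uploaded_rows = 0      # total rows copied to the framebuffer

    def write_row(self, row, text):
        data = text.encode("ascii")[:self.cols].ljust(self.cols)
        if self.shadow[row] != data:
            self.shadow[row][:] = data
            self.dirty.add(row)

    def scroll_up(self):
        # Scrolling touches only the shadow copy, then marks everything dirty.
        self.shadow.pop(0)
        self.shadow.append(bytearray(b" " * self.cols))
        self.dirty.update(range(self.rows))

    def flush(self):
        """Upload dirty rows; in the real scheme a timer would call this."""
        count = len(self.dirty)
        for row in sorted(self.dirty):
            self.framebuffer[row][:] = self.shadow[row]
        self.uploaded_rows += count
        self.dirty.clear()
        return count


if __name__ == "__main__":
    con = ShadowConsole(rows=4, cols=8)
    con.write_row(0, "hello")
    print("rows uploaded:", con.flush())
    for _ in range(3):
        con.scroll_up()             # three scrolls between two timer ticks
    print("rows uploaded:", con.flush())
```

The model shows both sides of the argument: a scroll dirties the whole screen, so the upload is a full copy, but any number of scrolls between two timer ticks collapse into a single upload, and the framebuffer is never read back. Whether that copy is cheaper than a hardware blit depends on the bus and GPU in question, which is exactly the point of disagreement above.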