Re: [PATCH 1/1] fbdev: hyperv_fb: iounmap() the correct memory when removing a device
From: Saurabh Singh Sengar <ssengar@linux.microsoft.com>
Date: 2025-02-13 03:06:51
Also in:
dri-devel, linux-fbdev, lkml
On Thu, Feb 13, 2025 at 01:35:22AM +0000, Michael Kelley wrote:
From: Saurabh Singh Sengar <ssengar@linux.microsoft.com> Sent: Monday, February 10, 2025 8:52 AMquoted
[snip]quoted
quoted
quoted
quoted
While we are at it, I want to mention that I also observed below WARN while removing the hyperv_fb, but that needs a separate fix. [ 44.111220] WARNING: CPU: 35 PID: 1882 at drivers/video/fbdev/core/fb_info.c:70 framebuffer_release+0x2c/0x40 < snip > [ 44.111289] Call Trace: [ 44.111290] <TASK> [ 44.111291] ? show_regs+0x6c/0x80 [ 44.111295] ? __warn+0x8d/0x150 [ 44.111298] ? framebuffer_release+0x2c/0x40 [ 44.111300] ? report_bug+0x182/0x1b0 [ 44.111303] ? handle_bug+0x6e/0xb0 [ 44.111306] ? exc_invalid_op+0x18/0x80 [ 44.111308] ? asm_exc_invalid_op+0x1b/0x20 [ 44.111311] ? framebuffer_release+0x2c/0x40 [ 44.111313] ? hvfb_remove+0x86/0xa0 [hyperv_fb] [ 44.111315] vmbus_remove+0x24/0x40 [hv_vmbus] [ 44.111323] device_remove+0x40/0x80 [ 44.111325] device_release_driver_internal+0x20b/0x270 [ 44.111327] ? bus_find_device+0xb3/0xf0Thanks for pointing this out. Interestingly, I'm not seeing this WARN in my experiments. What base kernel are you testing with? Are you testing on a local VM or in Azure? What exactly are you doing to create the problem? I've been doing unbind of the driver, but maybe you are doing something different. FWIW, there is yet another issue where after doing two unbind/bind cycles of the hyperv_fb driver, there's an error about freeing a non-existent resource. I know what that problem is, and it's in vmbus_drv.c. I'll be submitting a patch for that as soon as I figure out a clean fix. MichaelThis is on local Hyper-V. Kernel: 6.14.0-rc1-next-20250205+ I run below command to reproduce the above error: echo "5620e0c7-8062-4dce-aeb7-520c7ef76171" >/sys/bus/vmbus/devices/5620e0c7-8062-4dce-aeb7-520c7ef76171/driver/unbindquoted
When hvfb_remove is called I can see the refcount for framebuffer is 2 when , I expect it to be 1. After unregistering this framebuffer there is still 1 refcount remains, which is the reason for this WARN at the time of framebuffer_release. I wonder who is registering/using this extra framebuffer. Its not hyperv_drm or hyperv_fb IIUC. - SaurabhHere are more details about this WARN: Xorg opens `/dev/fb0`, which increases the framebuffer's reference count, as mentioned above. As a result, when unbinding the driver, this WARN is expected, indicating that the framebuffer is still in use. I am open to suggestion what could be the correct behavior in this case. There acan be two possible options: 1. Check the framebuffer reference count and prevent the driver from unbinding/removal. OR 2. Allow the driver to unbind while issuing this WARN. (Current scenario)quoted
From looking at things and doing an experiment, I think there's a 3rdoption, which gets rid of the of the WARN while still allowing the unbind. The experiment is to boot Linux in a Gen2 Hyper-V guest with both the Hyper-V FB and Hyper-V DRM modules removed. In this case, the generic EFI framebuffer driver (efifb) should get used. With this driver, a program can open /dev/fb0, and while it is open, unbind the efifb driver (which is in /sys/bus/platform/drivers/efi-framebuffer). Interestingly, there's no WARN generated. But when the hyperv_fb driver is loaded and used, the WARN *is* generated, as you observed. So I looked at the code for efifb. It does the framebuffer_release() call in a function that hyperv_fb doesn't have. Based on the comments in efifb.c, we need a similar function to handle the call to framebuffer_release(). And the efifb driver also does the iounmap() in that same function, which makes we wonder if the hyperv_fb driver should do similarly. It will need a little more analysis to figure that out. You found the bug. Do you want to work on fixing the hyperv_fb driver? And maybe the Hyper-V DRM driver needs the same fix. I haven't looked. Alternatively, if you are busy, I can work on the fix. Let me know your preference. Michael
Thanks for your analysis, its a good to know about fbib driver is not having this issue. We can take it as a reference. At the first look I see efib driver is having a fb_ops.fb_destroy function which gets called after put_fb_info (responsible for decrementing the ref count). Also it uses devm_register_framebuffer which handles the registration and unregister of framebuffer more gracefully. I will work on this. - Saurabh