Re: [PATCH] net: phy: don't issue a module request if a driver is available
From: Francesco Valla <hidden>
Date: 2025-01-02 13:27:22
On Thursday, 2 January 2025 at 12:06:15 Russell King (Oracle) [off-list ref] wrote:
On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:quoted
Whenever a new PHY device is created, request_module() is called unconditionally, without checking if a driver for the new PHY is already available (either built-in or from a previous probe). This conflicts with async probing of the underlying MDIO bus and always throws a warning (because if a driver is loaded it _might_ cause a deadlock, if in turn it calls async_synchronize_full()).Why aren't any of the phylib maintainers seeing this warning? Where does the warning come from?
I'm not sure. For me, it was pretty easy to trigger. I'm on a BeaglePlay (AM62X) and I am working on boot time optimization and with different configurations for async probing. When the async probe of the Ethernet NIC (i.e.: am65-cpsw-nuss) runs, it in turn triggers the probe of the associated davincio_mdio device, which then brings me to the following WARNING: [ 0.287271] davinci_mdio 8000f00.mdio: davinci mdio revision 9.7, bus freq 1000000 [ 0.287574] ------------[ cut here ]------------ [ 0.287581] WARNING: CPU: 2 PID: 48 at /kernel/module/kmod.c:144 __request_module+0x19c/0x204 [ 0.287605] Modules linked in: [ 0.287619] CPU: 2 UID: 0 PID: 48 Comm: kworker/u16:2 Not tainted 6.12.4-00004-g89f77a9313d4-dirty #1 [ 0.287630] Hardware name: BeagleBoard.org BeaglePlay (DT) [ 0.287637] Workqueue: async async_run_entry_fn [ 0.287653] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 0.287663] pc : __request_module+0x19c/0x204 [ 0.287671] lr : __request_module+0x198/0x204 [ 0.287679] sp : ffff8000817b2e70 [ 0.287684] x29: ffff8000817b2ef0 x28: ffff000001b994b0 x27: 0000000000000000 [ 0.287698] x26: 0000000000000000 x25: ffff8000817b3277 x24: 0000000000000000 [ 0.287712] x23: ffff000001b99000 x22: 0000000000000000 x21: ffff0000038a8800 [ 0.287726] x20: 0000000000000001 x19: ffff800080d4dc18 x18: 0000000000000002 [ 0.287739] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 0.287753] x14: 0000000000000001 x13: 0000000000000001 x12: 0000000000000001 [ 0.287767] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000000000000 [ 0.287780] x8 : ffff8000817b2ee8 x7 : 0000000000000000 x6 : 0000000000000000 [ 0.287794] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000030 [ 0.287806] x2 : 0000000000000008 x1 : ffff8000800c66dc x0 : 0000000000000001 [ 0.287820] Call trace: [ 0.287821] omap_i2c 20000000.i2c: bus 0 rev0.12 at 400 kHz [ 0.287826] __request_module+0x19c/0x204 [ 0.287835] phy_request_driver_module+0x11c/0x17c [ 0.287849] phy_device_create+0x234/0x260 [ 0.287860] get_phy_device+0x78/0x154 [ 0.287870] fwnode_mdiobus_register_phy+0x11c/0x1d8 [ 0.287880] __of_mdiobus_parse_phys+0x174/0x2a0 [ 0.287890] __of_mdiobus_register+0x104/0x240 [ 0.287899] davinci_mdio_probe+0x284/0x44c [ 0.287909] platform_probe+0x68/0xc4 [ 0.287921] really_probe+0xbc/0x29c [ 0.287930] really_probe_debug+0x88/0x110 [ 0.287940] __driver_probe_device+0xbc/0xd4 [ 0.287948] driver_probe_device+0xd8/0x15c [ 0.287957] __device_attach_driver+0xb8/0x134 [ 0.287965] bus_for_each_drv+0x88/0xe8 [ 0.287974] __device_attach+0xa0/0x190 [ 0.287982] device_initial_probe+0x14/0x20 [ 0.287991] bus_probe_device+0xac/0xb0 [ 0.287998] device_add+0x588/0x74c [ 0.288011] of_device_add+0x44/0x60 [ 0.288027] of_platform_device_create_pdata+0x8c/0x120 [ 0.288039] of_platform_device_create+0x18/0x24 [ 0.288050] am65_cpsw_nuss_probe+0x670/0xcf4 [ 0.288062] platform_probe+0x68/0xc4 [ 0.288072] really_probe+0xbc/0x29c [ 0.288079] really_probe_debug+0x88/0x110 [ 0.288088] __driver_probe_device+0xbc/0xd4 [ 0.288096] driver_probe_device+0xd8/0x15c [ 0.288105] __driver_attach_async_helper+0x4c/0xb4 [ 0.288113] async_run_entry_fn+0x34/0xe0 [ 0.288124] process_one_work+0x148/0x288 [ 0.288137] worker_thread+0x2cc/0x3d4 [ 0.288147] kthread+0x110/0x114 [ 0.288159] ret_from_fork+0x10/0x20 [ 0.288171] ---[ end trace 0000000000000000 ]--- This is expected, as request_module() is not meant to be called from an async context: https://lore.kernel.org/lkml/20130118221227.GG24579@htj.dyndns.org/ (local) It should be noted that: - the davincio_mdio device is a child of the am65-cpsw-nuss device - the am65-cpsw-nuss driver is NOT marked with neither PROBE_PREFER_ASYNCHRONOUS nor PROBE_FORCE_SYNCHRONOUS and the behavior is being triggered specifying driver_async_probe=am65-cpsw-nuss on the command line.
quoted
+static bool phy_driver_exists(u32 phy_id) +{ + bool found = false; + struct phy_drv_node *node; + + down_read(&phy_drv_list_sem); + list_for_each_entry(node, &phy_drv_list, list) { + if (phy_id_compare(phy_id, node->drv->phy_id, node->drv->phy_id_mask)) { + found = true; + break; + } + } + up_read(&phy_drv_list_sem); + + return found; +} +Why do we need this, along with the associated additional memory allocations? What's wrong with bus_for_each_drv() which the core code provides?
Because I didn't think of it, but it would be much better. I'll refactor the logic using bus_for_each_drv() + a simple match callback. Thank you! Regards, Francesco