Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
From: Tsuchiya Yuto <hidden>
Date: 2020-11-26 18:31:14
Also in:
linux-wireless, lkml
On Fri, 2020-11-20 at 12:53 -0800, Brian Norris wrote:
(Sorry if anything's a bit slow here. I don't really have time to write out full proposals myself.)
Don’t worry, I (and other owners) am already glad to hear from upstream devs, including you :) (and I'm also sorry for the late reply from me. It was difficult to find spare time these days.)
On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto [off-list ref] wrote:quoted
Let me know if splitting this patch like this works. 1) The first patch is to add this module parameter but don't change the default behavior.That *could* be OK with me, although it's already been said that there are many people who dislike extra module parameters. I also don't see why this needs to be a module parameter at all. If you do #2 right, you don't really need this, as there are already several standard ways of doing this (e.g., via Kconfig, or via nl80211 on a per-device basis).quoted
2) The second patch is to change the parameter value depending on the DMI matching or something so that it doesn't break the existing users.Point 2 sounds good, and this is the key point. Note that you can do point 2 without making it a module parameter. Just keep a flag in the driver-private structures.
I chose to also provide the module parameter because I thought it would be a little bit complicated for users to re-enable this dump feature if I chose not to provide this parameter. If I don't provide the parameter but still want to allow users to re-enable this dump feature, we can provide a way to change the bit flags of quirks (via a new debugfs entry or another module parameter). That said, is there a way to easily change the quirk flags only for this dump feature? For example, assume that the following three quirks are enabled by default: /* quirks */ #define QUIRK_FW_RST_D3COLD BIT(0) #define QUIRK_NO_DEVICE_DUMP BIT(1) #define QUIRK_FOO BIT(2) .driver_data = (void *)(QUIRK_FW_RST_D3COLD | QUIRK_NO_DEVICE_DUMP | QUIRK_FOO), card->quirks = (uintptr_t)dmi_id->driver_data; /* and assume that card->quirks can be changed by users by a file named * "quirks" under debugfs. */ So, the card->quirks will be "7" in decimal by default. Then, if users want to re-enable the dump feature, as far as I know, users need to know the default value "7", then need to know that device_dump is controlled by bit 1. Finally, users can set the new quirk flags "5" in decimal (101 in binary). echo 5 > /sys/kernel/debug/mwifiex/mlan0/quirks I'm glad if there is another nice way to control only the device_dump quirk flag, if we only provide a way to change quirk flags. But at the same time I also think that if someone dare to want to re-enable this feature, maybe the person won't feel it's complicated haha. So, I'll drop the device_dump module parameter and switch to use the quirk framework, adding a way to change the quirk flags value by users. That said, we may even drop disabling the dump. Take a look at my comment I wrote below.
quoted
But what I want to say here as well is that, if the firmware can be fixed, we don't need a patch like this.Sure. That's also where we don't necessarily need more ways to control this from user space (e.g., module parameters), but just better detection of currently broken systems (in the driver). And if firmware ever gets fixed, we can undo the "broken device" detection.
There are two types of firmware crashes we observes: 1) cmd_timedout (other than ps_mode-related) 2) Firmware wakeup failed (ps_mode-related) The #2 is observed when we enabled ps_mode. The #1 is observed for the other causes. And hopefully, verdre (in Cc) found a "fix" [1] for the #1 fw crash. We are trying the fix now. The pull req (still WIP) basically addresses fw crashing by using "non-posted" register writes and uninterruptible wait queue for PCI operations in the kernel driver side. We still don't have any ideas to "fix" the #2 fw crash, but now we don't see the #1 crash anymore so far. I'd like to see where the attempt goes before I start working on this patch here again. Thank you, everyone. [1] https://github.com/linux-surface/kernel/pull/70 [WIP] Properly fix wifi firmware crashes by jonas2515 · Pull Request #70 · linux-surface/kernel
Brian