Re: [PATCH] [v2] mmc: sdhci-pci-gli: Improve Random 4K Read Performance of GL9763E
From: Renius Chen <hidden>
Date: 2021-08-10 04:23:24
Also in:
linux-pm, lkml
Hi,

First, I'd like to thank you for taking the time to read this mail. We had some issues submitting a patch to MMC, and the reviewer suggested that we ask the PM mailing list for help.

GL9763E is a PCIe card reader. During a sequence of random 4K reads, the idle period between read requests is long, so GL9763E enters ASPM L1 very frequently. As a result, random 4K read performance is very poor.

We tried enlarging the ASPM L1 entry delay so that the idle periods during 4K reads would not send GL9763E into ASPM L1. But that adjustment also affects other use cases: it reduces how often the device enters ASPM L1 under all conditions, so battery life gets shorter. This causes the PLT test to fail.

So we developed a patch to balance 4K read performance against battery life. Our purpose is only to improve 4K read performance, without affecting any other use cases. We monitor the requests; when a sequence of 4K reads is in progress, we modify a vendor-specific register in GL9763E so that the GL9763E hardware disables ASPM L1. We then re-enable ASPM L1 after the 4K reads have finished.

But the MMC reviewers think such behavior may not be suitable for an MMC host driver and believe there may be better ways to achieve our goals. So I'm here to ask for your advice.

Do you have any ideas for this case? Are this scenario and ASPM related to runtime PM? As I understand it, entering and exiting ASPM L0s and L1 are pure hardware behaviors that are not handled by software, and they are different from suspend/resume, runtime PM, and D0/D3, right?

Thanks a lot.

Best regards,
Renius

Adrian Hunter [off-list ref] wrote on Wed, Aug 4, 2021 at 2:26 PM:
> On 19/07/21 12:26 pm, Renius Chen wrote:
>> Adrian Hunter [off-list ref] wrote on Fri, Jul 16, 2021 at 6:27 PM:
>>> On 14/07/21 5:15 am, Renius Chen wrote:
>>>> Hi Adrian,
>>>>
>>>> What do you think of this patch? Or do you have any ideas or suggestions on how to modify it to address Ulf's comments?
>>>
>>> Perhaps try to define your power management requirements in terms of latencies instead of request size, and then take the issue to the power management mailing list and power management maintainers for suggestions. You will probably need to point out why runtime PM doesn't meet your requirements.
>>
>> Hi Adrian,
>>
>> Thanks for your advice. Our purpose is only to improve the performance of 4K reads, and we hope that it doesn't affect any other use cases. If we look into the latencies, it may affect not only 4K reads but also some other use cases.
>
> I just meant that, if you present the problem to people on the power management mailing lists, you probably need to describe the problem at an engineering level, instead of describing your solution at a programming level.
>
>> ASPM behavior is controlled by hardware circuits. Drivers only enable or disable ASPM or set some ASPM parameters, and cannot know when the device enters or exits the L0s/L1 state. So the PM part of drivers may not suit this case.
>>
>> This patch can be divided into two parts:
>> 1. Monitor requests.
>> 2. Set a vendor-specific register of GL9763E.
>>
>> We think part 2 is not a problem. Ulf thinks that the behavior of part 1 should not be implemented in sdhci-pci-gli.c. Do you have any suggestions on where we can implement the monitoring?
>>
>> Thank you.
>>
>> Best regards,
>> Renius
>>>> Thank you.
>>>>
>>>> Best regards,
>>>> Renius
>>>>
>>>> Renius Chen [off-list ref] wrote on Wed, Jul 7, 2021 at 9:49 PM:
>>>>> Ulf Hansson [off-list ref] wrote on Wed, Jul 7, 2021 at 8:16 PM:
>>>>>> [...]
>>>>>>> Thanks, I understand what you mean. I simply searched for the keyword "MMC_READ_MULTIPLE_BLOCK" in the drivers/mmc/host folder and found that some SD/MMC host controller drivers, such as alcor.c, cavium.c, etc., also monitor requests in their driver code. What's the difference between theirs and ours?
>>>>>>
>>>>>> Those checks are there to allow the HWs to be supported properly.
>>>>>>
>>>>>>> And if the code that monitors the requests does not belong in the driver, where should I implement it, and how can I add functions only for GL9763E in that place, in your opinion?
>>>>>>
>>>>>> Honestly, I am not sure what suits your use case best. So far we have used runtime PM with a default auto suspend timeout, in combination with dev PM Qos. In other words, run as fast as possible to complete the requests in the queue then go back to idle and enter a low power state. Clearly, that seems not to be sufficient for your use case, sorry.
>>>>>
>>>>> Yes, runtime PM, auto suspend, and PM QoS are all about the suspend/resume behavior of the system or are related to device power states such as D0/D3. But these are totally different from the ASPM L0s/L1 link states. Entering and exiting ASPM is pure hardware behavior at the link layer and is not handled by any code in drivers/mmc/core or drivers/mmc/host.
>>>>>
>>>>> We'd like to modify the patch according to your comments, but we are also unsure what approach, and what place in the code, suits our use case best. So we wonder how to start the modification and may need some suggestions on how to proceed, sorry.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Best regards,
>>>>> Renius
>>>>>>
>>>>>> Kind regards
>>>>>> Uffe
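
For readers following the thread: the patch itself is not quoted here, but the two-part scheme described above (monitor requests, then toggle a vendor-specific register) can be illustrated with a small standalone C sketch. Everything in it is an assumption made for illustration: the names l1_monitor, set_l1_disabled and l1_monitor_request, the trigger count of 8, and the choice to re-enable L1 on the first request that is not a 4K read are all hypothetical and are not taken from sdhci-pci-gli.c. The register write is simulated by a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values, chosen only for this illustration. */
#define SMALL_READ_BYTES   4096u  /* the "4K read" size discussed in the thread */
#define SMALL_READ_TRIGGER 8u     /* assumed run length before L1 is disabled */

struct l1_monitor {
    unsigned int small_reads;  /* consecutive 4K reads seen so far */
    bool l1_disabled;          /* simulated state of the vendor register bit */
};

/*
 * Stand-in for the vendor-specific register write. A real driver would
 * write to the device here; this sketch only records the state.
 */
static void set_l1_disabled(struct l1_monitor *m, bool disable)
{
    m->l1_disabled = disable;
}

/* Part 1 of the scheme: inspect each request as it is issued. */
static void l1_monitor_request(struct l1_monitor *m, bool is_read, size_t bytes)
{
    if (is_read && bytes == SMALL_READ_BYTES) {
        /* Count the run, saturating at the trigger value. */
        if (m->small_reads < SMALL_READ_TRIGGER)
            m->small_reads++;
        /* Part 2: disable ASPM L1 once the run is long enough. */
        if (m->small_reads >= SMALL_READ_TRIGGER && !m->l1_disabled)
            set_l1_disabled(m, true);
        return;
    }

    /* Any other request ends the run, so ASPM L1 is re-enabled. */
    m->small_reads = 0;
    if (m->l1_disabled)
        set_l1_disabled(m, false);
}
```

Note that this sketch only re-enables L1 when a different kind of request arrives. If the 4K reads simply stop, L1 stays disabled, so a real implementation would presumably also need some timeout, which is left out here.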