Re: RTL8402 stops working after hibernate/resume
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: 2020-09-25 09:44:18
On 25.09.2020 10:54, Petr Tesarik wrote:
On Fri, 25 Sep 2020 09:30:37 +0200 Petr Tesarik [off-list ref] wrote:
On Thu, 24 Sep 2020 22:12:24 +0200 Heiner Kallweit [off-list ref] wrote:
On 24.09.2020 21:14, Petr Tesarik wrote:
On Wed, 23 Sep 2020 11:57:41 +0200 Heiner Kallweit [off-list ref] wrote:
On 03.09.2020 10:41, Petr Tesarik wrote:
Hi Heiner,

this issue was on the back-burner for some time, but I've got some interesting news now.

On Sat, 18 Jul 2020 14:07:50 +0200 Heiner Kallweit [off-list ref] wrote:
[...] Maybe the following gives us an idea: Please do "ethtool -d <if>" after boot and after resume from suspend, and check for differences.

The register dump did not reveal anything of interest - the only differences were in the physical addresses after a device reopen.

However, knowing that reloading the driver can fix the issue, I copied the initialization sequence from init_one() to rtl8169_resume() and gave it a try. That works!

Then I started removing the initialization calls one by one. This exercise left me with a call to rtl_init_rxcfg(), which simply sets the RxConfig register. In other words, this is the difference between 5.8.4 and my working version:

--- linux-orig/drivers/net/ethernet/realtek/r8169_main.c 2020-09-02 22:43:09.361951750 +0200
+++ linux/drivers/net/ethernet/realtek/r8169_main.c 2020-09-03 10:36:23.915803703 +0200
@@ -4925,6 +4925,9 @@
 	clk_prepare_enable(tp->clk);
+	if (tp->mac_version == RTL_GIGA_MAC_VER_37)
+		RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
+
 	if (netif_running(tp->dev))
 		__rtl8169_resume(tp);

This is quite surprising, at least when the device is managed by NetworkManager, because then it is closed on wakeup, and the open method should call rtl_init_rxcfg() anyway. So, it might be a timing issue, or incorrect order of register writes.

Thanks for the analysis. If you manually bring down and up the interface, do you see the same issue?

I'm not quite sure what you mean, but if the interface is configured (and NetworkManager is stopped), I can do 'ip link set eth0 down' and then 'ip link set eth0 up', and the interface is fully functional.
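The before/after register comparison suggested above ("ethtool -d <if>" after boot and after resume) could be automated roughly as follows. This is only a sketch: it assumes the two dumps were saved as plain hex dumps with rows of the form "0x0040: 0f 40 02 00" (the exact layout printed by ethtool for this chip is an assumption), and the names parse_hex_dump() and diff_dumps() as well as the sample data are hypothetical, not taken from the thread.

```python
import re

# Matches one hex-dump row such as "0x0040:  0f 40 02 00".
# The row layout is an assumption about the saved `ethtool -d` output.
LINE_RE = re.compile(r"^\s*0x([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2}\s*)+)$")


def parse_hex_dump(text):
    """Return {byte_offset: value} for every byte found in the dump."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers and anything that is not a hex row
        base = int(m.group(1), 16)
        for i, byte in enumerate(m.group(2).split()):
            regs[base + i] = int(byte, 16)
    return regs


def diff_dumps(before, after):
    """Return sorted (offset, before_value, after_value) for differing bytes."""
    a, b = parse_hex_dump(before), parse_hex_dump(after)
    return [(off, a.get(off), b.get(off))
            for off in sorted(set(a) | set(b))
            if a.get(off) != b.get(off)]


def fmt(value):
    """Format a byte, or '--' when the offset is absent from one dump."""
    return "--" if value is None else f"{value:02x}"


# Made-up sample data for illustration only.
BOOT = """Offset          Values
------          ------
0x0000:         00 11 22 33
0x0004:         aa bb cc dd
"""
RESUME = """Offset          Values
------          ------
0x0000:         00 11 22 33
0x0004:         aa bb 00 dd
"""

if __name__ == "__main__":
    for off, old, new in diff_dumps(BOOT, RESUME):
        print(f"0x{off:04x}: {fmt(old)} -> {fmt(new)}")
```

With real data, the two strings would be replaced by the contents of the saved dump files. Offsets that hold DMA addresses are expected to differ after a device reopen, as noted above, so those would have to be ignored by hand.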
What is the value of RxConfig when entering the resume function?

I added a dev_info() to rtl8169_resume(). First with NetworkManager active (i.e. interface down on suspend):

[ 525.956675] r8169 0000:03:00.2: RxConfig after resume: 0x0002400f

Then I re-tried with NetworkManager stopped (i.e. interface up on suspend). Same result:

[ 785.413887] r8169 0000:03:00.2: RxConfig after resume: 0x0002400f

I hope that's what you were asking for...

Petr T

rtl8169_resume() has been changed in 5.9, therefore the patch doesn't apply cleanly on older kernel versions. Can you test the following on a 5.9-rc version or linux-next?

I tried installing 5.9-rc6, but it freezes hard at boot, last message is:

[ 14.916259] libphy: r8169: probed
This doesn't necessarily mean that the r8169 driver crashes the system. Other things could be running in parallel. Does it freeze without any further message?
At this point, I suspect you're right that the BIOS is seriously buggy. Let me check if ASUSTek has released any update for this model.

Hm, it took me about an hour wondering why I cannot flash the 314 update, but then I finally noticed that this was for X543, while mine is an X453... *sigh*

So, I'm at BIOS version 214, released in 2015, and that's the latest version. There are some older versions available, but the BIOS Flash utility won't let me downgrade.

Does it make sense to bisect the change that broke the driver for me, or should I rather dispose of this waste^Wlaptop in an environmentally friendly manner? I mean, would you eventually accept a workaround for a few machines with a broken BIOS?
If the workaround is small and there's little chance to break other stuff: then usually yes. If you can spend the effort to bisect the issue, this would be appreciated.
Petr T
Heiner
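
For reference, the RxConfig value logged on resume earlier in this thread (0x0002400f) can be compared with the value the proposed workaround writes (RX128_INT_EN | RX_DMA_BURST). The sketch below does only that bit arithmetic. The two bit definitions are assumptions recalled from r8169_main.c and should be checked against the kernel tree in use; missing_bits() is a hypothetical helper name.

```python
# Bit values are assumptions recalled from r8169_main.c; verify them
# against the kernel tree you are actually building.
RX128_INT_EN = 1 << 15
RX_DMA_BURST = 7 << 8

# RxConfig value logged by dev_info() on resume, as reported in the thread.
OBSERVED = 0x0002400F

# Value written by the workaround from the thread.
WORKAROUND = RX128_INT_EN | RX_DMA_BURST


def missing_bits(observed, wanted):
    """Return the bits that are set in `wanted` but clear in `observed`."""
    return wanted & ~observed


if __name__ == "__main__":
    print(f"workaround value:  0x{WORKAROUND:08x}")
    print(f"missing on resume: 0x{missing_bits(OBSERVED, WORKAROUND):08x}")
```

If those definitions hold, none of the bits written by the workaround are set in the value seen on resume. That is consistent with the observation that rewriting RxConfig helps, but it does not by itself explain why the register ends up in that state.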