Re: [PATCH net-next v1 1/1] net: dsa: microchip: ksz9477: export HW stats over stats64 interface
From: Vladimir Oltean <olteanv@gmail.com>
Date: 2022-02-18 10:43:42
Also in:
lkml
On Fri, Feb 18, 2022 at 09:39:56AM +0100, Oleksij Rempel wrote:
On Thu, Feb 17, 2022 at 05:55:54PM +0200, Vladimir Oltean wrote:quoted
On Thu, Feb 17, 2022 at 03:07:26PM +0100, Oleksij Rempel wrote:quoted
+static void ksz9477_get_stats64(struct dsa_switch *ds, int port, + struct rtnl_link_stats64 *stats) +{ + struct ksz_device *dev = ds->priv; + struct ksz9477_stats_raw *raw; + struct ksz_port_mib *mib; + int ret; + + mib = &dev->ports[port].mib; + raw = (struct ksz9477_stats_raw *)mib->counters; + + mutex_lock(&mib->cnt_mutex);The eternal problem, ndo_get_stats64 runs in atomic context, mutex_lock() sleeps. Please test your patches with CONFIG_DEBUG_ATOMIC_SLEEP=y.Good point, thx! I reworked the code. Beside, I get this warning with differnt locking validators: [ 153.140000] br0: port 1(lan2) entered blocking state [ 153.190000] br0: port 1(lan2) entered disabled state [ 153.320000] device lan2 entered promiscuous mode [ 153.350000] ------------[ cut here ]------------ [ 153.350000] WARNING: CPU: 0 PID: 71 at net/core/dev.c:7913 __dev_set_promiscuity+0x10c/0x138 [ 153.360000] RTNL: assertion failed at net/core/dev.c (7913) [ 153.370000] Modules linked in: atmel_aes atmel_tdes atmel_sha [ 153.370000] CPU: 0 PID: 71 Comm: kworker/u2:5 Not tainted 5.17.0-rc2-00714-g845f6fa17e48-dirty #33 [ 153.370000] Hardware name: Atmel SAMA5 [ 153.370000] Workqueue: dsa_ordered dsa_slave_switchdev_event_work [ 153.370000] unwind_backtrace from show_stack+0x18/0x1c [ 153.370000] show_stack from dump_stack_lvl+0x58/0x70 [ 153.370000] dump_stack_lvl from __warn+0xd8/0x228 [ 153.370000] __warn from warn_slowpath_fmt+0x98/0xc8 [ 153.370000] warn_slowpath_fmt from __dev_set_promiscuity+0x10c/0x138 [ 153.370000] __dev_set_promiscuity from __dev_set_rx_mode+0x8c/0x98 [ 153.370000] __dev_set_rx_mode from dev_uc_add+0x84/0x8c [ 153.370000] dev_uc_add from dsa_port_host_fdb_add+0x48/0x80 [ 153.370000] dsa_port_host_fdb_add from dsa_slave_switchdev_event_work+0x1dc/0x254 [ 153.370000] dsa_slave_switchdev_event_work from process_one_work+0x2b0/0x7d4 [ 153.370000] process_one_work from worker_thread+0x4c/0x53c [ 153.370000] worker_thread from kthread+0xf8/0x12c [ 153.370000] kthread from ret_from_fork+0x14/0x2c [ 153.370000] Exception stack(0xc1f13fb0 to 0xc1f13ff8) [ 153.370000] 3fa0: 00000000 00000000 00000000 00000000 [ 153.370000] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 153.370000] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 153.380000] irq event stamp: 0 [ 153.390000] hardirqs last enabled at (0): [<00000000>] 0x0 [ 153.390000] hardirqs last disabled at (0): [<c0124b38>] copy_process+0x7d8/0x194c [ 153.400000] softirqs last enabled at (0): [<c0124b38>] copy_process+0x7d8/0x194c [ 153.410000] softirqs last disabled at (0): [<00000000>] 0x0 [ 153.410000] ---[ end trace 0000000000000000 ]--- [ 153.420000] device eth0 entered promiscuous mode [ 153.770000] ksz9477-switch spi1.0 lan2: configuring for phy/gmii link mode [ 155.040000] ksz9477-switch spi1.0 lan4: Link is Down [ 156.960000] ksz9477-switch spi1.0 lan2: Link is Up - 1Gbps/Full - flow control rx/tx Is it something known?
Thanks for reporting. It isn't something known (at least to me - who knows whether somebody may have been chuckling while I was gloating that I removed rtnl_lock() from the DSA switchdev FDB notifiers). It turns out that dsa_port_host_fdb_add() needs rtnl_lock() around dev_uc_add(), at least if the master doesn't support IFF_UNICAST_FLT. It's pretty nasty. If we add an rtnl_lock() there, we'll deadlock from the code paths that call dsa_flush_workqueue(), which have rtnl_lock() already held. The only way I think we can get out of this is if we call dev_uc_add() only if the master has IFF_UNICAST_FLT. If it doesn't, we should force a call to dsa_master_set_promiscuity() during dsa_master_setup(). Not exactly the cleanest solution, but at least it shouldn't deadlock and it should work. Andrew, Florian, Vivien?