Re: [PATCH] Use a spinlock to guard `fep->ptp_clk_on`
From: Csókás Bence <hidden>
Date: 2022-09-01 07:52:54
On 2022. 08. 31. 18:24, Andrew Lunn wrote: >>> Please keep to reverse christmas tree >> >> checkpatch didn't tell me that was a requirement... Easy to fix though. > > checkpatch does not have the smarts to detect this. And it is a netdev > only thing. I thought checkpatch also has the per-subsystem rules, too. > There is also a different between not being able to sleep, and not > being able to process an interrupt for some other hardware. You got a > report about taking a mutex in atomic context. That just means you > cannot sleep, probably because a spinlock is held. That is very > different to not being able to handle interrupts. You only need > spin_lock_irqsave() if the interrupt handler also needs the same spin > lock to protect it from a thread using the spin lock. Alright, I will switch to plain `spin_lock()` then. On 2022. 08. 31. 19:12, Francesco Dolcini wrote:
On Wed, Aug 31, 2022 at 02:56:31PM +0200, Csókás Bence wrote:quoted
Mutexes cannot be taken in a non-preemptible context, causing a panic in `fec_ptp_save_state()`. Replacing `ptp_clk_mutex` by `ptp_clk_lock` fixes this.I would probably add the kernel BUG trace to the commit message.quoted
Fixes: 91c0d987a9788dcc5fe26baafd73bf9242b68900 Fixes: 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 Fixes: f79959220fa5fbda939592bf91c7a9ea90419040Just Fixes: f79959220fa5 ("fec: Restart PPS after link state change >quoted
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>Is this https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ (local) the report?
Yes, precisely.
I just stumbled on the same issue on latest torvalds 6.0-rc3.
[ 22.718465] =============================
[ 22.725431] [ BUG: Invalid wait context ]
[ 22.732439] 6.0.0-rc3-00007-gdcf8e5633e2e #1 Tainted: G W
[ 22.742278] -----------------------------
[ 22.749217] kworker/3:1/45 is trying to lock:
[ 22.757157] c211b71c (&fep->ptp_clk_mutex){+.+.}-{3:3}, at: fec_ptp_gettime+0x30/0xcc
[ 22.770814] other info that might help us debug this:
[ 22.778718] context-{4:4}
[ 22.784065] 4 locks held by kworker/3:1/45:
[ 22.790949] #0: c20072a8 ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x1e4/0x730
[ 22.806494] #1: e6a15f18 ((work_completion)(&(&dev->state_queue)->work)){+.+.}-{0:0}, at: process_one_work+0x1e4/0x730
[ 22.822744] #2: c287a478 (&dev->lock){+.+.}-{3:3}, at: phy_state_machine+0x34/0x228
[ 22.835884] #3: c211b2a4 (&dev->tx_global_lock){+...}-{2:2}, at: netif_tx_lock+0x10/0x1cThank you, this lock was the source of the problem!
Francesco
Bence