Thread (6 messages) 6 messages, 3 authors, 2013-06-07

[PATCH] clk: remove the clk_notifier from clk_notifier_list before free it (was: Re: [BUG] zynq | CCF | SRCU)

From: Sören Brinkmann <hidden>
Date: 2013-06-03 16:50:05
Also in: lkml

Hi Lai,

On Mon, Jun 03, 2013 at 05:17:15PM +0800, Lai Jiangshan wrote:
On 06/01/2013 03:12 AM, S?ren Brinkmann wrote:
quoted
Hi,

we recently encountered some kernel panics when we compiled one of our
drivers as module and tested inserting/removing the module.
Trying to debug this issue, I could reproduce it on the mainline kernel
with a dummy module.

What happens is, that when on driver remove clk_notifier_unregister() is
called and no other notifier for that clock is registered, the kernel
panics.
I'm not sure what is going wrong here. If there is a bug (and if where)
or I'm just using the infrastructure the wrong way,... So, any hint is
appreciated.

I attach the output from the crashing system. The stacktrace indicates a
crash in 'srcu_readers_seq_idx()'.
I also attach the module I used to trigger the issue and a patch on top
of mainline commit a93cb29acaa8f75618c3f202d1cf43c231984644 which has
the DT modifications I need to make the module find its clock and boot
with my initramfs.


	Thanks,
	S?ren


Hi, S?ren Brinkmann

I guess:

modprobe clk_notif_dbg
modprobe clk_notif_dbg -r  # memory corrupt here
modprobe clk_notif_dbg     # access corrupted memroy, but no visiable bug
modprobe clk_notif_dbg -r  # access corrupted memroy, BUG


How the first "modprobe clk_notif_dbg -r" corrupt memroy:

=========

int clk_notifier_unregister(struct clk *clk, struct notifier_block *nb)
{
	struct clk_notifier *cn = NULL;
	int ret = -EINVAL;

	if (!clk || !nb)
		return -EINVAL;

	clk_prepare_lock();

	list_for_each_entry(cn, &clk_notifier_list, node)
		if (cn->clk == clk)
			break;

	if (cn->clk == clk) {
		ret = srcu_notifier_chain_unregister(&cn->notifier_head, nb);

		clk->notifier_count--;

		/* XXX the notifier code should handle this better */
		if (!cn->notifier_head.head) {
			srcu_cleanup_notifier_head(&cn->notifier_head);

===========> the code forgot to remove @cn from the clk_notifier_list
===========> the second "modprobe clk_notif_dbg" will the same @clk and use the same corrupt @cn

			kfree(cn);
		}

	} else {
		ret = -ENOENT;
	}

	clk_prepare_unlock();

	return ret;
}

===========

Could you retry with the following patch?
Thanks for the patch. This fixes it. I can add/remove my module, and it
didn't crash since.
I had some trouble applying it though, due to some encoding hiccups. I
think the '?' in my name might be the culprit. I never know where things
go wrong, whether it's format-patch, or something in the email
transport, the sender or receiver side... But well, looks like full
UTF-8 support is missing somewhere.
Thanks,
Lai

From 5e26b626724139070148df9f6bd0607bc7bc3812 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <redacted>
Date: Mon, 3 Jun 2013 16:59:50 +0800
Subject: [PATCH] clk: remove the clk_notifier from clk_notifier_list before
 free it
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The @cn is stay in @clk_notifier_list after it is freed, it cause
memory corruption.

Example, if @clk is registered(first), unregistered(first),
registered(second), unregistered(second).

The freed @cn will be used when @clk is registered(second),
and the bug will be happened when @clk is unregistered(second):

[  517.040000] clk_notif_dbg clk_notif_dbg.1: clk_notifier_unregister()
[  517.040000] Unable to handle kernel paging request at virtual address 00df3008
[  517.050000] pgd = ed858000
[  517.050000] [00df3008] *pgd=00000000
[  517.060000] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[  517.060000] Modules linked in: clk_notif_dbg(O-) [last unloaded: clk_notif_dbg]
[  517.060000] CPU: 1 PID: 499 Comm: modprobe Tainted: G           O 3.10.0-rc3-00119-ga93cb29-dirty #85
[  517.060000] task: ee1e0180 ti: ee3e6000 task.ti: ee3e6000
[  517.060000] PC is at srcu_readers_seq_idx+0x48/0x84
[  517.060000] LR is at srcu_readers_seq_idx+0x60/0x84
[  517.060000] pc : [<c0052720>]    lr : [<c0052738>]    psr: 80070013
[  517.060000] sp : ee3e7d48  ip : 00000000  fp : ee3e7d6c
[  517.060000] r10: 00000000  r9 : ee3e6000  r8 : 00000000
[  517.060000] r7 : ed84fe4c  r6 : c068ec90  r5 : c068e430  r4 : 00000000
[  517.060000] r3 : 00df3000  r2 : 00000000  r1 : 00000002  r0 : 00000000
[  517.060000] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  517.060000] Control: 18c5387d  Table: 2d85804a  DAC: 00000015
[  517.060000] Process modprobe (pid: 499, stack limit = 0xee3e6238)
[  517.060000] Stack: (0xee3e7d48 to 0xee3e8000)
....
[  517.060000] [<c0052720>] (srcu_readers_seq_idx+0x48/0x84) from [<c0052790>] (try_check_zero+0x34/0xfc)
[  517.060000] [<c0052790>] (try_check_zero+0x34/0xfc) from [<c00528b0>] (srcu_advance_batches+0x58/0x114)
[  517.060000] [<c00528b0>] (srcu_advance_batches+0x58/0x114) from [<c0052c30>] (__synchronize_srcu+0x114/0x1ac)
[  517.060000] [<c0052c30>] (__synchronize_srcu+0x114/0x1ac) from [<c0052d14>] (synchronize_srcu+0x2c/0x34)
[  517.060000] [<c0052d14>] (synchronize_srcu+0x2c/0x34) from [<c0053a08>] (srcu_notifier_chain_unregister+0x68/0x74)
[  517.060000] [<c0053a08>] (srcu_notifier_chain_unregister+0x68/0x74) from [<c0375a78>] (clk_notifier_unregister+0x7c/0xc0)
[  517.060000] [<c0375a78>] (clk_notifier_unregister+0x7c/0xc0) from [<bf008034>] (clk_notif_dbg_remove+0x34/0x9c [clk_notif_dbg])
[  517.060000] [<bf008034>] (clk_notif_dbg_remove+0x34/0x9c [clk_notif_dbg]) from [<c02bb974>] (platform_drv_remove+0x24/0x28)
[  517.060000] [<c02bb974>] (platform_drv_remove+0x24/0x28) from [<c02b9bf8>] (__device_release_driver+0x8c/0xd4)
[  517.060000] [<c02b9bf8>] (__device_release_driver+0x8c/0xd4) from [<c02ba680>] (driver_detach+0x9c/0xc4)
[  517.060000] [<c02ba680>] (driver_detach+0x9c/0xc4) from [<c02b99c4>] (bus_remove_driver+0xcc/0xfc)
[  517.060000] [<c02b99c4>] (bus_remove_driver+0xcc/0xfc) from [<c02bace4>] (driver_unregister+0x54/0x78)
[  517.060000] [<c02bace4>] (driver_unregister+0x54/0x78) from [<c02bbb44>] (platform_driver_unregister+0x1c/0x20)
[  517.060000] [<c02bbb44>] (platform_driver_unregister+0x1c/0x20) from [<bf0081f8>] (clk_notif_dbg_driver_exit+0x14/0x1c [clk_notif_dbg])
[  517.060000] [<bf0081f8>] (clk_notif_dbg_driver_exit+0x14/0x1c [clk_notif_dbg]) from [<c00835e4>] (SyS_delete_module+0x200/0x28c)
[  517.060000] [<c00835e4>] (SyS_delete_module+0x200/0x28c) from [<c000edc0>] (ret_fast_syscall+0x0/0x48)
[  517.060000] Code: e5973004 e7911102 e0833001 e2881002 (e7933101)

CC: stable at kernel.org
Reported-by: S?ren Brinkmann <redacted>
Signed-off-by: Lai Jiangshan <redacted>
Tested-by: S?ren Brinkmann <redacted>

	S?ren
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help