Re: kernel BUG at mm/huge_memory.c:212!
From: David Rientjes <rientjes@google.com>
Date: 2012-11-27 23:47:44
On Tue, 27 Nov 2012, Jiri Slaby wrote:
> Hi,
>
> I've hit BUG_ON(atomic_dec_and_test(&huge_zero_refcount)) in
> put_huge_zero_page right now. There are some "Bad rss-counter state"
> before that, but those are perhaps unrelated as I saw many of them in
> the previous -next. But even with yesterday's next I got the BUG.
>
> [ 7395.654928] BUG: Bad rss-counter state mm:ffff8800088289c0 idx:1 val:-1
> [ 7417.652911] BUG: Bad rss-counter state mm:ffff880008829a00 idx:1 val:-1
> [ 7423.317027] BUG: Bad rss-counter state mm:ffff8800088296c0 idx:1 val:-1
> [ 7463.737596] BUG: Bad rss-counter state mm:ffff88000882ad80 idx:1 val:-2
> [ 7486.462237] BUG: Bad rss-counter state mm:ffff880008829040 idx:1 val:-2
> [ 7499.118560] BUG: Bad rss-counter state mm:ffff880008829040 idx:1 val:-2
> [ 7507.000464] BUG: Bad rss-counter state mm:ffff880008828000 idx:1 val:-2
> [ 7512.898902] BUG: Bad rss-counter state mm:ffff880008829380 idx:1 val:-2
> [ 7522.299066] BUG: Bad rss-counter state mm:ffff8800088296c0 idx:1 val:-2
> [ 7530.471048] BUG: Bad rss-counter state mm:ffff8800088296c0 idx:1 val:-2
> [ 7597.602661] BUG: 'atomic_dec_and_test(&huge_zero_refcount)' is true!
> [ 7597.602683] ------------[ cut here ]------------
> [ 7597.602711] kernel BUG at /l/latest/linux/mm/huge_memory.c:212!
> [ 7597.602732] invalid opcode: 0000 [#1] SMP
> [ 7597.602751] Modules linked in: vfat fat dvb_usb_dib0700 dib0090 dib7000p dib7000m dib0070 dib8000 dib3000mc dibx000_common microcode
> [ 7597.602811] CPU 1
> [ 7597.602823] Pid: 1221, comm: java Not tainted 3.7.0-rc6-next-20121126_64+ #1698 To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M.
> [ 7597.602867] RIP: 0010:[<ffffffff8116839e>] [<ffffffff8116839e>] put_huge_zero_page+0x2e/0x30
> [ 7597.602902] RSP: 0000:ffff8801a58cdd48 EFLAGS: 00010292
> [ 7597.602921] RAX: 0000000000000038 RBX: ffff880183cc0d00 RCX: 0000000000000007
> [ 7597.602944] RDX: 00000000000000b5 RSI: 0000000000000046 RDI: ffffffff81dc605c
> [ 7597.602967] RBP: ffff8801a58cdd48 R08: 746127203a475542 R09: 000000000000047b
> [ 7597.602990] R10: 6365645f63696d6f R11: 7365745f646e615f R12: 00007fd4b3e00000
> [ 7597.603014] R13: 00007fd4b3dcc000 R14: ffff8801bdebab00 R15: 8000000001d94225
> [ 7597.603037] FS: 00007fd4c7ebe700(0000) GS:ffff8801cbc80000(0000) knlGS:0000000000000000
> [ 7597.603064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 7597.603083] CR2: 00007fd4b3dcc498 CR3: 000000017d6bc000 CR4: 00000000000007e0
> [ 7597.603106] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 7597.603129] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 7597.603152] Process java (pid: 1221, threadinfo ffff8801a58cc000, task ffff8801a4655be0)
> [ 7597.603178] Stack:
> [ 7597.603187] ffff8801a58cddc8 ffffffff8116b8d4 ffff8801a38cb000 ffff8801bdebab00
> [ 7597.603219] ffff880183cc0d00 00000001a38cb067 ffffea0006cccb40 ffff8801a3911cf0
> [ 7597.603250] 00000001b332d000 00007fd4b3c00000 ffff880183cc0d00 00007fd4b3dcc498
> [ 7597.603282] Call Trace:
> [ 7597.603293] [<ffffffff8116b8d4>] do_huge_pmd_wp_page+0x7e4/0x900
> [ 7597.603316] [<ffffffff81148755>] handle_mm_fault+0x145/0x330
> [ 7597.603337] [<ffffffff81071e45>] __do_page_fault+0x145/0x480
> [ 7597.603358] [<ffffffff810b42c5>] ? sched_clock_local+0x25/0xa0
> [ 7597.603378] [<ffffffff810b4ec8>] ? __enqueue_entity+0x78/0x80
> [ 7597.603400] [<ffffffff810d0efd>] ? sys_futex+0x8d/0x190
> [ 7597.603420] [<ffffffff810721be>] do_page_fault+0xe/0x10
> [ 7597.603440] [<ffffffff816b7c72>] page_fault+0x22/0x30
> [ 7597.603458] Code: 66 90 f0 ff 0d c0 05 cf 00 0f 94 c0 84 c0 75 02 f3 c3 55 48 c7 c6 60 51 97 81 48 c7 c7 1a 82 94 81 48 89 e5 31 c0 e8 25 60 54 00 <0f> 0b 66 66 66 66 90 55 48 89 e5 53 48 83 ec 08 48 83 7e 08 00
> [ 7597.603640] RIP [<ffffffff8116839e>] put_huge_zero_page+0x2e/0x30
> [ 7597.603664] RSP <ffff8801a58cdd48>
> [ 7597.636299] ---[ end trace 241e96a56fc0cf87 ]---
> [ 7612.907136] SysRq : Keyboard mode set to system default
Thanks for the report. Adding Kirill to the cc since this is from the huge zero page patchset sitting in next and is due to the refcounting on lazy allocation.
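For readers unfamiliar with this kind of check, here is a rough userspace sketch of refcounting on a lazily allocated shared object. It is not the code from the patchset: it assumes a scheme in which the allocator keeps one reference of its own, so an ordinary put should never be the one that drops the count to zero. All names (zero_get, zero_put, zero_release, zero_refcount, zero_obj) are hypothetical, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define OBJ_SIZE 4096

static atomic_int zero_refcount;     /* 0 means "not allocated" */
static _Atomic(void *) zero_obj;     /* the lazily allocated shared object */

/* Take a reference, allocating the object on first use. */
static bool zero_get(void)
{
	for (;;) {
		int old = atomic_load(&zero_refcount);

		/* Fast path: increment only if the count is non-zero. */
		while (old != 0) {
			if (atomic_compare_exchange_weak(&zero_refcount, &old, old + 1))
				return true;
		}

		void *obj = calloc(1, OBJ_SIZE);
		if (!obj)
			return false;

		void *expected = NULL;
		if (atomic_compare_exchange_strong(&zero_obj, &expected, obj)) {
			/* Assumed scheme: one reference for the caller, one kept by the allocator. */
			atomic_store(&zero_refcount, 2);
			return true;
		}
		free(obj);	/* someone else installed the object first; retry */
	}
}

/*
 * Drop a user reference. Returns true if the count reached zero, which is
 * the condition the BUG_ON in the report fires on: it means puts and gets
 * are unbalanced somewhere.
 */
static bool zero_put(void)
{
	return atomic_fetch_sub(&zero_refcount, 1) == 1;
}

/* Drop the allocator's own reference, but only when no users remain. */
static bool zero_release(void)
{
	int one = 1;

	if (!atomic_compare_exchange_strong(&zero_refcount, &one, 0))
		return false;
	free(atomic_exchange(&zero_obj, NULL));
	return true;
}
```

In this model a balanced zero_get()/zero_put() pair leaves the count at 1, and only zero_release() takes it to 0. A zero_put() that returns true means some path dropped a reference it never took, which is the sort of imbalance the BUG above points at.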