Thread: 5 messages, 2 authors, 2017-10-03

Re: [Oops] memory hot-unplug results fault instruction address at /include/linux/list.h:104

From: Abdul Haleem <hidden>
Date: 2017-09-29 10:50:56
Also in: linuxppc-dev

On Wed, 2017-09-20 at 12:54 -0700, Kees Cook wrote:
On Wed, Sep 20, 2017 at 12:40 AM, Abdul Haleem
[off-list ref] wrote:
On Tue, 2017-09-12 at 12:11 +0530, abdul wrote:
Hi,

Memory hot-unplug on PowerVM LPAR running next-20170911 results in
Faulting instruction address: 0xc0000000002b56c4

which maps to the below code path:

0xc0000000002b56c4 is in __rmqueue (./include/linux/list.h:104).
99     * This is only for internal list manipulation where we know
100    * the prev/next entries already!
101    */
102   static inline void __list_del(struct list_head * prev, struct list_head * next)
103   {
104           next->prev = prev;
105           WRITE_ONCE(prev->next, next);
106   }
107
108   /**
I see another kernel Oops when running transparent hugepages
de-fragmentation test.

The faulting instruction address again points to the same code line:
0xc00000000026f9f4 is in compaction_alloc (./include/linux/list.h:104)

steps to recreate:
-----------------
1. Enable transparent hugepages ("always")
2. Turn off the defrag $ echo 0 > khugepaged/defrag
3. Write random to memory path
4. Set huge pages numbers
5. Turn on defrag $ echo 1 > khugepaged/defrag


new trace:
----------
Unable to handle kernel paging request for data at address 0x5deadbeef0000108
This looks like use-after-list-removal; that value appears to be LIST_POISON1.

Try enabling CONFIG_DEBUG_LIST to see if you get better details?
With the above config enabled, I see the messages below and also call
traces, but no kernel Oops.

BUG: Bad page state in process drmgr  pfn:770c7
page:f000000001dc31c0 count:0 mapcount:0 mapping:f000000001dc31c8 index:0x1
flags: 0x33ffff800000000()
raw: 033ffff800000000 f000000001dc31c8 0000000000000001 00000000ffffffff
raw: 5deadbeef0000100 5deadbeef0000200 0000000000000000 0000000000000000
page dumped because: non-NULL mapping



-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre
