Thread: 13 messages, 5 authors, 2021-10-13

Re: kvm crash in 5.14.1?

From: "Darrick J. Wong" <djwong@kernel.org>
Date: 2021-10-04 16:54:35
Also in: linux-fsdevel, linux-mm, lkml

On Thu, Sep 30, 2021 at 10:59:57AM -0700, Darrick J. Wong wrote:
On Wed, Sep 29, 2021 at 03:21:09PM +0000, Sean Christopherson wrote:
On Tue, Sep 28, 2021, Stephen wrote:
Hello,

I got this crash again on 5.14.7 in the early morning of the 27th.
Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
minutes.
...
BUG: kernel NULL pointer dereference, address: 0000000000000068
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G            E     5.14.7 #32
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
AORUS ELITE WIFI, BIOS F35 07/08/2021
RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
68 a0 a3 >
I haven't reproduced the crash, but the code signature (CMP against an absolute
address) is quite distinct, and is consistent across all three crashes.  I'm pretty
sure the issue is that page_is_secretmem() doesn't check for a null page->mapping,
e.g. if the page is truncated, which IIUC can happen in parallel since gup() doesn't
hold the lock.

I think this should fix the problems?
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 21c3771e6a56..988528b5da43 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
        mapping = (struct address_space *)
                ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);

-       if (mapping != page->mapping)
+       if (!mapping || mapping != page->mapping)
I'll roll this out on my vm host and try to re-run the mass fuzztest
overnight, though IT claims they're going to kill power to the whole
datacenter until Monday(!)...
...which they did, 30 minutes after I sent this email. :(

I'll hopefully be able to report back to the list in a day or two.

--D
                return false;

        return mapping->a_ops == &secretmem_aops;
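
For readers following along, here is a minimal userspace sketch of the patched check, not the kernel code itself. The struct definitions are stripped-down stand-ins holding only the fields the check touches, the name page_is_secretmem_sketch() is hypothetical, and PAGE_MAPPING_FLAGS is assumed to be the two low bits (0x3). It illustrates why a NULL page->mapping, as described above for a truncated page, has to be rejected before a_ops is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption: only the
 * fields this check touches are modelled). */
struct address_space_operations { int unused; };
struct address_space { const struct address_space_operations *a_ops; };
struct page { struct address_space *mapping; };

/* Assumed value: the low two bits of page->mapping carry flags. */
#define PAGE_MAPPING_FLAGS ((uintptr_t)0x3)

static const struct address_space_operations secretmem_aops = { 0 };

/* Hypothetical name; mirrors the logic of the patched helper. */
static bool page_is_secretmem_sketch(const struct page *page)
{
	/* Strip the flag bits to recover the plain pointer. */
	struct address_space *mapping = (struct address_space *)
		((uintptr_t)page->mapping & ~PAGE_MAPPING_FLAGS);

	/*
	 * Without the !mapping test, a page whose mapping was cleared
	 * passes the second comparison (NULL == NULL) and the a_ops load
	 * below reads from a small offset off a NULL pointer.
	 */
	if (!mapping || mapping != page->mapping)
		return false;

	return mapping->a_ops == &secretmem_aops;
}
```

With the extra test, a page whose mapping is NULL returns false instead of faulting, and pages with flag bits set in page->mapping are rejected as before.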