Thread (21 messages), 6 authors, 2016-09-09

Re: [PATCH][v8] PM / hibernate: Verify the consistent of e820 memory map by md5 value

From: Pavel Machek <hidden>
Date: 2016-08-31 11:03:36
Also in: lkml

Hi!
There may be problems going forward, but whether or not they actually happen
depends on what the differences are.  So while an e820 mismatch indicates that
things may go wrong, it doesn't necessarily mean that they will.
Well, "memory won't get corrupted right away" seems like a good reason to
panic the machine ASAP.

You can flip some bits in memory, and it may not cause problems. Still
if you know some bits in memory were flipped, you'd better panic the
machine. Continuing is unsafe.

If you could guarantee that machine will panic down the line, and not
something worse, you'd be right.

But at least in the case where there is _less_ memory available after
resume, the kernel will write into BIOS-reserved memory and bad things
will happen. Yes, it usually panics, but it is quite clear it could
corrupt memory, too.
That depends a good deal on what those ranges were reserved for.
There very well may not be anything vital in there.
Umm. Yes, you can also flip some bits in memory, and not hit anything
vital.
Also, that panic() may cause hibernation to stop working in a sort of hard and
nasty way where it used to work flawlessly previously and that would be a
regression, so not really acceptable.
Well, turning a memory corruption bug into a panic is an improvement, not
a regression.
Since we don't do anything about these problems today and presumably
people use hibernation on the affected systems, there are reasons to
think that the problem is not quite as grave as you're painting it.

But that aside, adding a panic() like in this patch isn't particularly
useful anyway, because it panics the restore kernel.  It is sufficient
to make arch_hibernation_header_restore() return an error to actually
fail the resume and cause the restore kernel to discard the image.
And that would preserve the information about the failure in the
kernel log at least.
I don't think people are using hibernation today on affected systems;
they are getting random oopses/panics, and that's how this thread started.

Anyway, I agree that failing the resume is preferable to panic().
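
To make the "fail the resume instead of panic()" idea concrete, here is a
minimal userspace sketch, not the actual patch. The struct, the function
names header_restore() and try_resume(), the -ENODEV return value and the
log messages are all assumptions made for illustration; only the idea that
the header-restore hook returns an error so the image gets discarded comes
from the discussion above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DIGEST_SIZE 16 /* length of an MD5 digest in bytes */

/* Hypothetical, simplified image header: only the saved memory-map digest. */
struct image_header {
	uint8_t e820_digest[DIGEST_SIZE];
};

/*
 * Hypothetical stand-in for the arch header-restore hook.
 * Returns 0 when the digest saved in the image matches the digest of the
 * current memory map, or a negative errno (assumed: -ENODEV) otherwise.
 * It never panics; the caller decides what to do with the error.
 */
static int header_restore(const struct image_header *hdr,
			  const uint8_t current_digest[DIGEST_SIZE])
{
	if (memcmp(hdr->e820_digest, current_digest, DIGEST_SIZE) != 0) {
		fprintf(stderr, "hibernate: memory map changed since the image was written\n");
		return -ENODEV;
	}
	return 0;
}

/*
 * Hypothetical caller: on error the image is discarded and a normal boot
 * would follow, so the failure message stays available in the log.
 */
static int try_resume(const struct image_header *hdr,
		      const uint8_t current_digest[DIGEST_SIZE])
{
	int err = header_restore(hdr, current_digest);

	if (err) {
		fprintf(stderr, "hibernate: discarding image (error %d)\n", err);
		return err;
	}
	/* A real restore path would continue loading the image here. */
	return 0;
}
```

With this shape, a mismatch costs the user the hibernated session but
leaves a running system and a readable log, rather than a panicked
restore kernel.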

Thanks and best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html