Thread (11 messages) 11 messages, 3 authors, 2014-12-08

Re: [PATCH] Revert "MIPS: Remove race window in page fault handling"

From: Lars Persson <lars.persson@axis.com>
Date: 2014-12-03 09:40:10
Also in: lkml

Hi Leonid

First let me describe the mechanism of this race condition, which was a
fault in the kernel's MIPS architecture code. Specifically in its
implementation of lazy dcache flushing. AFAIK, it would only hit on
systems where the pagein code path writes to the page from the CPU.

The order of calls is:
flush_dcache_page() (from the FS's readpage)
set_pte_at()
update_mmu_cache()

The thread number one has executed the set_pte_at() when thread number
two hits the same page. It finds a valid PTE and proceeds to execute
code from a page that is not yet flushed to the point of I/D coherency.
That flush would happen in update_mmu_cache().

My patch does increase number of cache flushes for CoW yes and there
could be an optimization opportunity by playing tricks with the pte_t to
include information about executability of the mapping. 

Reverting the patch is a big no-no, then we go back to a state of
undefined CPU behavior.

- Lars


On Wed, 2014-12-03 at 04:25 +0100, Leonid Yegoshin wrote:
This reverts commit 64f23ab30b1fec86b91a48bf1f088840e5299971
(commit 2a4a8b1e5d9d343e13ff22e19af7b353f7b52d6f in Linux 'master')

Unfortunately the original commit effectively kills Imagination MIPS CPUs
performance because it enforces page cache flush each time then PTE is changed
for our CPUs. Basically - each CoW, each process start, each library attachment
or detachment. And it happens even in "fully-coherent" systems (in MIPS sense -
we have non coherent I-cache). Last tendency for increasing page size to 16K-64K
even exaggerate the performance loss.

Original commit discussion says:

    Kernel call stacks showed one thread handling an illegal instruction
    exception while another thread was somewhere around the
    set_pte_at/update_mmu_cache calls for the same page.

If this is taken correct then it points to a major problem unrelated to MIPS -
if by chance a first thread hits the page just before set_pte_at then it should
get a non-present PTE: set_pte_at with _PAGE_PRESENT/_PAGE_VALID flags can be
used only on "disactivated" PTE, after pte_clear* functions, effectively -
non-present, non-valid. And application can get SIGSEGV. It is a major race
condition and kernel should not offer it to applications. And in my
understanding set_pte_at is used in cases then page is available after kernel
finishes page handling, at least in theory.

Test note: I ran MIPS 1004K with 8 VPEs on 4 cores around 1.5 month on SOAK test
and never seen that problem on 3.10 kernel.

Signed-off-by: Leonid Yegoshin <redacted>
 
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help