Thread: 69 messages, 9 authors, 2011-08-08

Re: [RFC/PATCH] mm/futex: Fix futex writes on archs with SW tracking of dirty & young

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2011-07-21 22:53:01
Also in: lkml

On Thu, 2011-07-21 at 15:36 -0700, Andrew Morton wrote:
> On Tue, 19 Jul 2011 14:29:22 +1000
> Benjamin Herrenschmidt [off-list ref] wrote:
>
> > The futex code currently attempts to write to user memory within
> > a pagefault disabled section, and if that fails, tries to fix it
> > up using get_user_pages().
> >
> > This doesn't work on archs where the dirty and young bits are
> > maintained by software, since they will gate access permission
> > in the TLB, and will not be updated by gup().
> >
> > In addition, there's an expectation on some archs that a
> > spurious write fault triggers a local TLB flush, and that is
> > missing from the picture as well.
> >
> > I decided that adding those "features" to gup() would be too much
> > for this already too complex function, and instead added a new
> > simpler fixup_user_fault() which is essentially a wrapper around
> > handle_mm_fault() which the futex code can call.
> >
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > ---
> >
> > Shan, can you test this ? It might not fix the problem
>
> um, what problem.  There's no description here of the user-visible
> effects of the bug hence it's hard to work out what kernel version(s)
> should receive this patch.

Shan could give you an actual example (it was in the previous thread),
but basically, livelock as the kernel keeps trying and trying the
in_atomic op and never resolves it.
 
> What kernel version(s) should receive this patch?

I haven't dug. Probably anything it applies on as far as we did that
trick of atomic + gup() for futex.

> > since I'm
> > starting to have the nasty feeling that you are hitting what is
> > somewhat a subtly different issue or my previous patch should
> > have worked (but then I might have done a stupid mistake as well)
> > but let us know anyway.
>
> I assume that Shan reported the secret problem so I added the
> reported-by to the changelog.

He did :-) Shan, care to provide a rough explanation of what you
observed ?

Also Russell confirmed that ARM should be affected as well.

Cheers,
Ben.