Thread (56 messages) 56 messages, 7 authors, 2011-10-17

Re: [PATCH] thp: tail page refcounting fix

From: Andrea Arcangeli <hidden>
Date: 2011-08-24 00:09:42
Also in: lkml

Hi Michel,

On Tue, Aug 23, 2011 at 12:52:56PM -0700, Michel Lespinasse wrote:
Adding Paul McKenney so he won't spend too much time on RCU cookie
feature until there is a firmer user...
Yep, he already knew because I notified him privately for the same
reason.
Looks like this scheme will work. I'm off in Yosemite for a few days
with my family, but I should be able to review this more thoroughly on
Thursday.
Take your time, and enjoy Yosemite :).
From a few-minutes look, I have a few minor concerns:
- When splitting THP pages, the old tail refcount will be visible as
the _mapcount for a short while after PageTail is cleared; not clear
yet to me if there are unintended side effects to that;
Well it was zero before and that was also wrong it is overwritten
later with the right value well after PageTail is cleared, so it's ok
if previous code was ok. All ptes are set as pmd_trans_splitting so
nothing should mess page_tail->_mapcount it as no mapping can be
created or go away for the duration of the split and regardless any
mapping that exists only exists for the pmd and the head page (tail
pages are invisible to rmap until later).
- (not a concern, but an opportunity) when splitting pages, there are
two atomic adds to the tail _count field, while we know the initial
value is 0. Why not just one straight assignment ? Similarly, the
adjustments to page head count could be added into a local variable
and the page head count could be updated once after all tail pages
have been split off.
That's an optimization I can look into agreed. I guess I just added
one line and not even think too much at optimizing this,
split_huge_page isn't in a fast path.
- Not sure if we could/should add assertions to make sure people call
the right get_page variant.
Not right now or it'd flood when anybody uses O_DIRECT. If O_DIRECT
gets fixes to stop doing this, it sounds definitely good idea.

I already tried adding a printk to the got=1 path and it floods with a
128M/sec dd bs=10M iflag=direct transfer.
The other question I have is about the use of the pagemap.h RCU
protocol for eventual page count stability. With your proposal, this
would now affect only head pages, so THP splitting is fine :) . I'm
not sure who else might use that protocol, but it looks like we should
either make all get_pages_unless_zero call sites follow it (if the
protocol matters to someone) or none (if the protocol turns out to be
obsolete).
I don't see who is using synchronize_rcu to stabilize the page count
so at first sight it seems superfluous there too. Maybe it was a "if
anybody will ever need to stabilize the page count this can be
used". The only calls of synchronize_rcu in mm/* are in memcg and in
mmu notifier which is not meant to synchronize the page count but just
to walk the mmu notifier registration list lockless from the mm
struct.

I guess we need to ask who wrote that function for clarifications on
the page count stabilization. And if one needs really to stabilize the
page count he will also need Paul's rcu_sequence_t feature to do it
really efficiently (which is now on hold, so if that synchronize_rcu
caller really exists that would likely also mean we need
rcu_sequence_t to optimize it properly). My current feeling is if one
needs that feature he's doing something's wrong that could be achieved
somewhere else but I may be biased by the fact this one worked out.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help