Re: [PATCH 13/40] autonuma: CPU follow memory algorithm
From: Rik van Riel <hidden>
Date: 2012-07-02 08:17:51
On 07/02/2012 03:56 AM, Nai Xia wrote:
> On 2012-07-02 15:36, Rik van Riel wrote:
>> On 06/30/2012 11:10 AM, Nai Xia wrote:
>>> Yes, pte_numa or pte_young works the same way and they both can
>>> answer the problem of "which pages were accessed since last scan".
>>> For LRU, it's OK, it's quite enough. But for NUMA balancing it's NOT.
>>
>> Getting LRU right may be much more important than getting NUMA
>> balancing right. Retrieving wrongly evicted data from disk can be
>> a million times slower than fetching data from RAM, while the
>> penalty for accessing a remote NUMA node is only 20% or so.
>>
>>> We also should care about the hotness of the page sets, since if the
>>> workloads are complex we should NOT be expecting that "if this page
>>> is accessed once, then it's always in my CPU cache during the whole
>>> last scan interval". The difference between LRU and the problem you
>>> are trying to deal with looks so obvious to me, I am so worried that
>>> you are still mixing them up :(
>>
>> For autonuma, it may be fine to have a lower likelihood of obtaining
>> an optimum result, because the penalty for getting it wrong is so
>> much lower.
>
> I said, I actually want to see some detailed analysis showing that
> this sampling is really playing as important a role in benchmarks as
> it is claimed to. Not a quick "lower likelihood than optimum"
> conclusion... Please, Rik, I know your points, you don't have to
> explain anymore. But I just cannot follow without research data.

What kind of data are you looking for? I have seen a lot of generic
comments in your emails, and one gut feeling about Andrea's sampling
algorithm, but I seem to have missed the details of exactly what you
are looking for.

Btw, I share your feeling that Andrea's sampling algorithm will
probably not be able to distinguish between NUMA nodes that are very
frequent users of a page, and NUMA nodes that use the same page much
less frequently.

However, I suspect that the penalty of getting it wrong will be fairly
low, while the overhead of getting access frequency information will
be prohibitively high. There is a reason nobody uses LRU nowadays, but
a clock style algorithm instead.

-- 
All rights reversed
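
The tradeoff discussed above can be illustrated with a small sketch. This is a deliberately simplified model, not the autonuma code: it assumes one "accessed" bit per node per scan interval (in the spirit of pte_young / pte_numa), and the function names and the access trace are hypothetical. It shows why a sampled bit cannot tell a frequent user of a page from an occasional one, while exact counting can, at the price of recording every access.

```python
from collections import Counter

def scan_with_reference_bit(accesses):
    """Model a per-node 'accessed' bit that the scanner clears each interval.

    The bit only records whether a node touched the page at least once,
    so the result is a set of node ids with no frequency information.
    """
    touched = set()
    for node in accesses:
        touched.add(node)  # setting an already-set bit changes nothing
    return touched

def scan_with_counters(accesses):
    """Model exact per-node access counting (assumed to be too costly in practice)."""
    return Counter(accesses)

# Hypothetical scan interval: node 0 touches the page 1000 times, node 1 once.
accesses = [0] * 1000 + [1]

bit_view = scan_with_reference_bit(accesses)
count_view = scan_with_counters(accesses)

print("reference-bit view:", sorted(bit_view))
print("counter view:", dict(count_view))
```

In the reference-bit view both nodes look identical, which matches the concern about access frequency; the counter view separates them, but only because every single access was recorded.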