Re: OOM kills with lots of free swap
From: Luigi Semenzato <hidden>
Date: 2017-06-29 17:46:40
Well, my apologies, I haven't been able to reproduce the problem, so there's nothing to go on here. We had a bug (a local patch) which caused this, then I had a bug in my test case, so I was confused.

I also have a recollection of this happening in older kernels (3.8 I think), but I am not going to go back that far since even if the problem exists, we have no evidence it happens frequently.

Thanks!

On Tue, Jun 27, 2017 at 8:50 AM, Michal Hocko [off-list ref] wrote:
> On Tue 27-06-17 08:22:36, Luigi Semenzato wrote:
> > (sorry, I forgot to turn off HTML formatting)
> >
> > Thank you, I can try this on ToT, although I think that the problem is not with the OOM killer itself but earlier---i.e. invoking the OOM killer seems unnecessary and wrong.
> >
> > Here's the question. The general strategy for page allocation seems to be (please correct me as needed):
> >
> > 1. look in the free lists
> > 2. if that did not succeed, try to reclaim, then try again to allocate
> > 3. keep trying as long as progress is made (i.e. something was reclaimed)
> > 4. if no progress was made and no pages were found, invoke the OOM killer.
>
> Yes that is the case very broadly speaking. The hard question really is what "no progress" actually means. We use "no pages could be reclaimed" as the indicator. We cannot blow up at the first such instance of course because that could be too early (e.g. data under writeback and many other details). With 4.7+ kernels this is implemented in should_reclaim_retry. Prior to the rework we used to rely on zone_reclaimable which simply checked how many pages we have scanned since the last page has been freed and if that is 6 times the reclaimable memory then we simply give up. It had some issues described in 0a0337e0d1d1 ("mm, oom: rework oom detection").
>
> > I'd like to know if that "progress is made" notion is possibly buggy. Specifically, does it mean "progress is made by this task"? Is it possible that resource contention creates a situation where most tasks in most cases can reclaim and allocate, but one task randomly fails to make progress?
>
> This can happen, although it is quite unlikely. We are trying to throttle allocations but you can hardly fight a consistent bad luck ;) In order to see what is going on in your particular case we need an oom report though.
>
> --
> Michal Hocko
> SUSE Labs
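
For illustration only, here is a minimal user-space sketch in C of the four-step loop and the pre-4.7 give-up rule described in the quoted text above. It is not kernel code: struct toy_zone and the toy_* functions are hypothetical names invented for this sketch, and the reclaim pass is reduced to two made-up parameters (pages scanned per pass and pages freed per pass). The only detail taken from the thread is the threshold itself: stop retrying once the pages scanned since the last freed page reach 6 times the reclaimable memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a memory zone; not a kernel structure. */
struct toy_zone {
    unsigned long free_pages;         /* pages on the free list */
    unsigned long reclaimable_pages;  /* pages reclaim could still free */
    unsigned long scanned_since_free; /* pages scanned since the last free */
};

/*
 * Give-up rule as described in the thread for pre-4.7 kernels: the zone
 * still counts as reclaimable while fewer than 6x the reclaimable pages
 * have been scanned since the last page was freed.
 */
static bool toy_zone_reclaimable(const struct toy_zone *z)
{
    return z->scanned_since_free < 6 * z->reclaimable_pages;
}

/*
 * One simulated reclaim pass: scans `batch` pages and frees at most
 * `yield` of them. Freeing anything counts as progress and resets the
 * scan counter. Returns the number of pages freed.
 */
static unsigned long toy_reclaim(struct toy_zone *z, unsigned long batch,
                                 unsigned long yield)
{
    unsigned long freed = yield < z->reclaimable_pages
                              ? yield : z->reclaimable_pages;

    z->scanned_since_free += batch;
    if (freed > 0) {
        z->reclaimable_pages -= freed;
        z->free_pages += freed;
        z->scanned_since_free = 0;
    }
    return freed;
}

/*
 * Allocate one page following steps 1-4 from the thread. Returns true on
 * success and false where the OOM killer would be invoked. `passes`
 * receives the number of reclaim passes that were needed.
 */
static bool toy_alloc_page(struct toy_zone *z, unsigned long batch,
                           unsigned long yield, unsigned long *passes)
{
    assert(batch > 0); /* each pass must scan something, or we never stop */
    *passes = 0;
    for (;;) {
        if (z->free_pages > 0) {      /* step 1: look in the free list */
            z->free_pages--;
            return true;
        }
        if (!toy_zone_reclaimable(z)) /* step 4: no progress, no pages */
            return false;
        toy_reclaim(z, batch, yield); /* steps 2-3: reclaim, then retry */
        (*passes)++;
    }
}
```

In this toy model, a zone with 0 free and 100 reclaimable pages, scanned 32 pages per pass with nothing ever freed (say, everything is under writeback), gives up after 19 passes, when 608 scanned pages exceed the 600-page threshold. If each pass frees even a few pages, the counter resets and the allocation succeeds on the next iteration.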
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .