Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

[RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Eric Munson <hidden> · 2008-07-28
[PATCH 5/5 V2] [PPC] Setup stack memory segment for hugetlb pages · Eric Munson <hidden> · 2008-07-28
[PATCH 4/5 V2] Build hugetlb backed process stacks · Eric Munson <hidden> · 2008-07-28
Re: [PATCH 4/5 V2] Build hugetlb backed process stacks · Dave Hansen <hidden> · 2008-07-28
[PATCH 2/5 V2] Add shared and reservation control to hugetlb_file_setup · Eric Munson <hidden> · 2008-07-28
[PATCH 1/5 V2] Align stack boundaries based on personality · Eric Munson <hidden> · 2008-07-28
Re: [PATCH 1/5 V2] Align stack boundaries based on personality · Dave Hansen <hidden> · 2008-07-28
[PATCH 3/5] Split boundary checking from body of do_munmap · Eric Munson <hidden> · 2008-07-28
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-07-28
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Eric B Munson <hidden> · 2008-07-28
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Andrew Morton <akpm@linux-foundation.org> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Eric B Munson <hidden> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Eric B Munson <hidden> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Andrew Morton <akpm@linux-foundation.org> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Andrew Morton <akpm@linux-foundation.org> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Christoph Lameter <hidden> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Andrew Morton <akpm@linux-foundation.org> · 2008-07-30
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-08-04
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-08-05
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-08-05
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-08-05
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-08-05
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-08-06
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-08-06
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-08-07
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Dave Hansen <hidden> · 2008-08-07
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-08-11
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Nick Piggin <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Andrew Morton <akpm@linux-foundation.org> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Nick Piggin <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Nick Piggin <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Mel Gorman <hidden> · 2008-07-31
Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks · Michael Ellerman <hidden> · 2008-07-31

From: Nick Piggin <hidden>
Date: 2008-07-31 11:52:20
Also in: linux-mm, lkml

On Thursday 31 July 2008 21:27, Mel Gorman wrote:

> On (31/07/08 16:26), Nick Piggin didst pronounce:
> > I imagine it should be, unless you're using a CPU with separate TLBs for
> > small and huge pages, and your large data set is mapped with huge pages,
> > in which case you might now introduce *new* TLB contention between the
> > stack and the dataset :)
>
> Yes, this can happen, particularly on older CPUs. For example, on my
> crash-test laptop the Pentium III there reports
>
> TLB and cache info:
> 01: Instruction TLB: 4KB pages, 4-way set assoc, 32 entries
> 02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries

Oh? Newer CPUs tend to have unified TLBs?

> > Also, interestingly I have actually seen some CPUs whose memory operations
> > get significantly slower when operating on large pages than small (in the
> > case when there is full TLB coverage for both sizes). This would make
> > sense if the CPU only implements a fast L1 TLB for small pages.
>
> It's also possible there is a micro-TLB involved that only supports small
> pages.

That is the case on a couple of contemporary CPUs I've tested with
(although granted they are engineering samples, but I don't expect
that to be the cause)

> > So for the vast majority of workloads, where stacks are relatively small
> > (or slowly changing), and relatively hot, I suspect this could easily
> > have no benefit at best and slowdowns at worst.
>
> I wouldn't expect an application with small stacks to request its stack
> to be backed by hugepages either. Ideally, it would be enabled because a
> large enough number of DTLB misses were found to be in the stack,
> although catching this sort of data is tricky.

Sure, as I said, I have nothing against this functionality just because
it has the possibility to cause a regression. I was just pointing out
there are a few possibilities there, so it will take a particular type
of app to take advantage of it. I.e. it is not something you would ever
enable "just in case the stack starts thrashing the TLB".

> > But I'm not saying that as a reason not to merge it -- this is no
> > different from any other hugepage allocations and as usual they have to
> > be used selectively where they help... I just wonder exactly where huge
> > stacks will help.
>
> Benchmark-wise, SPECcpu and SPEComp have stack-dependent benchmarks.
> Computations that partition problems with recursion I would expect to
> benefit, as well as some JVMs that heavily use the stack (see how many docs
> suggest setting ulimit -s unlimited). Bit out there, but stack-based
> languages would stand to gain by this. The potential gap is for threaded
> apps, as there will be stacks that are not the "main" stack. Backing those
> with hugepages depends on how they are allocated (malloc, it's easy,
> MAP_ANONYMOUS not so much).

Oh good, then there should be lots of possibilities to demonstrate it.

Thanks,
Nick
