Thread: 32 messages, 7 authors, 2017-08-26

Re: [PATCH v6 3/5] mm: introduce mmap3 for safely defining new mmap flags

From: Dan Williams <hidden>
Date: 2017-08-26 15:15:43
Also in: linux-api, linux-fsdevel, linux-xfs, lkml, nvdimm

On Sat, Aug 26, 2017 at 12:40 AM, Helge Deller [off-list ref] wrote:
* Dan Williams [off-list ref]:
On Fri, Aug 25, 2017 at 9:19 AM, Helge Deller [off-list ref] wrote:
On 25.08.2017 18:16, Kirill A. Shutemov wrote:
On Fri, Aug 25, 2017 at 09:02:36AM -0700, Christoph Hellwig wrote:
On Fri, Aug 25, 2017 at 06:58:03PM +0300, Kirill A. Shutemov wrote:
Not all archs are ready for this:

arch/parisc/include/uapi/asm/mman.h:#define MAP_TYPE    0x03            /* Mask for type of mapping */
arch/parisc/include/uapi/asm/mman.h:#define MAP_FIXED   0x04            /* Interpret addr exactly */
I'd be happy to say that we should not care about parisc for
persistent memory.  We'll just have to find a way to exclude
parisc without making life too ugly.
I don't think crippling the mmap() interface for one arch is the right way to
go. I think the interface should be universal.

I may imagine MAP_DIRECT can be useful not only for persistent memory.
For tmpfs instead of mlock()?
On parisc we have
#define MAP_SHARED      0x01            /* Share changes */
#define MAP_PRIVATE     0x02            /* Changes are private */
#define MAP_TYPE        0x03            /* Mask for type of mapping */
#define MAP_FIXED       0x04            /* Interpret addr exactly */
#define MAP_ANONYMOUS   0x10            /* don't use a file */

So, if you need a MAP_DIRECT, wouldn't e.g.
#define MAP_DIRECT      0x08
be possible (for parisc, and others 0x04).
And if MAP_TYPE needs to include this flag on parisc:
#define MAP_TYPE        (0x03 | 0x08)  /* Mask for type of mapping */
The problem here is that to support the new mmap flags the arch needs
to find a flag that is guaranteed to fail on older kernels. Defining
MAP_DIRECT to 0x8 on parisc doesn't work because it will simply be
ignored on older parisc kernels.
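
To illustrate the point above, here is a minimal sketch of how an older kernel treats an unknown flag. It is a simplified model, not the actual kernel code: the function name old_kernel_check_flags and the MODEL_ prefixed macros are hypothetical. The parisc values are the ones quoted in this thread; the 0x0f mask is the MAP_TYPE value from the generic uapi header used by most other architectures.

```c
#include <errno.h>

/* Mapping types, identical on parisc and the generic header. */
#define MODEL_MAP_SHARED        0x01UL
#define MODEL_MAP_PRIVATE       0x02UL

/* Type masks: parisc value from this thread, generic value from the
 * common uapi header. */
#define MODEL_PARISC_MAP_TYPE   0x03UL
#define MODEL_GENERIC_MAP_TYPE  0x0fUL

/*
 * Hypothetical model of the type check in an older kernel: only the
 * bits under type_mask are examined, and any value other than
 * MAP_SHARED or MAP_PRIVATE is rejected. Unknown bits outside the
 * mask are never looked at, so they are silently ignored.
 */
static int old_kernel_check_flags(unsigned long flags, unsigned long type_mask)
{
    switch (flags & type_mask) {
    case MODEL_MAP_SHARED:
    case MODEL_MAP_PRIVATE:
        return 0;
    default:
        return -EINVAL;
    }
}
```

In this model, MAP_SHARED | 0x08 is rejected under the generic mask because 0x08 lies inside 0x0f, but it is accepted under the parisc mask because 0x08 lies outside 0x03 and the new bit is dropped without any error.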

However, it's already the case that several archs have their own
sys_mmap entry points. Those archs that can't follow the common scheme
(only parisc it seems) will need to add a new mmap syscall. I think
that's a reasonable tradeoff to allow every other architecture to add
this support with their existing mmap syscall paths.
I don't want other architectures to suffer just because of parisc.
But adding a new syscall just for usage on parisc won't work either,
because nobody will add code to call it then.
I don't understand this comment. If / when parisc gets around to
adding pmem and dax support, why wouldn't libc grow support for the new
parisc mmap variant? Also, it's not just MAP_DIRECT; you would also
need space for a MAP_SYNC flag.
That means MAP_DIRECT should be defined to MAP_TYPE on parisc until it
later defines an opt-in mechanism to a new syscall that honors
MAP_DIRECT as a valid flag.
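
As a rough sketch of why defining the flag to MAP_TYPE gives the desired failure, the predicate below checks whether a candidate flag value is rejected by an older kernel regardless of which mapping type it is combined with. The helper name flag_fails_on_old_kernel and the SK_ prefixed macros are hypothetical, and the type check is a simplified assumption about older kernels (only MAP_SHARED and MAP_PRIVATE pass under the type mask), not code from the patch series.

```c
#include <stdbool.h>

#define SK_MAP_SHARED   0x01UL
#define SK_MAP_PRIVATE  0x02UL

/* Assumed old-kernel rule: the masked type must be SHARED or PRIVATE. */
static bool sk_type_accepted(unsigned long flags, unsigned long type_mask)
{
    unsigned long type = flags & type_mask;

    return type == SK_MAP_SHARED || type == SK_MAP_PRIVATE;
}

/*
 * Returns true when an old kernel with the given type mask rejects
 * the candidate flag in combination with both mapping types, i.e.
 * the flag can never be silently ignored.
 */
static bool flag_fails_on_old_kernel(unsigned long candidate,
                                     unsigned long type_mask)
{
    return !sk_type_accepted(SK_MAP_SHARED | candidate, type_mask) &&
           !sk_type_accepted(SK_MAP_PRIVATE | candidate, type_mask);
}
```

With the parisc mask 0x03, a candidate of 0x03 (MAP_TYPE) always yields a masked type of 0x03 and is rejected, while a candidate of 0x08 leaves the masked type unchanged and slips through.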
I'd instead propose to introduce an ABI breakage for parisc users
(which aren't many). Most parisc users update their kernel regularly
anyway, because we fixed so many bugs in the latest kernel.

With the following patch pushed down to the stable kernel series,
MAP_DIRECT will fail as expected on those kernels, while we can
keep parisc up with current developments regarding MAP_DIRECT.
The whole point is to avoid an ABI regression and the chance for false
positive results. We're immediately stuck if some application was
expecting 0x8 to be ignored, or conversely an application that
absolutely needs to rely on MAP_SYNC/MAP_DIRECT semantics assumes the
wrong result on a parisc kernel where they are ignored.

I have not seen any patches for parisc pmem+dax enabling so it seems
too early to worry about these "last mile" enabling features of
MAP_DIRECT and MAP_SYNC. In particular parisc doesn't appear to have
ARCH_ENABLE_MEMORY_HOTPLUG, so as far as I can see it can't yet
support the ZONE_DEVICE scheme that is a pre-requisite for MAP_DIRECT.
