Thread: 30 messages, 8 authors, 2017-01-18

Re: [LSF/MM TOPIC] Un-addressable device memory and block/fs implications

From: Jerome Glisse <hidden>
Date: 2016-12-13 20:22:49
Also in: linux-fsdevel, linux-mm

On Tue, Dec 13, 2016 at 12:01:04PM -0800, James Bottomley wrote:
> On Tue, 2016-12-13 at 13:55 -0500, Jerome Glisse wrote:
> > On Tue, Dec 13, 2016 at 10:20:52AM -0800, James Bottomley wrote:
> > > On Tue, 2016-12-13 at 13:15 -0500, Jerome Glisse wrote:
> > > > I would like to discuss un-addressable device memory in the
> > > > context of filesystems and block devices. Specifically, how to
> > > > handle write-back, read, ... when a filesystem page is migrated
> > > > to device memory that the CPU cannot access.
> > > >
> > > > I intend to post a patchset leveraging the same idea as the
> > > > existing block bounce helper (block/bounce.c) to handle this. I
> > > > believe this is worth discussing during the summit, to see how
> > > > people feel about such a plan and whether they have better ideas.
> > > Isn't this pretty much what the transcendent memory interfaces we
> > > currently have are for?  Its current use cases seem to be compressed
> > > swap and distributed memory, but there doesn't seem to be any reason
> > > in principle why you can't use the interface as well.
> > I am not a specialist in tmem or cleancache
>
> Well, that makes two of us; I just got to sit through Dan Magenheimer's
> talks and some stuff stuck.
>
> > but my understanding is that there is no way to allow a file-backed
> > page to be dirtied while it is in this special memory.
>
> Unless you have some other definition of dirtied, I believe that's what
> an exclusive tmem get in frontswap actually does.  It marks the page
> dirty when it comes back because it may have been modified.
Well, frontswap only supports anonymous or shared pages, not arbitrary
filemap pages, so it doesn't help with what I am aiming at :) Note that
in my case the device reports accurate dirty information (whether the
device modified the page or not), assuming hardware bugs don't exist.

> > In my case, when you migrate a page to the device, it may very well
> > be that the device writes something to it (the results of some sort
> > of computation). So a page might migrate to device memory clean but
> > return from it dirty.
> >
> > The second aspect is that even though the memory I am dealing with is
> > un-addressable, I still have a struct page for it and I want to be
> > able to use regular page migration.
>
> Tmem keeps a struct page ... what's the problem with page migration?
> The fact that tmem locks the page when it's not addressable and you
> want to be able to migrate the page even when it's not addressable?
Well, the way cleancache and frontswap work is that they are used when
the kernel is trying to make room or evict something. In my case it is
the device that triggers the migration for a range of virtual addresses
of a process. Sure, I could make a weird helper that would force the
pages I want to migrate into frontswap or cleancache, but that seems
counterintuitive to me.

One extra requirement for me is to be able to easily and quickly find
the migrated page by looking at the CPU page table of the process.
Frontswap adds a level of indirection, where I need to go through
frontswap to find the memory. With cleancache there isn't even any
information left (the page table entry is cleared).

> > So given my requirements, I didn't think that cleancache was the way
> > to address them. Maybe I am wrong.
>
> I'm not saying it is, I just asked if you'd considered it, since the
> requirements look similar.
Yes, I briefly considered it, but from the high-level overview I had, it
did not seem to address all my requirements. Maybe that is because I
lack in-depth knowledge of cleancache/frontswap, but skimming through
the code didn't convince me that I needed to dig deeper.

The solution I am pursuing uses struct page, so to the kernel everything
looks as if it were a regular page. The only things that don't work are
kmap and mapping it into a process, but this can easily be handled. For
filesystems, the issues are with anything that does I/O: read, write,
and writeback.

In many cases, if CPU I/O happens, what I want to do is migrate back to
a regular page, so the read/write case is easy. But for writeback, if
the page is dirty on the device and the device reports it (by calling
set_page_dirty()), then I still want writeback to work so I don't lose
data (if the device dirtied the page, it is probably because it was
instructed to save current computations).

With this in mind, the bounce helper, designed to work around block
device limitations with respect to the pages they can access, seemed to
be a perfect fit. All I care about is providing a bounce page that
allows writeback to happen without having to go through the "slow" page
migration back to a system page.
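
As a rough illustration of the bounce idea, here is a user-space sketch
only. Every name in it (sketch_dev_page, sketch_dev_copy_out,
sketch_block_write, sketch_writeback) is hypothetical; none of it is
kernel API or the code from the planned patchset, and a plain memcpy()
stands in for the device copy engine. It shows a dirty device page being
copied into a temporary, CPU-accessible bounce buffer that is then
handed to the block layer stand-in, while the page itself stays in
device memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a page that lives in device memory.
 * In this sketch only sketch_dev_copy_out() reads its contents,
 * mimicking memory the CPU cannot address directly. */
struct sketch_dev_page {
    unsigned char data[SKETCH_PAGE_SIZE];
    bool dirty; /* set when the "device" reports it wrote to the page */
};

/* Assumption: the device can copy a page out to system memory.
 * A memcpy() stands in for that copy here. */
void sketch_dev_copy_out(const struct sketch_dev_page *src,
                         unsigned char *dst)
{
    memcpy(dst, src->data, SKETCH_PAGE_SIZE);
}

/* Stand-in for submitting a write to the block device: the "disk" is
 * just a caller-provided buffer of SKETCH_PAGE_SIZE bytes. */
void sketch_block_write(unsigned char *disk, const unsigned char *buf)
{
    memcpy(disk, buf, SKETCH_PAGE_SIZE);
}

/* Write a dirty device page back through a bounce buffer.
 * Returns true if a write was issued, false if the page was clean.
 * The page is not migrated; only its contents are bounced. */
bool sketch_writeback(struct sketch_dev_page *page, unsigned char *disk)
{
    unsigned char bounce[SKETCH_PAGE_SIZE]; /* CPU-accessible bounce page */

    assert(page != NULL && disk != NULL);

    if (!page->dirty)
        return false;

    sketch_dev_copy_out(page, bounce);  /* device memory -> bounce page */
    sketch_block_write(disk, bounce);   /* bounce page -> backing store */
    page->dirty = false;                /* contents are now on "disk" */
    return true;
}
```

The point of the design is that writeback costs one copy into the bounce
buffer rather than a full migration of the page back to system memory.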

Jérôme