Thread: 42 messages, 8 authors, 2012-06-22

Re: [PATCH 2/3] ext4: Context support

From: Arnd Bergmann <hidden>
Date: 2012-06-13 19:44:51
Also in: linux-fsdevel, linux-mmc

On Tuesday 12 June 2012, Ted Ts'o wrote:
> On Tue, Jun 12, 2012 at 08:07:28PM +0000, Arnd Bergmann wrote:
> > Right. The danger here is that the context support was described in
> > the standard first, while none of the devices seem to even be
> > smart enough to make use of the information we put in there. Once
> > operating systems start putting some data in there, at least
> > some manufacturers will start making use of that data to optimize
> > the accesses, but it's very unlikely that they will tell us exactly
> > what they are doing. Having code in ext4 that uses the contexts will
> > at least make it more likely that the firmware optimizations are
> > based on ext4 measurements rather than some other file system or
> > operating system.
> >
> > From talking with the emmc device vendors, I can tell you that ext4
> > is very high on the list of file systems to optimize for, because
> > they all target Android products.
>
> Well, I have a contact at SanDisk where I can discuss things under
> NDA, if that will help.  He had reached out to me specifically because
> of ext4 and Android --- he's the guy that I invited to give a talk at
> the LSF workshop last year.

Well, the Linaro storage team is in close contact with Alex Lemberg
from SanDisk, Luca Porzio from Micron and Hyojin Jeong from Samsung,
and we discussed this patch in our meeting two weeks ago and on
our Linaro mailing lists before that.

I have a good feeling about that working relationship, and they
all understand the needs of the Linux file systems, but my impression
is also that with an NDA in place we would not be able to put any
better implementation into the Linux kernel that makes use of hardware
details of one of the manufacturers. Also note that the eMMC standard
is intentionally written in an abstract way to give the hardware
manufacturers the option to provide better implementations over time,
e.g. when new devices start using large amounts of cache, or replace
NAND flash with phase change memory or other technologies.

That said, I think it is rather clear what the authors of the spec
had in mind, and there is only one reasonable implementation given
current flash technology: You get something like a log-structured
file system with 15 contexts, where each context writes to exactly
one erase block at a given time. This is not all that different
from how eMMC/SD/USB works already without context support, the main
difference being that the context normally gets picked based on the
LBA of the write in segments between 512 KB and 16 MB. Because the number
of active contexts is smaller than the number of total segments in
the device, the device keeps an LRU list of something between 5 and
30 segments.
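
To make the LBA-based behaviour above concrete, here is a minimal Python
sketch of a device that derives a segment from the LBA and keeps a small
LRU list of open segments. It is only a toy model: the class name, the
4 MB segment size, the 512-byte sector size and the limit of 8 open
segments are all assumptions picked from within the ranges mentioned
above, not values from any real device or from the eMMC spec.

```python
from collections import OrderedDict

SEGMENT_SIZE = 4 * 1024 * 1024  # assumed; the mail gives a range of 512 KB to 16 MB
SECTOR_SIZE = 512               # assumed sector size for the LBA arithmetic
MAX_OPEN_SEGMENTS = 8           # assumed; the mail gives a range of 5 to 30


class SegmentLru:
    """Toy model of a device that keeps a small LRU list of open segments."""

    def __init__(self, max_open=MAX_OPEN_SEGMENTS):
        self.max_open = max_open
        self.open_segments = OrderedDict()  # segment number -> None, oldest first
        self.evictions = 0

    def write(self, lba):
        """Record a write to `lba`; return the evicted segment number or None."""
        segment = (lba * SECTOR_SIZE) // SEGMENT_SIZE
        if segment in self.open_segments:
            # Segment is already open: mark it as most recently used.
            self.open_segments.move_to_end(segment)
            return None
        self.open_segments[segment] = None
        if len(self.open_segments) > self.max_open:
            # Too many open segments: close the least recently used one.
            evicted, _ = self.open_segments.popitem(last=False)
            self.evictions += 1
            return evicted
        return None
```

In this model every eviction stands for a segment the device has to
close (and possibly garbage collect later), so a workload that bounces
between more segments than `max_open` would evict on almost every write.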

Letting the file system pick the context number based on information
it has about the contents rather than the LBA should reduce the amount
of garbage collection if there is a stronger correlation between
lifetimes of data written to the same context than there is between
lifetimes of data written to adjacent LBA numbers.
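
A small Python sketch of that argument, under made-up assumptions: the
workload, the "short"/"long" lifetime labels and the 8192-sector segment
size are hypothetical, and the function only counts how many groups mix
lifetime classes. It does not simulate garbage collection itself.

```python
from collections import defaultdict


def mixed_groups(writes, key):
    """Count groups (as selected by `key`) that hold more than one lifetime class."""
    lifetimes = defaultdict(set)
    for write in writes:
        lifetimes[key(write)].add(write["lifetime"])
    return sum(1 for classes in lifetimes.values() if len(classes) > 1)


# Hypothetical workload: a short-lived file and a long-lived file whose
# blocks ended up interleaved within the same LBA segments.
SEGMENT_SECTORS = 8192  # assumed segment size in sectors
writes = [
    {"lba": 0,    "context": 1, "lifetime": "short"},
    {"lba": 8,    "context": 2, "lifetime": "long"},
    {"lba": 8192, "context": 1, "lifetime": "short"},
    {"lba": 8200, "context": 2, "lifetime": "long"},
]

# Grouping by LBA segment mixes both lifetimes in every segment,
# grouping by the context chosen by the file system does not.
by_lba = mixed_groups(writes, lambda w: w["lba"] // SEGMENT_SECTORS)
by_context = mixed_groups(writes, lambda w: w["context"])
```

A group with mixed lifetimes is one where part of the data dies early
and the rest has to be copied elsewhere before the erase block can be
reused, which is the cost the context number is meant to reduce.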

The trouble with this is of course that getting the file system to
do a really good job at picking the context numbers is a harder
task than coming up with a block allocation scheme that just gets
it right for devices without context ID support ;-).

I think using the inode number is a reasonable fit. Using the
inode number of the parent directory might be more appropriate
but it breaks with hard links and cross-directory renames (we
must either avoid using the same LBA with conflicting context
numbers, or flush the old context in between).
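
As a rough illustration only (this is not the mapping used in the patch),
the following Python sketch folds an inode number into one of the 15
contexts and tracks which context last wrote each LBA, so that a
conflicting write can be detected and the old context flushed first.
The modulo mapping, the choice to leave ID 0 unused, and the
`ContextTracker` helper are all assumptions made for the example.

```python
NUM_CONTEXTS = 15  # number of contexts mentioned above


def context_for_inode(ino):
    """Map an inode number to a context ID in 1..NUM_CONTEXTS (0 is left unused here)."""
    return ino % NUM_CONTEXTS + 1


class ContextTracker:
    """Remember which context last wrote each LBA and flag conflicts."""

    def __init__(self):
        self.lba_context = {}

    def write(self, lba, context):
        """Return the old context that must be flushed first, or None."""
        previous = self.lba_context.get(lba)
        self.lba_context[lba] = context
        if previous is not None and previous != context:
            # Same LBA, different context: the old context needs a flush.
            return previous
        return None
```

With the inode number as the key, a file keeps the same context no
matter how many directories link to it. A key based on the parent
directory would give a hard-linked or renamed file two candidate
contexts for the same LBAs, which is the conflict case the tracker
reports.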

	Arnd