Thread: 44 messages, 7 authors, 2011-01-10

still nfs problems [Was: Linux 2.6.37-rc8]

From: Trond Myklebust <hidden>
Date: 2011-01-07 18:53:32
Also in: linux-arch, linux-nfs, lkml

On Thu, 2011-01-06 at 09:55 -0800, Linus Torvalds wrote: 
> On Thu, Jan 6, 2011 at 9:47 AM, Trond Myklebust
> [off-list ref] wrote:
> > Why is this line needed? We're not writing through the virtual mapping.
>
> I haven't looked at the sequence of accesses, but you need to be
> _very_ aware that "write-through" is absolutely NOT sufficient for
> cache coherency.
>
> In cache coherency, you have three options:
>
>  - true coherency (eg physically indexed/tagged caches)
>
>  - exclusion (eg virtual caches, but with an exclusion guarantee that
>    guarantees that aliases cannot happen: either by using physical
>    tagging or by not allowing cases that could cause virtual aliases)
>
>  - write-through AND non-cached reads (ie "no caching at all").
>
> You seem to be forgetting the "no cached reads" part. It's not
> sufficient to flush after a write - you need to make sure that you
> also don't have a cached copy of the alias for the read.
>
> So "We're not writing through the virtual mapping" is NOT a sufficient
> excuse. If you're reading through the virtual mapping, you need to
> make sure that the virtual mapping is flushed _after_ any writes
> through any other mapping and _before_ any reads through the virtual
> one.

I'm aware of that. That part should be taken care of by the call to
invalidate_kernel_vmap_range(), which was in both James's patch and mine.

There is already code in the SUNRPC layer that calls flush_dcache_page()
after writing (although as Russell pointed out earlier, that is
apparently a no-op for non-page-cache pages such as these).
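
The ordering requirement under discussion (invalidate the virtual alias after data is written through another mapping and before it is read through the virtual one) can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `AliasedMemory`, `write_direct`, `read_vmap` and `invalidate_vmap_range` are hypothetical names, and the last one merely stands in for the role that invalidate_kernel_vmap_range() plays in the patches.

```python
class AliasedMemory:
    """Toy model: one physical buffer seen through two mappings, where the
    virtual alias keeps its own cached copy (as a virtually tagged cache would)."""

    def __init__(self, size):
        self.physical = bytearray(size)  # backing memory
        self.vmap_cache = {}             # offset -> byte cached under the virtual alias

    def write_direct(self, offset, value):
        # Write through the other mapping: reaches memory, but does not
        # touch anything cached under the virtual alias.
        self.physical[offset] = value

    def read_vmap(self, offset):
        # Read through the virtual alias: served from its cache when present.
        if offset not in self.vmap_cache:
            self.vmap_cache[offset] = self.physical[offset]
        return self.vmap_cache[offset]

    def invalidate_vmap_range(self, start, length):
        # Hypothetical stand-in for invalidating the alias's cached lines.
        for off in range(start, start + length):
            self.vmap_cache.pop(off, None)


def demo():
    mem = AliasedMemory(16)
    mem.read_vmap(0)                  # alias now caches the old value
    mem.write_direct(0, 42)           # new data arrives through the other mapping
    stale = mem.read_vmap(0)          # still served from the alias's cached copy
    mem.invalidate_vmap_range(0, 16)  # drop the cached copy before reading
    fresh = mem.read_vmap(0)          # now refetched from memory
    return stale, fresh
```

In this model, writing through the other mapping is not enough on its own: the read through the alias only sees the new value once its cached copy has been dropped.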
> This is why you really really really generally don't want to have
> aliasing. Purely virtual caches are pure crap. Really.
Well, it looks as if NOMMU is giving us problems due to the lack of a
vm_map_ram() (see https://bugzilla.kernel.org/show_bug.cgi?id=26262).

I'd still like to keep the existing code for those architectures that
don't have problems, since that allows us to send 32k READDIR requests
instead of being limited to 4k. For large directories, that is a clear
win.
For the NOMMU case we will just go back to using a single page for
storage (and 4k READDIR requests only). Should I just do the same for
architectures like ARM and PARISC?
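
To put rough numbers on the 32k versus 4k trade-off, here is a back-of-the-envelope sketch. It assumes 4k pages, a hypothetical directory whose encoded entries total 1 MiB, and that every reply fills its buffer completely. Real replies will not pack this neatly, so treat the counts as lower bounds.

```python
import math

PAGE_SIZE = 4096  # assumed 4k pages, matching the figures in the message


def readdir_requests(dir_bytes, pages_per_request):
    """Lower bound on READDIR round trips needed to fetch dir_bytes of entries,
    assuming every reply fills its buffer completely."""
    buffer_bytes = pages_per_request * PAGE_SIZE
    return math.ceil(dir_bytes / buffer_bytes)


dir_bytes = 1024 * 1024  # hypothetical: 1 MiB of encoded directory entries
single_page = readdir_requests(dir_bytes, 1)  # 4k requests
eight_pages = readdir_requests(dir_bytes, 8)  # 32k requests
```

Under these assumptions the single-page fallback needs eight times as many round trips as the 32k path.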

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust at netapp.com
www.netapp.com