Thread: 44 messages, 7 authors, 2011-01-10

still nfs problems [Was: Linux 2.6.37-rc8]

From: James.Bottomley@HansenPartnership.com (James Bottomley)
Date: 2011-01-06 18:25:47
Also in: linux-arch, linux-nfs, lkml

On Thu, 2011-01-06 at 12:14 -0600, James Bottomley wrote:
On Thu, 2011-01-06 at 18:05 +0000, Russell King - ARM Linux wrote:
What network DMA operations - what if your NIC doesn't do DMA because
it's an SMSC device?
So this is the danger area ... we might be caught by our own flushing
tricks.  I can't test this on parisc since all my network drivers use
DMA (which automatically coheres the kernel mapping by
flush/invalidate).

What should happen is that the kernel mapping pages go through the
->readdir() path.  Any return from this has to be ready to map the pages
back to user space, so the kernel alias has to be flushed to make the
underlying page up to date.

The exception is pages we haven't yet mapped to userspace.  Here we set
the PG_dcache_dirty bit (sparc trick) but don't flush the page, since we
expect the addition of a userspace mapping will detect this case and do
the flush and clear the bit before the mapping goes live.  I assume
you're thinking that because this page is allocated and freed internally
to NFS, it never gets a userspace mapping and therefore, we can return
from ->readdir() with a dirty kernel cache (and the corresponding flag
set)?  I think that is a possible hypothesis in certain cases.

OK, so thinking about this, it seems that the only danger is actually
what NFS is doing: reading cache pages via a vmap.  In that case, since
the requirement is to invalidate the vmap range to prepare for read, we
could have invalidate_kernel_vmap_range loop over the underlying pages
and flush them through the kernel alias if the architecture-specific
flag indicates their contents might be dirty.

The loop adds expense to invalidate_kernel_vmap_range() that is probably
largely unnecessary, but the alternative is adding to the API
proliferation with something that only flushes the kernel pages if the
arch-specific flag says they're dirty.
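
To make the proposed loop concrete, here is a minimal userspace sketch in C.
It is a model only, not kernel code: struct model_page, its dcache_dirty
field (standing in for the arch-specific PG_dcache_dirty-style flag), and
every model_* function are hypothetical names invented for illustration.
The sketch assumes the dirty kernel alias is written back first and the
vmap alias is invalidated afterwards, so that a later read through the
vmap sees current data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the state this sketch needs. */
struct model_page {
    bool dcache_dirty;   /* models the arch-specific "kernel alias dirty" flag */
    int  kernel_flushes; /* counts flushes done through the kernel alias */
};

/* Counts how often the vmap alias as a whole was invalidated. */
int model_vmap_invalidates;

/* Models flushing one page through its kernel alias and clearing the flag. */
void model_flush_kernel_alias(struct model_page *page)
{
    page->kernel_flushes++;
    page->dcache_dirty = false;
}

/* Models the existing arch-level invalidate of the vmap range itself. */
void model_invalidate_vmap_alias(void)
{
    model_vmap_invalidates++;
}

/*
 * Sketch of the proposed behaviour: before invalidating the vmap range,
 * walk the underlying pages and flush only those whose flag says the
 * kernel alias may be dirty. Returns the number of pages flushed.
 */
size_t model_invalidate_kernel_vmap_range(struct model_page **pages,
                                          size_t npages)
{
    size_t flushed = 0;

    for (size_t i = 0; i < npages; i++) {
        if (pages[i]->dcache_dirty) {
            model_flush_kernel_alias(pages[i]);
            flushed++;
        }
    }

    model_invalidate_vmap_alias();
    return flushed;
}
```

In this model the per-page check runs on every call, even when no page is
dirty, which is the extra expense mentioned above; a separate entry point
would avoid that cost for other callers at the price of one more API.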

James