Thread: 29 messages, 7 authors, 2012-06-06

Query about: ARM11 MPCore: preemption/task migration cache coherency

From: catalin.marinas@arm.com (Catalin Marinas)
Date: 2012-05-31 05:19:27

On 31 May 2012 13:06, bill4carson [off-list ref] wrote:
> On 2012-05-31 11:58, Catalin Marinas wrote:
>> I still didn't fully understand what the problem is. So, to make sure,
>> if you run some applications from flash using a yaffs filesystem, you
>> get random crashes. Is this correct? If yes, a solution is to actually
>> call flush_dcache_page() on the CPU that does the page copying from
>> flash into RAM, which could be the yaffs filesystem.
> The story goes like this:
> the function flush_dcache_page() should take effect globally,
> but on ARMv6 MPCore it does not; it only takes effect on the local
> core, due to the hardware design.
Yes, I know this.
> This may cause errors in some cases, for example:
>
> 1) A task running on Core-0 is loading a text section into memory.
>    It is preempted and then migrates to Core-1;
BTW, do you have CONFIG_PREEMPT enabled?
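
One way to answer this from a kernel config file is sketched below. It is only an illustration: the helper names preempt_enabled() and running_kernel_config() are made up here, and /proc/config.gz is only present when the kernel was built with CONFIG_IKCONFIG_PROC, so on many targets you would read the build tree's .config instead.

```python
import gzip
import re


def preempt_enabled(config_text):
    """Return True if the config text contains the line CONFIG_PREEMPT=y.

    The match is anchored to the whole line, so CONFIG_PREEMPT_NONE=y or
    CONFIG_PREEMPT_VOLUNTARY=y do not count as full preemption.
    """
    return re.search(r"^CONFIG_PREEMPT=y$", config_text, re.MULTILINE) is not None


def running_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config (needs CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as f:
        return f.read()
```

On the target, preempt_enabled(running_kernel_config()) gives the answer; with a build tree at hand, pass the contents of .config to preempt_enabled() instead.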

To be clear - is your application reading some data from flash and
trying to execute it, or is it the kernel doing the load via the
page/prefetch abort mechanism?

If the latter, task running on core 0 gets a prefetch abort when
trying to execute some code. The kernel reads the page from flash (via
mtd, block layer, VFS) and copies it into RAM. It can be on any CPU as
long as it calls flush_dcache_page on the same CPU that copied the
data.

No matter where the task was running or migrated to, if the code doing
the copy also called flush_dcache_page() on the same core, there is no
data left in the D-cache for that page.
> 2) On Core-1, this task continues loading it and then calls
>    flush_dcache_page() to make sure the loaded text section is
>    written to main memory.
The flush_dcache_page() must be called by the code doing the copy. If
that copy happened on core 0, the call is done there and not on the
core the task migrated to. We don't do lazy flushing on ARM11MPCore.
> 3) The task jumps to the loaded text section and runs it.
>
> If flush_dcache_page() does not take effect globally,
> there may be data still in Core-0's data cache that has not been
> written to main memory. Thus in step 3 a wrong instruction may be
> fetched, causing strange errors.
This can only happen if you have either preempt enabled (so that the
kernel code doing the copy is migrated) or the mtd driver or fs do not
call flush_dcache_page().
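
To illustrate the two cases, here is a toy user-space model, not kernel code. It assumes each core has a private write-back D-cache, that flush_dcache_page() only cleans the cache of the core it runs on, and that instruction fetches see RAM only. The Core class, load_and_fetch() and the "insnN"/"stale" values are invented for this sketch.

```python
class Core:
    """One CPU with a private write-back D-cache (hypothetical model)."""

    def __init__(self, ram):
        self.ram = ram      # RAM is shared by all cores
        self.dcache = {}    # addr -> dirty value held only by this core

    def write(self, addr, value):
        # The store stays dirty in this core's D-cache; RAM is not updated.
        self.dcache[addr] = value

    def flush_dcache_page(self, addrs):
        # Local-only clean + invalidate: other cores' caches are untouched.
        for addr in addrs:
            if addr in self.dcache:
                self.ram[addr] = self.dcache.pop(addr)


TEXT = ["insn0", "insn1", "insn2", "insn3"]  # the page being loaded


def load_and_fetch(migrate_at=None):
    """Copy TEXT into a page, flush on the final core, return what RAM holds.

    migrate_at is the index at which the copying code moves from core 0 to
    core 1 (None means the whole copy and the flush run on core 0).
    """
    ram = ["stale"] * len(TEXT)
    cores = [Core(ram), Core(ram)]
    cpu = 0
    for addr, insn in enumerate(TEXT):
        if addr == migrate_at:
            cpu = 1                      # copy loop preempted and migrated
        cores[cpu].write(addr, insn)
    cores[cpu].flush_dcache_page(range(len(TEXT)))
    return list(ram)                     # what an instruction fetch would see
```

In this model load_and_fetch() returns the full text, because the copy and the flush ran on the same core. load_and_fetch(migrate_at=2) leaves the first two entries stale in RAM, since they are still dirty in core 0's cache when core 1 does the flush.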

-- 
Catalin