Re: [PATCH 1/2] MIPS: Add barriers between dcache & icache flushes

From: Paul Burton <hidden>
Date: 2016-03-01 02:23:51
Also in: lkml

On Mon, Feb 22, 2016 at 06:39:30PM -0500, Joshua Kinard wrote:

On 02/22/2016 13:09, Paul Burton wrote:

Index-based cache operations may be arbitrarily reordered by out of
order CPUs. Thus code which writes back the dcache & then invalidates
the icache using indexed cache ops must include a barrier between
operating on the 2 caches in order to prevent the scenario in which:

  - icache invalidation occurs.

  - icache fetch occurs, due to speculation.

  - dcache writeback occurs.

If the above were allowed to happen then the icache would contain stale
data. Forcing the dcache writeback to complete before the icache
invalidation avoids this.

Is there a particular symptom one should look for to check for this issue
occurring?  I haven't seen any odd effects on my SGI systems that appear to
relate to this.  I believe the R1x000 family resolves all hazards in hardware,
so maybe this issue doesn't affect that CPU family?

If not, let me know what to look or test for so I can check the patch out on my
systems.

Thanks!

--J

Hi Joshua,

It depends upon the implementation of the CPU, but the arch spec (MIPS64
BIS, MD00087, revision 6.02) does say:

When implementing multiple level of caches and where the hardware maintains
the smaller cache as a proper subset of a larger cache (every address which is
resident in the smaller cache is also resident in the larger cache; also known
as the inclusion property). It is recommended that the CACHE instructions
which operate on the larger, outer-level cache; must first operate on the
smaller, inner-level cache. For example, a Hit_Writeback_Invalidate operation
targeting the Secondary cache, must first operate on the primary data
cache first. If the CACHE instruction implementation does not follow
this policy then any software which flushes the caches must mimic this
behavior. That is, the software sequences must first operate on the
inner cache then operate on the outer cache. The software must place a
SYNC instruction after the CACHE instruction whenever there are
possible writebacks from the inner cache to ensure that the writeback
data is resident in the outer cache before operating on the outer
cache. If neither the CACHE instruction implementation nor the
software cache flush sequence follow this policy, then the inclusion
property of the caches can be broken, which might be a condition that
the cache management hardware cannot properly deal with.

When implementing multiple level of caches without the inclusion
property, the use of a SYNC instruction after the CACHE instruction is
still needed whenever writeback data has to be resident in the next
level of memory hierarchy.

If data is to transfer from dcache -> L2 -> icache then it first has to
be written back to the L2, which is exactly the situation of the data
needing "to be resident in the next level of memory hierarchy" after the
dcache. The SYNC instruction guarantees that:

The CACHE instruction and the memory transactions which are sourced by
the CACHE instruction, such as cache refill or cache writeback, obey
the ordering and completion rules of the SYNC instruction.

To the best of my knowledge, this is something that newer cores, which
reorder more aggressively, would be more likely to hit.
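
To make the ordering problem concrete, here is a toy model in plain C. It
is only a sketch and not kernel code: struct toy_mem, run_ordered() and
run_reordered() are hypothetical names invented for illustration, and the
"caches" are single words rather than anything resembling real hardware.
It assumes one dirty dcache line, an L2 that the icache refills from, and
a speculative fetch that may happen at any point after the invalidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): one instruction word seen at three levels. */
struct toy_mem {
	uint32_t dcache;	/* newest copy, just written by the CPU */
	bool dcache_dirty;
	uint32_t l2;		/* next level, refill source for the icache */
	uint32_t icache;
	bool icache_valid;
};

static void toy_init(struct toy_mem *m, uint32_t old_insn, uint32_t new_insn)
{
	m->dcache = new_insn;
	m->dcache_dirty = true;
	m->l2 = old_insn;
	m->icache = old_insn;
	m->icache_valid = true;
}

/* Push the dirty dcache word out to the L2. */
static void dcache_writeback(struct toy_mem *m)
{
	if (m->dcache_dirty) {
		m->l2 = m->dcache;
		m->dcache_dirty = false;
	}
}

static void icache_invalidate(struct toy_mem *m)
{
	m->icache_valid = false;
}

/* A fetch refills from the L2 on a miss, then returns the icache word. */
static uint32_t icache_fetch(struct toy_mem *m)
{
	if (!m->icache_valid) {
		m->icache = m->l2;
		m->icache_valid = true;
	}
	return m->icache;
}

/* Writeback completes before the invalidate, as a barrier would force. */
uint32_t run_ordered(uint32_t old_insn, uint32_t new_insn)
{
	struct toy_mem m;

	toy_init(&m, old_insn, new_insn);
	dcache_writeback(&m);
	/* on real hardware the barrier would sit here */
	icache_invalidate(&m);
	(void)icache_fetch(&m);		/* speculative fetch, refills new data */
	return icache_fetch(&m);
}

/* Invalidate and speculative fetch slip ahead of the writeback. */
uint32_t run_reordered(uint32_t old_insn, uint32_t new_insn)
{
	struct toy_mem m;

	toy_init(&m, old_insn, new_insn);
	icache_invalidate(&m);
	(void)icache_fetch(&m);		/* speculative fetch, refills old data */
	dcache_writeback(&m);
	assert(m.l2 == new_insn);	/* L2 is now correct, icache is not */
	return icache_fetch(&m);
}
```

In this model run_ordered() ends up fetching the new instruction word,
while run_reordered() keeps returning the old one even though the L2 has
been updated, which is the stale-icache case the barrier is meant to
prevent.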

Thanks,
    Paul
