Thread: 18 messages, 4 authors, 2002-07-16

Re: mips32_flush_cache routine corrupts CP0_STATUS with gcc-2.96

From: Gleb O. Raiko <hidden>
Date: 2002-07-11 13:08:36

"Maciej W. Rozycki" wrote:
On Thu, 11 Jul 2002, Gleb O. Raiko wrote:
quoted
I wouldn't be surprised if other IDT CPUs also require this, including those
that conform to MIPS32.
 Well, for r3k it may seem somewhat justified as cache flushing requires
cache isolation.  But the IDT manual for their whole family of processors
claims the D-cache can function as an I-cache (when swapped; doesn't
apply when not, obviously) and cache flushing can run from KSEG0.

 See "IDT MIPS Microprocessor Family Software Reference Manual", chapter 5
"Cache Management", section "Invalidation":

 "To invalidate the cache in the R30xx:
[...]
 The invalidate routine is normally executed with its instructions
cacheable.  This sounds like a lot of trouble; but in fact shouldnt
require any extra steps to run cached. An invalidation routine in uncached
space will run 4-10 times slower."
Aha, you also stepped on this rake. :-) The problem with IDT manuals is
that they frequently contradict themselves. You're right, the SW manual allows
cached flushes, but the hardware manuals for the family prohibit this and
state that flushes must be uncached
(a hw manual on the family, the same chapter, the same section :-) ).
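
To make "uncached" concrete: on 32-bit MIPS, KSEG0 (cached) and KSEG1
(uncached) are two unmapped windows onto the same low 512 MB of physical
memory, so a routine linked in KSEG0 can be entered through its KSEG1 alias.
A minimal sketch of the address arithmetic, where the macro and function names
are mine and not taken from any IDT manual or kernel header:

```c
#include <stdint.h>

/* Standard 32-bit MIPS unmapped segments (names are illustrative). */
#define KSEG0_BASE     0x80000000u  /* unmapped, cached   */
#define KSEG1_BASE     0xa0000000u  /* unmapped, uncached */
#define KSEG_PHYS_MASK 0x1fffffffu  /* low 512 MB of physical space */

/* Return the uncached (KSEG1) alias of a KSEG0 or KSEG1 address. */
static inline uint32_t kseg1_alias(uint32_t addr)
{
    return (addr & KSEG_PHYS_MASK) | KSEG1_BASE;
}
```

A flush routine located at 0x80001234 would then be called at 0xa0001234, so
that its instruction fetches bypass the cache.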

It's not the only place where IDT manuals are wrong. For example, their
wbflush example suggests *(int*)KSEG0 instead of *(int*)KSEG1.
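
Why the segment matters here: a read through KSEG0 can be satisfied from the
cache and never reach the bus, while a read through KSEG1 always does. A hedged
sketch of what a corrected wbflush might look like; the names are hypothetical,
and whether an uncached read is enough to drain the write buffer depends on the
particular CPU and board:

```c
#include <stdint.h>

#define KSEG1_START 0xa0000000u  /* first address of KSEG1 (uncached) */
#define KSEG1_END   0xbfffffffu  /* last address of KSEG1             */

/* Return 1 if addr lies in KSEG1 (unmapped, uncached), else 0. */
static inline int in_kseg1(uint32_t addr)
{
    return addr >= KSEG1_START && addr <= KSEG1_END;
}

#if defined(__mips__) && !defined(__mips64)
/* Only meaningful on real 32-bit MIPS hardware: an uncached dummy read. */
static inline void wbflush_sketch(void)
{
    (void)*(volatile int *)KSEG1_START;
}
#endif
```

The helper in_kseg1() is there only so the address choice can be checked on
any host; the read itself is compiled only for a 32-bit MIPS target.
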
quoted
Basically, requiring an uncached run makes the hardware logic much simpler
and saves a bit of silicon.
 Why?  I see no dependency.  What's the problem with interleaving cache
fills and invalidations?
There are two possible optimizations:
1. (Requires only that the instruction that swaps caches run uncached)
	The CPU may skip implementing a double check of cache hit on loads.
	Scenario: an mtc0 that swaps caches is issued after ensuring the next
	instructions are in the cache (pipelining here!); the swap occurs; the CPU
	must check again that the instructions are in the cache, because the same
	cache line in the data cache may have its valid bit set and the CPU would
	get data instead of code.
2. (Requires that the whole routine run uncached)
	The CPU may skip the check of cache hit on loads from an isolated cache.
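
For reference, the mtc0 in question writes the CP0 Status register. A minimal
sketch of the values involved, assuming the usual R3000-style bit layout (IsC
at bit 16, SwC at bit 17, IEc at bit 0); the macro and function names are
illustrative, and the actual write to CP0 is left out:

```c
#include <stdint.h>

/* Assumed R3000-style Status bits; check the CPU manual before relying on them. */
#define ST0_IEC 0x00000001u  /* IEc: current interrupt enable */
#define ST0_ISC 0x00010000u  /* IsC: isolate (data) cache     */
#define ST0_SWC 0x00020000u  /* SwC: swap I- and D-cache      */

/* Status value for flushing the D-cache: isolate, interrupts off. */
static inline uint32_t status_for_dcache_flush(uint32_t status)
{
    return (status | ST0_ISC) & ~ST0_IEC;
}

/* Status value for flushing the I-cache: isolate and swap, interrupts off. */
static inline uint32_t status_for_icache_flush(uint32_t status)
{
    return (status | ST0_ISC | ST0_SWC) & ~ST0_IEC;
}
```

The caller would save the old Status value, write one of these with mtc0, run
the invalidation loop, and then write the saved value back.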

I don't know which optimization IDT made, perhaps number 3. But 1. is
really worth implementing.

Regards,
Gleb.