Thread: 23 messages, 4 authors, 2012-04-19

Re: PowerPC radeon KMS - is it possible?

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2012-04-18 11:18:10

On Wed, 2012-04-18 at 12:34 +0200, Michel Dänzer wrote:
> On Mit, 2012-04-18 at 20:20 +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2012-04-18 at 10:02 +0200, Michel Dänzer wrote:
> > > > GPU lockup appears to be a common problem with the radeon driver.
> > > It's what happens when anything goes wrong with the GPU. If it doesn't
> > > happen with agpmode=-1, it's probably an AGP related coherency issue.
> > I had some success hacking the DRM to do an in_le32 from the ring head
> > after writing it. Just a gross hack but it seemed to help on a G5.
> AFAICT radeon_ring_commit() does that already:
>
>         DRM_MEMORYBARRIER();
>         WREG32(ring->wptr_reg, (ring->wptr << ring->ptr_reg_shift) & ring->ptr_reg_mask);
>         (void)RREG32(ring->wptr_reg);
>
> We added the readback about a decade ago. :)

Hrm, I have a different hack in that old tree I was playing with a while
back, let me see...
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -2245,6 +2245,9 @@ void radeon_commit_ring(drm_radeon_private_t *dev_priv)
        DRM_MEMORYBARRIER();
        GET_RING_HEAD( dev_priv );
 
+#ifdef CONFIG_PPC
+       in_be32(dev_priv->ring.start);
+#endif
        if ((dev_priv->flags & RADEON_FAMILY_MASK) >= CHIP_R600) {


I think my rationale was to ensure that all previous stores to AGP
(indirect buffers etc...) were pushed out & ordered vs the ring wptr
update or something like that, because I think those paths aren't well
ordered in HW. In fact I suspect we might even need a bigger hammer than
that in_be32().
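
The ordering concern above can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not kernel code: `struct bridge`, `bridge_write()`, `bridge_read()`, `struct sim`, `sim_init()` and `commit()` are hypothetical names, and the model simply assumes that stores to AGP memory are posted in a buffer that a read on the same path drains, while the wptr register write travels on a separate path. Whether the real bridge behaves this way is exactly what is uncertain here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POSTED_MAX 8

/* Hypothetical model of a path that buffers ("posts") CPU writes. */
struct posted_write { uint32_t *dst; uint32_t val; };

struct bridge {
    struct posted_write q[POSTED_MAX];
    size_t n;                       /* number of writes still posted */
};

/* Queue a write; the target does not see it until the queue is drained. */
static void bridge_write(struct bridge *b, uint32_t *dst, uint32_t val)
{
    assert(b->n < POSTED_MAX);
    b->q[b->n].dst = dst;
    b->q[b->n].val = val;
    b->n++;
}

/* Assumption: a read on the same path drains earlier writes, in order. */
static uint32_t bridge_read(struct bridge *b, const uint32_t *src)
{
    for (size_t i = 0; i < b->n; i++)
        *b->q[i].dst = b->q[i].val;
    b->n = 0;
    return *src;
}

struct sim {
    uint32_t ring[4];               /* ring in AGP memory, as the GPU sees it */
    uint32_t wptr;                  /* write pointer, as the GPU sees it */
    struct bridge agp;              /* posted stores on the CPU-to-AGP path */
};

static void sim_init(struct sim *s)
{
    memset(s, 0, sizeof(*s));
}

/*
 * Store one (nonzero) command into the ring, optionally read the ring
 * back, then bump wptr. Returns 1 if the GPU would find the command in
 * the ring at the moment it observes the new wptr, 0 otherwise.
 */
static int commit(struct sim *s, uint32_t cmd, int readback)
{
    bridge_write(&s->agp, &s->ring[0], cmd);      /* still posted */
    if (readback)
        (void)bridge_read(&s->agp, &s->ring[0]);  /* drain before wptr */
    s->wptr = 1;                                  /* goes out on the other path */
    return s->ring[0] == cmd;
}
```

In this model, `commit()` with `readback` set to 0 returns 0 (the GPU sees the new wptr but a stale ring slot), and with `readback` set to 1 it returns 1. A read of the wptr register alone would not help in the model, since it does not touch the AGP path.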

Another hack I had around was removing the SBA reset from agp-uninorth
completely on binding new pages, since the reset seemed to cause hangs.

> > I suspect there's a fundamental design issue with the Apple bridge in
> > that the CPU to memory path isn't coherent at all with the GPU to memory
> > path, ie. even vs. cache flush instructions (ie buffers in the memory
> > controllers can still be out of sync).
> >
> > Darwin does some gross hacks to work around that, some of them visible
> > in the AGP drivers, some buried in the Apple driver, I don't know for
> > sure. It's possible that they end up mapping all AGP memory as cache
> > inhibited, but we can't do that because of our linear mapping.
> We are doing that though...

Are we really? I thought we were taking existing cacheable RAM objects
and mapping them into the AGP gart. Are we replacing both kernel & user
mappings for those objects with an equivalent cache inhibited mapping?

I'm not -that- familiar with how TTM works here. In any case, doing that
can cause bus checkstops because the same pages can be prefetched into
the cache via the linear mapping, which is covered by BATs (unless you
make your graphic objects HIGHMEM only but good luck with that :-)

To make that work reliably we should disable the BAT mapping so the
linear mapping can then be controlled on a per-page basis (on 32-bit),
and this is complicated... we have code elsewhere that more or less
relies on the BAT mapping being there. On 64-bit it's even nastier
because we use 16M pages for the linear mapping.
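
For a sense of the granularity problem on 64-bit, here is a small arithmetic sketch. The 16M figure is from the text above; the 4K base page size and the helper name `small_pages_covered()` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of base pages that share a single
 * large-page entry of the linear mapping.
 */
static size_t small_pages_covered(size_t large_page_bytes,
                                  size_t base_page_bytes)
{
    assert(base_page_bytes != 0);
    return large_page_bytes / base_page_bytes;
}
```

With a 16M entry and 4K base pages, `small_pages_covered(16u << 20, 4096)` is 4096, so making one graphics page cache inhibited in the linear mapping would mean splitting or remapping an entry shared with 4095 unrelated pages.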

Cheers,
Ben.