Thread (11 messages) 11 messages, 3 authors, 2022-03-14

Re: [PATCH] HID: intel-ish-hid: Use dma_alloc_coherent for firmware update

From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
Date: 2022-02-10 19:39:43
Also in: stable

On Thu, 2022-02-10 at 10:34 -0800, Gwendal Grignou wrote:
On Wed, Feb 9, 2022 at 6:51 PM srinivas pandruvada
[off-list ref] wrote:
quoted
On Wed, 2022-02-09 at 16:59 -0800, Gwendal Grignou wrote:
quoted
On Wed, Feb 9, 2022 at 10:52 AM srinivas pandruvada
[off-list ref] wrote:
quoted
On Tue, 2022-02-08 at 21:09 -0800, Gwendal Grignou wrote:
quoted
Allocating memory with kmalloc and GPF_DMA32 is not allowed,
the
allocator will ignore the attribute.
Looks like there is new error flow added here for this flag.
Is this just removing GFP_DMA32 not enough?
It was already ignored. It is not enough and I don't know why
since
the virtual and physical addresses are in the same range:

With using kmalloc/dma_single_sync:
5.10 (firmware not loading)
[    3.837996] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
kmalloc/dma_map_single: v:0xffff91531ae88000,
phy:0x0000000085afe000
[    3.838003] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
xfer_mode=dma offset=0x00000000 size=0x4000 is_last=0
ddr_phys_addr=0x0000000085afe000
...

4.19 (firmware loading)
[    3.878300] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
kmalloc/dma_map_single: v:0xffff88c145480000,
phy:0x0000000085480000
[    3.878322] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
xfer_mode=dma offset=0x00000000 size=0x4000 is_last=0
ddr_phys_addr=0x0000000085480000
...

I also considered flushing the CPU cache before the
dma_sync_single_for_device call, and calling
dma_sync_single_for_cpu()
after loader_cl_send() to be allowed to write into the buffer
again.
But these did not help either.

But using dma_alloc_coherent() clearly works and its simpler API
makes
That is OK.
quoted
it more appropriate for the task at hand.

For reference, the same log when using dma_alloc_coherent().
[    3.779723] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
dma_alloc_coherent: v:0xffff9beb81048000, phy:0x0000000001048000
[    3.779731] ish-loader {C804D06A-55BD-4EA7-ADED-1E31228C76DC}:
xfer_mode=dma offset=0x00000000 size=0x4000 is_last=0
ddr_phys_addr=0x0000000001048000
...
quoted
quoted
Instead, use dma_alloc_coherent() API as we allocate a small
amount
of
memory to transfer firmware fragment to the ISH.

On Arcada chromebook, after the patch the warning:
"Unexpected gfp: 0x4 (GFP_DMA32). Fixing up to gfp: 0xcc0
(GFP_KERNEL).  Fix your code!"
is gone. The ISH firmware is loaded properly and we can
interact
with
the ISH:
quoted
ectool  --name cros_ish version
...
Build info:    arcada_ish_v2.0.3661+3c1a1c1ae0 2022-02-08
05:37:47
@localhost
Tool version:  v2.0.12300-900b03ec7f 2022-02-08 10:01:48
@localhost

Fixes: commit 91b228107da3 ("HID: intel-ish-hid: ISH firmware
loader
client driver")
Signed-off-by: Gwendal Grignou <redacted>
Cc: stable@vger.kernel.org
---
 drivers/hid/intel-ish-hid/ishtp-fw-loader.c | 29 +++--------
----
----
--
 1 file changed, 3 insertions(+), 26 deletions(-)
diff --git a/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
index e24988586710..16aa030af845 100644
--- a/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
+++ b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
@@ -661,21 +661,12 @@ static int
ish_fw_xfer_direct_dma(struct
ishtp_cl_data *client_data,
         */
        payload_max_size &= ~(L1_CACHE_BYTES - 1);

-       dma_buf = kmalloc(payload_max_size, GFP_KERNEL |
GFP_DMA32);
+       dma_buf = dma_alloc_coherent(devc, payload_max_size,
&dma_buf_phy, GFP_KERNEL);
        if (!dma_buf) {
                client_data->flag_retry = true;
                return -ENOMEM;
        }

-       dma_buf_phy = dma_map_single(devc, dma_buf,
payload_max_size,
-                                    DMA_TO_DEVICE);
-       if (dma_mapping_error(devc, dma_buf_phy)) {
-               dev_err(cl_data_to_dev(client_data), "DMA map
failed\n");
-               client_data->flag_retry = true;
-               rv = -ENOMEM;
-               goto end_err_dma_buf_release;
-       }
-
        ldr_xfer_dma_frag.fragment.hdr.command =
LOADER_CMD_XFER_FRAGMENT;
        ldr_xfer_dma_frag.fragment.xfer_mode =
LOADER_XFER_MODE_DIRECT_DMA;
        ldr_xfer_dma_frag.ddr_phys_addr = (u64)dma_buf_phy;
@@ -695,14 +686,7 @@ static int ish_fw_xfer_direct_dma(struct
ishtp_cl_data *client_data,
                ldr_xfer_dma_frag.fragment.size =
fragment_size;
                memcpy(dma_buf, &fw->data[fragment_offset],
fragment_size);

-               dma_sync_single_for_device(devc, dma_buf_phy,
-                                          payload_max_size,
-                                          DMA_TO_DEVICE);
-
Any reason for removal of sync?
It is not needed since we are using dma_alloc_coherent(). From
[1]:
"""
void *
dma_alloc_coherent(struct device *dev, size_t size,
   dma_addr_t *dma_handle, gfp_t flag)

Consistent memory is memory for which a write by either the
device or
the processor can immediately be read by the processor or device
without having to worry about caching effects.  (You may however
need
to make sure to flush the processor's write buffers before
telling
devices to read that memory.)

This routine allocates a region of <size> bytes of consistent
memory.
"""'
 I checked with some dma folks. This call may still be required for
some device. Most likely not as this is on chip device.
What happens if you leave this call? I worry for regression on some
old
systems.
I truly believe this call is unnecessary: From the docs, dma_sync_...
is only for streaming DMA mappings, not for coherent memory. Another
link here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/341712.html
Looking at older drivers, for instance exec_drive_command() in
driver/block/mtip32xx/mtip32xx.c, there are no dma_sync_ between
dma_alloc_ and dma_free_.
In a simple driver - drivers/scsi/advansys.c - that only uses
coherent
DMA memory, no dma_sync calls are used.
As long as we have a memory barrier before sending the packet
downstream, we are covered. I rebooted my machine in a loop without
error overnight.
I think sync may not be required here. But 
there is an additional step you miss if you don't call sync, 
irrespective of "(!dev_is_dma_coherent(dev))" in the call chain of this
sync.

	if (unlikely(is_swiotlb_buffer(dev, paddr)))
		swiotlb_sync_single_for_device(dev, paddr, size, dir);

So when SWIOTLB is enabled with swiotlb=force and bypass IOMMU, I don't
know the net affect of this.
So I need some further debug on this.

Thanks,
Srinivas
Gwendal.
quoted
Thanks,
Srinivas


quoted
quoted
Thanks,
Srinivas
quoted
-               /*
-                * Flush cache here because the
dma_sync_single_for_device()
-                * does not do for x86.
-                */
+               /* Flush cache to be sure the data is in main
memory.
*/
                clflush_cache_range(dma_buf,
payload_max_size);

                dev_dbg(cl_data_to_dev(client_data),
@@ -725,15 +709,8 @@ static int ish_fw_xfer_direct_dma(struct
ishtp_cl_data *client_data,
                fragment_offset += fragment_size;
        }

-       dma_unmap_single(devc, dma_buf_phy, payload_max_size,
DMA_TO_DEVICE);
-       kfree(dma_buf);
-       return 0;
-
 end_err_resp_buf_release:
-       /* Free ISH buffer if not done already, in error case
*/
-       dma_unmap_single(devc, dma_buf_phy, payload_max_size,
DMA_TO_DEVICE);
-end_err_dma_buf_release:
-       kfree(dma_buf);
+       dma_free_coherent(devc, payload_max_size, dma_buf,
dma_buf_phy);
        return rv;
 }
[1]https://www.kernel.org/doc/Documentation/DMA-API.txt
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help