Re: [PATCH v8 2/3] CMDQ: Mediatek CMDQ driver
From: Matthias Brugger <hidden>
Date: 2016-06-08 10:46:04
Also in:
linux-arm-kernel, linux-mediatek, lkml
On 08/06/16 07:40, Horng-Shyang Liao wrote:
Hi Matthias, On Tue, 2016-06-07 at 18:59 +0200, Matthias Brugger wrote:quoted
On 03/06/16 15:11, Matthias Brugger wrote:quoted
[...]quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
+ + smp_mb(); /* modify jump before enable thread */ + } + + cmdq_thread_writel(thread, task->pa_base + task->command_size, + CMDQ_THR_END_ADDR); + cmdq_thread_resume(thread); + } + list_move_tail(&task->list_entry, &thread->task_busy_list); + spin_unlock_irqrestore(&cmdq->exec_lock, flags); +} + +static void cmdq_handle_error_done(struct cmdq *cmdq, + struct cmdq_thread *thread, u32 irq_flag) +{ + struct cmdq_task *task, *tmp, *curr_task = NULL; + u32 curr_pa; + struct cmdq_cb_data cmdq_cb_data; + bool err; + + if (irq_flag & CMDQ_THR_IRQ_ERROR) + err = true; + else if (irq_flag & CMDQ_THR_IRQ_DONE) + err = false; + else + return; + + curr_pa = cmdq_thread_readl(thread, CMDQ_THR_CURR_ADDR); + + list_for_each_entry_safe(task, tmp, &thread->task_busy_list, + list_entry) { + if (curr_pa >= task->pa_base && + curr_pa < (task->pa_base + task->command_size))What are you checking here? It seems as if you make some implcit assumptions about pa_base and the order of execution of commands in the thread. Is it save to do so? Does dma_alloc_coherent give any guarantees about dma_handle?1. Check what is the current running task in this GCE thread. 2. Yes. 3. Yes, CMDQ doesn't use iommu, so physical address is continuous.Yes, physical addresses might be continous, but AFAIK there is no guarantee that the dma_handle address is steadily growing, when calling dma_alloc_coherent. And if I understand the code correctly, you use this assumption to decide if the task picked from task_busy_list is currently executing. So I think this mecanism is not working.I don't use dma_handle address, and just use physical addresses. From CPU's point of view, tasks are linked by the busy list. From GCE's point of view, tasks are linked by the JUMP command.quoted
In which cases does the HW thread raise an interrupt. In case of error. When does CMDQ_THR_IRQ_DONE get raised?GCE will raise interrupt if any task is done or error. However, GCE is fast, so CPU may get multiple done tasks when it is running ISR. In case of error, that GCE thread will pause and raise interrupt. So, CPU may get multiple done tasks and one error task.I think we should reimplement the ISR mechanism. Can't we just read CURR_IRQ_STATUS and THR_IRQ_STATUS in the handler and leave cmdq_handle_error_done to the thread_fn? You will need to pass information from the handler to thread_fn, but that shouldn't be an issue. AFAIK interrupts are disabled in the handler, so we should stay there as short as possible. Traversing task_busy_list is expensive, so we need to do it in a thread context.Actually, our initial implementation is similar to your suggestion, but display needs CMDQ to return callback function very precisely, else display will drop frame. For display, CMDQ interrupt will be raised every 16 ~ 17 ms, and CMDQ needs to call callback function in ISR. If we defer callback to workqueue, the time interval may be larger than 32 ms.sometimes.I think the problem is, that you implemented the workqueue as a ordered workqueue, so there is no parallel processing. I'm still not sure why you need the workqueue to be ordered. Can you please explain.The order should be kept. Let me use mouse cursor as an example. If task 1 means move mouse cursor to point A, task 2 means point B, and task 3 means point C, our expected result is A -> B -> C. If the order is not kept, the result could become A -> C -> B.Got it, thanks for the clarification.I think a way to get rid of the workqueue is to use a timer, which gets programmed to the time a timeout in the first task in the busy list would happen. Everytime we update the busy list (e.g. because of task got finished by the thread), we update the timer. When the timer triggers, which hopefully won't happen too often, we return timeout on the busy list elements, until the time is lower then the actual time. At least with this we can reduce the data structures in this driver and make it more lightweight.From my understanding, your proposed method can handle timeout case. However, the workqueue is also in charge of releasing tasks. Do you take releasing tasks into consideration by using the proposed timer method? Furthermore, I think the code will become more complex if we also use timer to implement releasing tasks.
Can't we call
clk_disable_unprepare(cmdq->clock);
cmdq_task_release(task);
after invoking the callback?
Regrading the clock, wouldn't it be easier to handle the clock
enable/disable depending on the state of task_busy_list? I suppose we
can't as we would need to check the task_busy_list of all threads, right?
Regards,
Matthias
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html