Thread: 12 messages, 3 authors, 2022-05-02

Re: [PATCH v1 0/4] can: ctucanfd: cleanup according to the actual rules and documentation linking

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: 2022-05-02 07:33:22
Also in: linux-can, lkml

On 29.04.2022 23:31:28, Pavel Pisa wrote:
Split into separate patches and applied.
Excuse me for the late reply, and many thanks for splitting it into the
preferred form. Matej Vasilevski has tested the updated linux-can-next testing
branch on the Xilinx Zynq 7000 based MZ_APO board and used it with his
patches to proceed with the next round of testing of Jan Charvat's NuttX
TWAI (CAN) driver on ESP32C3. We plan to send CTU CAN FD timestamping
for RFC/discussion soon.
Sounds good!
I would like to thank Andrew Dennison, who implemented, tested
and shares the integration with LiteX and RISC-V

  https://github.com/litex-hub/linux-on-litex-vexriscv

He uses a development version of the CTU CAN FD IP core with a configurable
number of Tx buffers (2 to 8), which will require
automatic setup logic in the driver.

I need to discuss the actual state and his plans with Ondrej Ille.
But basically, ntxbufs in ctucan_probe_common() has to be assigned
from the TXTB_INFO TXT_BUFFER_COUNT field. For older core versions,
the TXT_BUFFER_COUNT field bits should read as zero, so when the
value is zero, the original version with 4 fixed buffers will
be recognized.
Makes sense
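
To make the fallback rule above concrete, here is a minimal sketch of the
decoding step. The helper name, the legacy constant and the mask (count field
assumed to sit in the low 4 bits of TXTB_INFO) are illustrative assumptions,
not taken from the driver or the core datasheet:

```c
#include <stdint.h>

/* ASSUMPTION: TXT_BUFFER_COUNT occupies the low 4 bits of TXTB_INFO. */
#define TXTB_INFO_TXT_BUFFER_COUNT_MASK 0x000Fu

/* Buffer count of the original core with a fixed number of TXT buffers. */
#define CTUCAN_LEGACY_NTXBUFS 4u

/*
 * Hypothetical helper: derive the number of TXT buffers from a raw
 * TXTB_INFO register value.
 */
static unsigned int ctucan_decode_ntxbufs(uint32_t txtb_info)
{
	unsigned int count = txtb_info & TXTB_INFO_TXT_BUFFER_COUNT_MASK;

	/* Older cores read the field as zero: assume the fixed 4 buffers. */
	return count ? count : CTUCAN_LEGACY_NTXBUFS;
}
```

In the driver, the result would be what gets assigned to ntxbufs in
ctucan_probe_common() after reading the register.
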
When the value is configurable, then for an (uncommon) number
of buffers which is not a power of two, there will likely be
a problem with the way the buffer queue is implemented

  txtb_id = priv->txb_head % priv->ntxbufs;
  ...
  priv->txb_head++;
  ...
  priv->txb_tail++;

When I provided an example of this type of queue many years
ago, I probably showed it with power-of-2 masking,
but modulo by an arbitrary number does not work across sequence
counter overflow. That unfortunately means adding two "if"s there

  if (++priv->txb_tail == 2 * priv->ntxbufs)
      priv->txb_tail = 0;
There's another way to implement this, here for ring->obj_num being
power of 2:

| static inline u8 mcp251xfd_get_tx_head(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->head & (ring->obj_num - 1);
| }
| 
| static inline u8 mcp251xfd_get_tx_tail(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->tail & (ring->obj_num - 1);
| }
| 
| static inline u8 mcp251xfd_get_tx_free(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->obj_num - (ring->head - ring->tail);
| }

If you want to allow not power of 2 ring->obj_num, use "% ring->obj_num"
instead of "& (ring->obj_num - 1)".
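
For illustration, a small sketch of the counter overflow concern from the
quoted text and of the "wrap at 2 * n" alternative. All names here are
hypothetical and 8-bit counters are used only to make the overflow easy to
see; this is not code from either driver:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Slot index taken from a free-running 8-bit counter. This is only
 * continuous across the 255 -> 0 overflow when n divides 256, i.e. when
 * n is a power of 2. For n = 6: 255 % 6 == 3, but the next counter value
 * 0 maps to slot 0 instead of slot 4.
 */
static uint8_t slot_free_running(uint8_t counter, uint8_t n)
{
	return counter % n;
}

/* Hypothetical queue whose counters are wrapped explicitly at 2 * n. */
struct txq {
	uint8_t head;	/* next slot to fill, kept in [0, 2n) */
	uint8_t tail;	/* oldest slot in flight, kept in [0, 2n) */
	uint8_t n;	/* number of TX buffers, 1..127, any value */
};

/* Advance a counter by one, wrapping at 2 * n (the extra "if"). */
static uint8_t txq_advance(uint8_t counter, uint8_t n)
{
	if (++counter == 2 * n)
		counter = 0;
	return counter;
}

/* Map a counter in [0, 2n) to a buffer slot in [0, n) without modulo. */
static uint8_t txq_slot(uint8_t counter, uint8_t n)
{
	return counter >= n ? counter - n : counter;
}

/* Number of buffers in use; the 2n range keeps full and empty distinct. */
static uint8_t txq_used(const struct txq *q)
{
	int diff = q->head - q->tail;

	return diff < 0 ? diff + 2 * q->n : diff;
}

static bool txq_empty(const struct txq *q)
{
	return q->head == q->tail;
}

static bool txq_full(const struct txq *q)
{
	return txq_used(q) == q->n;
}
```

With wider counters the free-running modulo variant misbehaves far less
often, but the discontinuity at overflow remains for any ring size that is
not a power of 2.
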

I'm not sure if there is a real world benefit (only gut feeling, should
be measured) of using more than 4, but less than 8 TX buffers.

You can make use of more TX buffers, if you implement (fully hardware
based) TX IRQ coalescing (== handle more than one TX complete interrupt
at a time) like in the mcp251xfd driver, or BQL support (== send more
than one TX CAN frame at a time). I've played a bit with BQL support on
the mcp251xfd driver (which is attached by SPI), but with mixed results.
Probably an issue with proper configuration.
We need a 2 * priv->ntxbufs range to distinguish an empty from a full queue...
But modulo is not nice either, so I will probably come up with some other
solution in the longer term. In the long term, I want to implement
virtual queues to allow multiqueue to use the dynamic Tx priority
of up to 8 buffers...
ACK, multiqueue TX support would be nice for things like the Earliest TX
Time First scheduler (ETF). 1 TX queue for ETF, the other for bulk
messages.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
