Thread: 98 messages, 11 authors, 2016-08-27

Re: [RFC PATCH 0/3] UART slave device bus

From: H. Nikolaus Schaller <hidden>
Date: 2016-08-21 07:51:10
Also in: linux-bluetooth, lkml

On 20.08.2016 at 15:22, One Thousand Gnomes [off-list ref] wrote:

On Fri, 19 Aug 2016 19:42:37 +0200
"H. Nikolaus Schaller" [off-list ref] wrote:
On 19.08.2016 at 13:06, One Thousand Gnomes [off-list ref] wrote:
If possible, please do a callback for every character that arrives.
And not only if the rx buffer becomes full, to give the slave driver
a chance to trigger actions almost immediately after every character.
This probably runs in interrupt context and can happen often.  
We don't realistically have the clock cycles to do that on a low end
embedded processor handling high speed I/O.  
well, if we have a low end embedded processor and high-speed I/O, then
buffering the data before processing doesn't help either since processing
still will eat up clock cycles.
Of course it helps. You are out of the IRQ handler within the 9 serial
clocks, so you can take another interrupt and grab the next byte. You
will also get benefits from processing the bytes further in blocks,
if there are benefits from processing blocks. That depends on the specific
protocol.

My proposal can still check each byte, place it in a buffer, and return from the
interrupt almost immediately. Only once a block is complete does it trigger
processing outside of the interrupt context.
and if you get too far behind you'll make the flow control limit.

You've also usually got multiple cores these days - although not on the
very low end quite often.
Indeed. But low-end hardware that also has to run Linux rarely has really
high-speed requirements. If it hits performance limits, some assembler code
will probably be used.

And UART is inherently slow compared to SPI or USB or Ethernet.
The question is if this is needed at all. If we have a bluetooth stack with HCI the
fastest UART interface I am aware of is running at 3 Mbit/s. 10 bits incl. framing
means 300 kByte/s, i.e. about 3.3 µs per byte to process. Should be enough to decide
if the byte should go to a buffer or not, check checksums, or discard and move
the protocol engine to a different state. This is what I assume would be done in
a callback. No processing needing some ms per frame.
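
To make the per-byte budget above concrete, here is a minimal sketch of the arithmetic in C. The function names are hypothetical, and the 10-bit frame (start bit, 8 data bits, stop bit) is the 8N1 assumption used in the text; `baud` must be non-zero.

```c
#include <stdint.h>

/* Nanoseconds between two back-to-back bytes at the given baud rate.
 * bits_per_frame is an assumption: 10 for 8N1 (start + 8 data + stop).
 * The result is truncated towards zero. */
static uint64_t uart_ns_per_byte(uint64_t baud, unsigned bits_per_frame)
{
    return (1000000000ULL * bits_per_frame) / baud;
}

/* Rough number of CPU cycles available per received byte for a core
 * clocked at cpu_hz. Ignores interrupt entry/exit cost and cache effects. */
static uint64_t uart_cycles_per_byte(uint64_t baud, unsigned bits_per_frame,
                                     uint64_t cpu_hz)
{
    return (cpu_hz * bits_per_frame) / baud;
}
```

At 3 Mbit/s with 10-bit frames this gives 3333 ns per byte, which is roughly 333 cycles on a 100 MHz core. Whether that is enough depends on the processor, which is the point of disagreement in this thread.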
That depends on the processor - remember people run Linux on low end CPUs
including those embedded in an FPGA not just high end PC and ARM class
devices.

The more important question is - purely for the receive side of things -
is a callback which guarantees to be called "soon" after the bytes arrive
sufficient.

If it is then almost no work is needed on the receive side to allow pure
kernel code to manage received data directly because the current
buffering support throughout the receive side is completely capable of
providing those services without a tty structure, and to anything which
can have a tty attached.
Let me ask a question about your centralized and pre-cooked buffering approach.

As far as I see, even then the kernel API must notify the driver at the right moment
that a new block has arrived. Right?

But how does the kernel API know how long such a block is?

Usually there is a start byte/character, sometimes a length indicator, then payload data,
some checksum and finally a stop byte/character. For NMEA it is $, no length, * and \r\n.
For other serial protocols it might be AT, no length, and \r. Or something different.
HCI seems to use 2 byte op-code or 1 byte event code and 1 byte parameter length.

So this means each protocol has a different block format.

How can a centralized solution manage such differently formatted blocks?

IMHO it can't without help from the device specific slave device driver. Which must
therefore be able to see every byte to decide into which category it goes. Which brings
us back to the every-byte-interrupt-context callback.
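
As an illustration of such a per-byte callback, the following is a minimal sketch of an NMEA framer in C. It is not taken from the patch series: `struct nmea_framer`, `nmea_rx_byte()` and the 82-character limit are assumptions made for this example. The function only classifies and stores the byte, so it stays cheap enough for interrupt context, and it returns true once a full sentence is buffered so the caller can defer the real processing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed maximum sentence length, including '$' and the trailing CR LF. */
#define NMEA_MAX_LEN 82

/* Hypothetical per-device framing state; zero-initialise before use. */
struct nmea_framer {
    char   buf[NMEA_MAX_LEN + 1];
    size_t len;
    bool   in_sentence;
};

/* Feed one received byte. Returns true when f->buf holds a complete,
 * NUL-terminated sentence (without CR LF), false otherwise. */
static bool nmea_rx_byte(struct nmea_framer *f, unsigned char c)
{
    if (c == '$') {                 /* start byte: (re)synchronise */
        f->len = 0;
        f->in_sentence = true;
        f->buf[f->len++] = (char)c;
        return false;
    }
    if (!f->in_sentence)
        return false;               /* noise between sentences: discard */
    if (c == '\r')
        return false;               /* skip CR, the LF ends the sentence */
    if (c == '\n') {
        f->buf[f->len] = '\0';
        f->in_sentence = false;
        return true;                /* caller schedules deferred processing */
    }
    if (f->len >= NMEA_MAX_LEN) {   /* overlong sentence: drop and resync */
        f->in_sentence = false;
        return false;
    }
    f->buf[f->len++] = (char)c;
    return false;
}
```

A driver would call `nmea_rx_byte()` once per received character and, on a true return, hand the buffer to code running outside interrupt context. A different protocol (AT responses, HCI) would need a different state machine, which is the argument made above.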

This is different from well formatted protocols like SPI or I2C or Ethernet etc.
where the controller decodes the frame boundaries and DMA can store the
payload data and an interrupt occurs for every received block.

So I would even conclude that you usually can't even use DMA based UART receive
processing for arbitrary and not well-defined protocols. Or have to assume that the
protocol is 100% request-response based and a timeout can tell that no more data
will be received - until a new request has been sent.
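
The timeout fallback mentioned above could look roughly like this. It is only a sketch under stated assumptions: `struct idle_framer` and both functions are hypothetical names, timestamps are passed in by the caller in nanoseconds, and the idle threshold has to be chosen per device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for "a block ends when the line has been idle". */
struct idle_framer {
    uint64_t last_rx_ns;   /* timestamp of the most recent byte or DMA chunk */
    uint64_t idle_ns;      /* gap after which the block counts as complete */
    bool     have_data;    /* true while an unreported block is pending */
};

/* Record that data arrived at time now_ns. */
static void idle_framer_rx(struct idle_framer *f, uint64_t now_ns)
{
    f->last_rx_ns = now_ns;
    f->have_data = true;
}

/* Poll at time now_ns. Returns true exactly once per block, as soon as
 * no data has arrived for at least idle_ns. */
static bool idle_framer_poll(struct idle_framer *f, uint64_t now_ns)
{
    if (f->have_data && now_ns - f->last_rx_ns >= f->idle_ns) {
        f->have_data = false;
        return true;
    }
    return false;
}
```

This only works when the device really goes quiet between blocks, i.e. for request-response style protocols. A continuously streaming device would never trigger the timeout.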
Doesn't solve transmit or configuration but it's one step that needs no
additional real work and re-invention.

Alan
BR,
Nikolaus