Thread (4 messages) 4 messages, 2 authors, 2014-09-30

[PATCH v2] i2c: rk3x: adjust the LOW divison based on characteristics of SCL

From: dianders@chromium.org (Doug Anderson)
Date: 2014-09-30 22:23:40
Also in: linux-i2c, lkml

Addy,

On Mon, Sep 29, 2014 at 12:45 AM, addy ke [off-list ref] wrote:
Hi Doug

On 2014/9/29 12:39, Doug Anderson wrote:
quoted
Addy,

On Sat, Sep 27, 2014 at 12:11 AM, Addy Ke [off-list ref] wrote:
quoted
From: Addy <redacted>

As show in I2C specification:
- Standard-mode: the minimum HIGH period of the scl clock is 4.0us
                 the minimum LOW period of the scl clock is 4.7us
- Fast-mode: the minimum HIGH period of the scl clock is 0.6us
             the minimum LOW period of the scl clock is 1.3us

I have measured i2c SCL waveforms in fast-mode by oscilloscope
on rk3288-pinky board. the LOW period of the scl clock is 1.3us.
It is so critical that we must adjust LOW division to increase
the LOW period of the scl clock.

After put this patch, we can find that min_low_ns is about
5.4us in Standard-mode and 1.6us in Fast-mode.

Thanks Doug for the suggestion about division formulas.

Signed-off-by: Addy <redacted>
---
Changes in v2:
- remove Fast-mode plus and HS-mode
- use new formulas suggested by Doug

 drivers/i2c/busses/i2c-rk3x.c | 104 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 97 insertions(+), 7 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
index e637c32..81672a8 100644
--- a/drivers/i2c/busses/i2c-rk3x.c
+++ b/drivers/i2c/busses/i2c-rk3x.c
@@ -428,19 +428,109 @@ out:
        return IRQ_HANDLED;
 }

+static void rk3x_i2c_get_min_ns(unsigned int scl_rate,
+                               unsigned long *min_low_ns,
+                               unsigned long *min_high_ns)
+{
+       WARN_ON(scl_rate > 400000);
+
+       /* As show in I2C specification:
+        * - Standard-mode(100KHz):
+        *   min_low_ns is 4700ns
+        *   min_high_ns is 4000ns
+        * - Fast-mode(400KHz):
+        *   min_low_ns is 1300ns
+        *   min_high_ns is 600ns
+        *
+        * (min_low_ns + min_high_ns) up to 10000ns in Standard-mode
+        * and 2500ns in Fast-mode
+        */
+       if (scl_rate <= 100000) {
+               *min_low_ns = 4700 + 650;
+               *min_high_ns = 4000 + 650;
+       } else {
+               *min_low_ns = 1300 + 300;
+               *min_high_ns = 600 + 300;
Wait, where did the extra 650 and 300 come from here?  Are you just
trying to balance things out?  That's not the right way to do things.
Please explain what you were trying to accomplish and we can figure
out a better way.

...maybe this helped make thins nicer in the clock rates you tried,
but I'd bet that I can find clock rates that this is broken for...
650 = [(1000000000/100K) ns - min_low_ns_100k - min_high_ns_100k] / 2
300 = [(1000000000/300K) ns - min_low_ns_400k - min_high_ns_400k] / 2

if there is no extra 650 and 300:
extra_div is 3 in fast-mode and 11 in standard-mode.
Umm, OK.  Except that "min_low_ns" is no longer the actual minimum
"ns" we need according to the spec.  That will mean that you're
sometimes going to end up with a clock rate that's slower than you
want.

  1)if all of the "extra" is asssigned to the high part, the LOW period is too close to min_low.
    In my test:
    400k, scl_low_ns: 1440ns, scl_high_ns: 1120ns
    100k, scl_low_ns: 4800ns, scl_high_ns: 5120ns
Why does it matter if it's close to "min_low"?  It's above the minimum
so it's fine, right?  If you need a buffer, add an explicit buffer to
the calculations.

  2)if not, dato hold time is too close to min_hold_time(400khz, 900ns)
    In my test:
    400k, scl_low_ns: 1760ns, scl_high_ns: 800ns
    100k, scl_low_ns: 5304ns, scl_high_ns: 4368ns

    hold_time_ns ~= scl_low_ns / 2 = 880ns
It's still under, though.  Why does this matter?  Again, if you need a
buffer then add a buffer.

So I think we must decrease min_low_ns:min_high_ns to decrease scl_low in case 2

So I set the extra 650 and 300.

After this, in my test:
   400k, scl_low_ns: 1600ns, scl_high_ns: 960ns
   100k, scl_low_ns: 5200ns, scl_hign_ns: 4576ns

I think this value is more reasonable.
You're testing with a very particular clock input rate.  The function
needs to be general and work across a lot of different input clock
rates.  That's why my python code has loops in it to test the results
over a huge variety of input clock rates...

---

Here are a few examples of differences (SEP 30 is my new proposal below):

With this example your code produces a slightly slower clock rate than
desirable:

SEP 30: CLK = 74250000, Req = 100000, act = 99798.39, 5.49 us low,
4.53 us high, low/high = 1.21 (50 41)
AddyV2: CLK = 74250000, Req = 100000, act = 98736.70, 5.39 us low,
4.74 us high, low/high = 1.14 (49 43)

Here's another example where your patch doesn't produce ideal timings
(in this case the difference is more pronounced)

SEP 30: CLK = 66380000, Req = 400000, act = 395119.05, 1.69 us low,
0.84 us high, low/high = 2.00 (13 6)
AddyV2: CLK = 66380000, Req = 400000, act = 377159.09, 1.69 us low,
0.96 us high, low/high = 1.75 (13 7)

Here's an example your code violates timings (true, a 1.61MHz clock
isn't super realistic in rk3288, but it does show a non-ideal case):

SEP 30: CLK = 1610000, Req = 100000, act = 67083.33, 4.97 us low, 9.94
us high, low/high = 0.50 (0 1)
AddyV2: CLK = 1610000, Req = 100000, act = 67083.33, 9.94 us low, 4.97
us high, low/high = 2.00 (1 0)
...ERROR: low too long 9.94 us > 6.85 us (/2 4.97 us > 3.42 us) (conflicting=0)

---

I'll paste new code here with a proposal that adds a buffer (could be
0) and also send you an attachment in a separate email (I can't quite
believe that the lists will handle attachments).

One thing you'll notice is that with sufficiently slow input clocks
it's possible to get into a state where we can't meet both the minimum
and maximum times for SCL being low.  This probably isn't super
realistic, but the code could certainly have a warning.

---

def DIV_ROUND_UP(a, b): return (a + b - 1) // b

# This is bassed on v2 sent by Addy (fixed DIV_ROUND_UP)
def test_it_addy_v2(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate):
  # Add the buffer in suggested by addy
  if min_low_ns == 4700:
    min_low_ns += 650
    min_high_ns += 650
  else:
    min_low_ns += 300
    min_high_ns += 300

  min_total_ns = min_low_ns + min_high_ns

  # Adjust to avoid overflow
  i2c_rate_khz = DIV_ROUND_UP(i2c_rate, 1000)
  scl_rate_khz = DIV_ROUND_UP(scl_rate, 1000)

  # We need the total div to be >= this number so we don't clock too fast.
  min_total_div = DIV_ROUND_UP(i2c_rate_khz, scl_rate_khz * 8);

  # These are the min dividers needed for min hold times.
  min_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, 8 * 1000000)
  min_high_div = DIV_ROUND_UP(i2c_rate_khz * min_high_ns, 8 * 1000000)
  min_div_for_hold = (min_low_div + min_high_div)

  if min_div_for_hold > min_total_div:
    # Time needed to meet hold requirements is important.  Just use that
    div_low = min_low_div
    div_high = min_high_div
  else:
    # We've got to distribute some time among the low and high so we
    # don't run too fast.
    extra_div = min_total_div - min_div_for_hold

    # We'll try to split things up perfectly evenly, biasing slightly
    # towards having a higher div for low (spend more time low).
    ideal_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns,
                                 scl_rate_khz * 8 * min_total_ns)

    # Handle when the ideal low div is going to take up more than we have
    if ideal_low_div > min_low_div + extra_div:
      assert ideal_low_div == min_low_div + extra_div + 1
      ideal_low_div = min_low_div + extra_div

    # Give low the "ideal" and give high whatever extra is left.
    div_low = ideal_low_div
    div_high = min_high_div + (extra_div - (ideal_low_div - min_low_div))

  # Adjust to the fact that the hardware has an implicit "+1".
  # NOTE: Above calculations always produce div_low > 0 and  div_high > 0.
  div_low -= 1
  div_high -= 1

  return div_low, div_high, False # Note: we don't calculate "conflicting"

# test_it_sep30 - Sept 30th proposal for i2c stuff
#
# min_low_ns:  The minimum number of ns we need to hold low to meet i2c spec
# min_high_ns: The minimum number of ns we need to hold high to meet i2c spec
# max_low_ns:  The maximum number of ns we can hold low to meet i2c spec
#
# Note: max_low_ns should be (max data hold time * 2 - buffer)
#       This is because the i2c host on Rockchip holds the data line for
#       half the low time.
def test_it_sep30(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate):
  min_total_ns = min_low_ns + min_high_ns

  # Adjust to avoid overflow
  i2c_rate_khz = DIV_ROUND_UP(i2c_rate, 1000)
  scl_rate_khz = DIV_ROUND_UP(scl_rate, 1000)

  # We need the total div to be >= this number so we don't clock too fast.
  min_total_div = DIV_ROUND_UP(i2c_rate_khz, scl_rate_khz * 8);

  # These are the min dividers needed for min hold times.
  min_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, 8 * 1000000)
  min_high_div = DIV_ROUND_UP(i2c_rate_khz * min_high_ns, 8 * 1000000)
  min_div_for_hold = (min_low_div + min_high_div)

  # This is the maximum divider so we don't go over the max.  We don't round up
  # here (we round down) since this is a max.
  max_low_div = i2c_rate_khz * max_low_ns / (8 * 1000000)

  # Giving different rounding, it's possible that min > max (!?!)  We're
  # totally out of luck in that case, so let's go with getting clocks
  # right I guess...
  if min_low_div > max_low_div:
    # print "WARNING: conflicting restrictions"
    conflicting = True
    max_low_div = min_low_div
  else:
    conflicting = False

  if min_div_for_hold > min_total_div:
    # Time needed to meet hold requirements is important.  Just use that
    div_low = min_low_div
    div_high = min_high_div
  else:
    # We've got to distribute some time among the low and high so we
    # don't run too fast.
    extra_div = min_total_div - min_div_for_hold

    # We'll try to split things up perfectly evenly, biasing slightly
    # towards having a higher div for low (spend more time low).
    ideal_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns,
                                 scl_rate_khz * 8 * min_total_ns)

    # Don't allow it to go over the max
    if ideal_low_div > max_low_div:
      ideal_low_div = max_low_div

    # Handle when the ideal low div is going to take up more than we have
    if ideal_low_div > min_low_div + extra_div:
      assert ideal_low_div == min_low_div + extra_div + 1
      ideal_low_div = min_low_div + extra_div

    # Give low the "ideal" and give high whatever extra is left.
    div_low = ideal_low_div
    div_high = min_high_div + (extra_div - (ideal_low_div - min_low_div))

  # Adjust to the fact that the hardware has an implicit "+1".
  # NOTE: Above calculations always produce div_low > 0 and  div_high > 0.
  div_low -= 1
  div_high -= 1

  return div_low, div_high, conflicting

def print_it(label, min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate,
             div_low, div_high, conflicting):
  T_pclk_us = 1000000. / i2c_rate
  T_sclk_us = 1000000. / scl_rate

  T_low_us = T_pclk_us * (div_low + 1) * 8
  T_high_us = T_pclk_us * (div_high + 1) * 8

  T_tot_us = (T_high_us + T_low_us)
  freq = 1000000. / T_tot_us

  print "%s: CLK = %d, Req = %d, act = %.2f, %.2f us low, " \
        "%.2f us high, low/high = %.2f (%d %d)" % (
        label,
        i2c_rate, scl_rate, freq, T_low_us,
        T_high_us, T_low_us / T_high_us, div_low, div_high)

  if T_low_us * 1000 < min_low_ns:
    print "...ERROR: not low long enough"
  if T_high_us * 1000 < min_high_ns:
    print "...ERROR: not high long enough"

  if T_low_us * 1000 > max_low_ns:
    print "...ERROR: low too long %.2f us > %.2f us " \
      "(/2 %.2f us > %.2f us) (conflicting=%d)" % (
      T_low_us, max_low_ns / 1000.,
      T_low_us / 2., max_low_ns / 2000.,
      conflicting)
  elif conflicting:
    print "...Conflicting, but not low too long?"

  return (i2c_rate, scl_rate, freq, T_low_us, T_high_us)


def test_it(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate):
  sep30 = test_it_sep30(min_low_ns, min_high_ns, max_low_ns,
                        i2c_rate, scl_rate)
  addy_v2 = test_it_addy_v2(min_low_ns, min_high_ns, max_low_ns,
                            i2c_rate, scl_rate)

  print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns,
           i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2])
  print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns,
           i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2])

  # Test to see where results are differet
  #if (sep30[0], sep30[1]) != (addy_v2[0], addy_v2[1]):
  #  print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns,
  #           i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2])
  #  print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns,
  #           i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2])

  # Test to see where clock rates are different.
  #if (sep30[0] + sep30[1]) != (addy_v2[0] + addy_v2[1]):
  #  print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns,
  #           i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2])
  #  print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns,
  #           i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2])


min_low_std_ns = 4700
min_high_std_ns = 4000
max_data_hold_std_ns = 3450
data_hold_buffer_std_ns = 50

min_low_fast_ns =  1300
min_high_fast_ns = 600
max_data_hold_fast_ns = 900
data_hold_buffer_fast_ns = 50

test_it(min_low_std_ns, min_high_std_ns,
        max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, 74250000, 100000)
test_it(min_low_std_ns, min_high_std_ns,
        max_data_hold_std_ns * 2 - data_hold_buffer_std_ns,  2000001, 100000)
test_it(min_low_std_ns, min_high_std_ns,
        max_data_hold_std_ns * 2 - data_hold_buffer_std_ns,  1610000, 100000)
test_it(min_low_std_ns, min_high_std_ns,
        max_data_hold_std_ns * 2 - data_hold_buffer_std_ns,  1484000, 100000)

test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 66380000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 74250000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 12800000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,  9400000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,  6400000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,  5000000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,  3200000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,  1600000, 400000)
test_it(min_low_fast_ns, min_high_fast_ns,
        max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns,   800000, 400000)

#for i in xrange(74250000, 800000, -100):
#  test_it(min_low_std_ns, min_high_std_ns,
#          max_data_hold_std_ns * 2 - data_hold_buffer_std_ns,  i, 100000)
#
#for i in xrange(74250000, 800000, -100):
#  test_it(min_low_fast_ns, min_high_fast_ns,
#          max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, i, 400000)

---

-Doug
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help