RE: [PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.

From: Mintz, Yuval <hidden>
Date: 2017-02-12 11:27:20

On Sat, Feb 11, 2017 at 09:58:10AM +0100, Richard Cochran wrote:

quoted

If I am not mistaken, then you can skip the cases val==2 and val==3,
because they are equivalent to val==4 and 6.

I took a stab at this, and you can see the result, below.  My version has
lower average error than yours in the interval 1 < ppb < 60000, and it uses
only 8 64-bit divisions.  Outside of that interval, your version has lower error.

So, at the very least, you should introduce a threshold and use this
algorithm for adjustments under 60 ppm.  Better yet, find a way to use fewer
divisions for adjustments greater and 60 ppm...

Thanks,
Richard

---
#include <stdint.h>

#define TEN9		(1000000000UL)

unsigned int calc_min_integer(uint64_t ppb, uint64_t *M) {
	uint64_t err, m, min, n, N, p2, reg;

	min = TEN9;
	for (n = 4; n <= 7; n++) {
		m = n * TEN9;
		m = (m + ppb/2) / ppb;

		/*truncate to HW resolution*/
		reg = (m + 8) / 16;
		m = reg * 16;

		p2 = (n * TEN9 + m/2) / m;

		err = ppb > p2 ? ppb - p2 : p2 - ppb;

		if (min >= err) {
			min = err;
			N = n;
			*M = m;
		}
	}
	return N;
}

Richard, there are quite a bit of inaccuracies in the calculation here.
I think it's best I'll clarify some things about the used algorithm -

Given a fraction represented by (ppb / 10^9), the purpose of
the loop is to best approximate it by a (val / ((16 * period) + 8)
fraction, where val belongs to {0,..,7} and period to {0,...,0xffffff}.
We'd need 'val' and 'period' from the calculation to configure the HW.

The '+8' is not some sort of rounding correction, but rather part
of the required configuration.
Notice that if we wouldn't have needed it, we could have
reduced some of the iterations [since (val / 16 * period) can be
reduced for powers of two]. But given that we do, skipping those
iteration would cause us to lose accuracy.

Your suggestion seems to:
  a. Assume that the required period should be in ns, not in
      16*ns units.
  b. mishandles the +8/-8 in the calculation.
  c. Doesn't seem to consider the upper bound on period.
In addition, the rounding-up via ppb is likely to harm accuracy for
higher ppb values.

One thing I still don't get is *why* we're trying to optimize this
area of the code -
I ran a variant of the original code in a loop [on requiring 24
divisions per Call] on various PPB values, and managed close
to 1M/s [single CPU, 64bit, 2.50GHz].

Given that the real use-case is somewhere between 1-16/second,
why do we treat it as if its performance is crucial?

Thanks,
Yuval

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help