Thread (13 messages) 13 messages, 3 authors, 2017-02-13

RE: [PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.

From: Mintz, Yuval <hidden>
Date: 2017-02-12 11:27:20

On Sat, Feb 11, 2017 at 09:58:10AM +0100, Richard Cochran wrote:
quoted
If I am not mistaken, then you can skip the cases val==2 and val==3,
because they are equivalent to val==4 and 6.
I took a stab at this, and you can see the result, below.  My version has
lower average error than yours in the interval 1 < ppb < 60000, and it uses
only 8 64-bit divisions.  Outside of that interval, your version has lower error.

So, at the very least, you should introduce a threshold and use this
algorithm for adjustments under 60 ppm.  Better yet, find a way to use fewer
divisions for adjustments greater and 60 ppm...

Thanks,
Richard

---
#include <stdint.h>

#define TEN9		(1000000000UL)

unsigned int calc_min_integer(uint64_t ppb, uint64_t *M) {
	uint64_t err, m, min, n, N, p2, reg;

	min = TEN9;
	for (n = 4; n <= 7; n++) {
		m = n * TEN9;
		m = (m + ppb/2) / ppb;

		/*truncate to HW resolution*/
		reg = (m + 8) / 16;
		m = reg * 16;

		p2 = (n * TEN9 + m/2) / m;

		err = ppb > p2 ? ppb - p2 : p2 - ppb;

		if (min >= err) {
			min = err;
			N = n;
			*M = m;
		}
	}
	return N;
}
Richard, there are quite a bit of inaccuracies in the calculation here.
I think it's best I'll clarify some things about the used algorithm -

Given a fraction represented by (ppb / 10^9), the purpose of
the loop is to best approximate it by a (val / ((16 * period) + 8)
fraction, where val belongs to {0,..,7} and period to {0,...,0xffffff}.
We'd need 'val' and 'period' from the calculation to configure the HW.

The '+8' is not some sort of rounding correction, but rather part
of the required configuration.
Notice that if we wouldn't have needed it, we could have
reduced some of the iterations [since (val / 16 * period) can be
reduced for powers of two]. But given that we do, skipping those
iteration would cause us to lose accuracy.

Your suggestion seems to:
  a. Assume that the required period should be in ns, not in
      16*ns units.
  b. mishandles the +8/-8 in the calculation.
  c. Doesn't seem to consider the upper bound on period.
In addition, the rounding-up via ppb is likely to harm accuracy for
higher ppb values.

One thing I still don't get is *why* we're trying to optimize this
area of the code -
I ran a variant of the original code in a loop [on requiring 24
divisions per Call] on various PPB values, and managed close
to 1M/s [single CPU, 64bit, 2.50GHz].

Given that the real use-case is somewhere between 1-16/second,
why do we treat it as if its performance is crucial?

Thanks,
Yuval
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help