Thread: 55 messages, 12 authors, 2007-03-22

Re: [RFC] div64_64 support

From: David Miller <davem@davemloft.net>
Date: 2007-03-06 21:58:38
Also in: lkml

From: Stephen Hemminger <redacted>
Date: Tue, 6 Mar 2007 10:29:41 -0800
/* calculate the cube root of a using Newton-Raphson */
static uint32_t ncubic(uint64_t a)
{
	uint64_t x;

	/* Initial estimate is based on:
	 * cbrt(x) = exp(log(x) / 3)
	 */
	x = 1u << (fls64(a)/3);

	/* Converges in 3 iterations to > 32 bits */

	x = (2 * x + div64_64(a, x*x)) / 3;
	x = (2 * x + div64_64(a, x*x)) / 3;
	x = (2 * x + div64_64(a, x*x)) / 3;

	return x;
}
Indeed that will be the fastest variant for cpus with hw
integer division.

I did a quick sparc64 port, here is what I got:

Function     clocks  mean(us) max(us)  std(us)  total error
ocubic          529     0.35    15.16     0.66 545101
ncubic          498     0.33    12.83     0.36 576263
acbrt           427     0.28    11.04     0.33 547562
hcbrt           393     0.26    10.18     0.47 2410
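
For anyone who wants to try the quoted Newton-Raphson variant outside the kernel, here is a minimal userspace sketch. It is not the code that was benchmarked above. The names fls64_sketch() and ncubic_sketch() are made up for illustration, plain 64-bit '/' stands in for div64_64(), and the early return for a == 0 is an addition: with a == 0 the first step would give x == 0 and the next step would divide by zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's fls64(): returns the 1-based
 * position of the most significant set bit, or 0 if v is 0. */
static int fls64_sketch(uint64_t v)
{
	int bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/* Integer cube root estimate by Newton-Raphson, following the quoted
 * ncubic(). Assumption: ordinary 64-bit division replaces div64_64(). */
static uint32_t ncubic_sketch(uint64_t a)
{
	uint64_t x;

	/* Added guard: a == 0 would lead to a division by zero below. */
	if (a == 0)
		return 0;

	/* Initial estimate: 2^(bit length of a / 3), roughly
	 * exp(log(a) / 3) computed in powers of two. */
	x = (uint64_t)1 << (fls64_sketch(a) / 3);

	/* Three Newton steps for f(x) = x^3 - a:
	 * x' = (2*x + a / x^2) / 3. For a >= 1, x stays >= 1. */
	x = (2 * x + a / (x * x)) / 3;
	x = (2 * x + a / (x * x)) / 3;
	x = (2 * x + a / (x * x)) / 3;

	return (uint32_t)x;
}
```

With this sketch, ncubic_sketch(27) gives 3 and ncubic_sketch(1000000) gives 100. Because every step truncates, results for non-cubes can be off by one in either direction, so do not read it as an exact floor of the cube root.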