Re: PPC bn_div_words routine rewrite
From: David Ho <hidden>
Date: 2005-07-05 20:21:12
Let's take the first call to BN_div_word from BN_bn2dec as an example. The parameters passed to BN_div_word are (a=35, w=1000000000) (decimal numbers). It then calls bn_div_words with (h=0, l=35, d=1000000000).

If you examine the code in linux_ppc32.s, it exits early because h is 0. The routine returns the result of a divide by 0, which is undefined according to the manual. On the ppc8xx the result is 0x80000000, so this is the return value from bn_div_words, as seen in register R3.

What happens next is that BN_div_word modifies "a" (the 1st parameter) with the result (0x80000000) and returns 23 as the remainder of the division. So "a" never becomes zero, and hence the test for BN_is_zero is always false. The program fails the very first time it uses bn_div_words.

The next thing I did, naturally, was to fix the case where h=0, which you can quite easily do with the native divwu instruction. Lo and behold, I was once again disappointed when h is not equal to 0.

More to come...

On 7/5/05, David Ho [off-list ref] wrote:
I can tell you with certainty, with reference to the function BN_bn2dec, that lp is a pointer that is incremented within the while loop around bn_print.c:136. Because the test BN_is_zero(t) is always false, the pointer goes off into the stratosphere, hence the segfault on ppc8xx.

More analysis to come.

On 7/5/05, David Ho [off-list ref] wrote:
First pass debugging results from gdb on ppc8xx. Executing ssh-keygen with the following arguments.

(gdb) show args
Argument list to give program being debugged when it is started is "-t rsa1 -f /etc/ssh/ssh_host_key -N """.

Program received signal SIGSEGV, Segmentation fault.
BN_bn2dec (a=0x1002d9f0) at bn_print.c:136
136             *lp=BN_div_word(t,BN_DEC_CONV);
(gdb) i r
r0             0x0          0
r1             0x7fffd580   2147472768
r2             0x30012868   805382248
r3             0x80000000   2147483648
r4             0xfef33fc    267334652
r5             0x25         37
r6             0xfccdef8    265084664
r7             0x7fffd4c0   2147472576
r8             0xfbad2887   4222429319
r9             0x84044022   2214871074
r10            0x0          0
r11            0x2          2
r12            0xfef2054    267329620
r13            0x10030bc8   268635080
r14            0x0          0
r15            0x0          0
r16            0x0          0
r17            0x0          0
r18            0x0          0
r19            0x0          0
r20            0x0          0
r21            0x0          0
r22            0x0          0
r23            0x64         100
r24            0x5          5
r25            0x1002d438   268620856
r26            0x1002d9f0   268622320
r27            0x1002c578   268617080
r28            0x1          1
r29            0x10031000   268636160
r30            0xffbf7d0    268171216
r31            0x1002d9f0   268622320
pc             0xfef2058    267329624
ps             0xd032       53298
cr             0x24044022   604258338
lr             0xfef2054    267329620
ctr            0xfccefa0    265088928
xer            0x20000000   536870912
fpscr          0x0          0
vscr           0x0          0
vrsave         0x0          0
(gdb) p/x $pc
$1 = 0xfef2058
0x0fef2058 <BN_bn2dec+472>:     stw     r3,0(r29)
(gdb) x 0x10031000
0x10031000:     Cannot access memory at address 0x10031000

On 7/5/05, David Ho [off-list ref] wrote:
This is the second confirmed report of the same problem on the ppc8xx.
After reading my email, I must say I was the unfriendly one. I apologize for that. More debugging evidence to come.

---------- Forwarded message ----------
From: Murch, Christopher <redacted>
Date: Jul 1, 2005 9:46 AM
Subject: RE: PPC bn_div_words routine rewrite
To: David Ho <redacted>

David,

I had observed the same issue on ppc 8xx machines after upgrading to the asm version of the BN routines. Thank you very much for your work on the fix. My question is, do you have high confidence in the other new asm ppc BN routines after observing this issue, or do you think they might have similar problems?

Thanks.
Chris

-----Original Message-----
From: David Ho [mailto:davidkwho@gmail.com]
Sent: Thursday, June 30, 2005 6:22 PM
To: openssl-dev@openssl.org; linuxppc-embedded@ozlabs.org
Subject: Re: PPC bn_div_words routine rewrite

The reason I had to redo this routine, in case anyone is wondering, is that ssh-keygen segfaults when this assembly routine returns junk to the BN_div_word function.

On a ppc, if you issue the command

ssh-keygen -t rsa1 -f /etc/ssh/ssh_host_key -N ""

the program craps out when it tries to write the public key in ASCII decimal.

Regards,
David

On 6/30/05, David Ho [off-list ref] wrote:
Hi all,

This is a rewrite of the bn_div_words routine for the PowerPC arch, tested on an MPC8xx processor. I initially thought there might be a small mistake in the code that required a one-line change, but it turns out I had to redo the routine.

I guess this routine is not called very often, as I see that most other routines are hand-crafted, whereas this routine is compiled from a C function that apparently has not gone through a whole lot of testing.

I wrote a C function to confirm correctness of the code.

unsigned long div_words (unsigned long h, unsigned long l, unsigned long d)
{
    unsigned long i_h;      /* intermediate dividend */
    unsigned long i_q;      /* quotient of i_h/d */
    unsigned long i_r;      /* remainder of i_h/d */
    unsigned long i_cntr;
    unsigned long i_carry;
    unsigned long ret_q;    /* return quotient */

    /* cannot divide by zero */
    if (d == 0) return 0xffffffff;

    /* do simple 32-bit divide */
    if (h == 0) return l/d;

    i_q = h/d;
    i_r = h - (i_q*d);
    ret_q = i_q;

    i_cntr = 32;
    while (i_cntr--)
    {
        i_carry = (l & 0x80000000) ? 1:0;
        l = l << 1;
        i_h = (i_r << 1) | i_carry;
        i_q = i_h/d;
        i_r = i_h - (i_q*d);
        ret_q = (ret_q << 1) | i_q;
    }
    return ret_q;
}

Then I handcrafted the routine in PPC assembly. The result is a 26-line assembly routine that is easy to understand and predictable, as opposed to an 81-liner that I am still trying to decipher...

If anyone is interested in incorporating this routine into the openssl
code I'll be happy to assist. At this point I think I will be taking a bit of a break from this 3-day debugging/fixing marathon.

Regards,
David Ho

#
#       Handcrafted version of bn_div_words
#
#       r3 = h
#       r4 = l
#       r5 = d

        cmplwi  0,r5,0                          # compare r5 and 0
        bc      BO_IF_NOT,CR0_EQ,.Lppcasm_div1  # proceed if d!=0
        li      r3,-1                           # d=0 return -1
        bclr    BO_ALWAYS,CR0_LT
.Lppcasm_div1:
        cmplwi  0,r3,0                          # compare r3 and 0
        bc      BO_IF_NOT,CR0_EQ,.Lppcasm_div2  # proceed if h != 0
        divwu   r3,r4,r5                        # ret_q = l/d
        bclr    BO_ALWAYS,CR0_LT                # return result in r3
.Lppcasm_div2:
        divwu   r9,r3,r5                        # i_q = h/d
        mullw   r10,r9,r5                       # i_r = h - (i_q*d)
        subf    r10,r10,r3
        mr      r3,r9                           # ret_q = i_q
.Lppcasm_set_ctr:
        li      r12,32                          # ctr = bitsizeof(d)
        mtctr   r12
.Lppcasm_div_loop:
        addc    r4,r4,r4                        # l = l << 1 -> i_carry
        adde    r11,r10,r10                     # i_h = (i_r << 1) | i_carry
        divwu   r9,r11,r5                       # i_q = i_h/d
        mullw   r10,r9,r5                       # i_r = i_h - (i_q*d)
        subf    r10,r10,r11
        add     r3,r3,r3                        # ret_q = ret_q << 1 | i_q
        add     r3,r3,r9
        bc      BO_dCTR_NZERO,CR0_EQ,.Lppcasm_div_loop
.Lppc_div_end:
        bclr    BO_ALWAYS,CR0_LT                # return result in r3
        .long   0x00000000

_______________________________________________
Linuxppc-embedded mailing list
Linuxppc-embedded@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-embedded
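
As a cross-check on the bit-at-a-time division described in the original post, here is a minimal sketch in C. It is neither the code from this thread nor OpenSSL's implementation, and the names wide_div and shift_div_words are hypothetical. It assumes 32-bit words and the precondition h < d, so that the quotient fits in one word. Unlike the C model above, it tracks the bit shifted out of the running remainder, because in 32-bit arithmetic a left shift of the remainder can overflow once d is 2^31 or larger.

```c
#include <assert.h>
#include <stdint.h>

/* Reference result: divide the 64-bit value (h:l) by d with native
 * 64-bit arithmetic. Assumes d != 0 and h < d. */
static uint32_t wide_div(uint32_t h, uint32_t l, uint32_t d)
{
    assert(d != 0 && h < d);
    return (uint32_t)((((uint64_t)h << 32) | l) / d);
}

/* Restoring division using only 32-bit operations: one quotient bit
 * per iteration. Assumes d != 0 and h < d. */
static uint32_t shift_div_words(uint32_t h, uint32_t l, uint32_t d)
{
    uint32_t r = h;   /* running remainder, always < d between iterations */
    uint32_t q = 0;   /* quotient built up from the most significant bit */
    int i;

    assert(d != 0 && h < d);
    for (i = 0; i < 32; i++) {
        uint32_t top = r >> 31;        /* bit about to be shifted out of r */
        r = (r << 1) | (l >> 31);      /* bring in the next bit of l */
        l <<= 1;
        q <<= 1;
        if (top || r >= d) {           /* the full 33-bit value is >= d */
            r -= d;                    /* wraps modulo 2^32 to the right value */
            q |= 1;
        }
    }
    return q;
}
```

Calling shift_div_words(0, 35, 1000000000) and wide_div(0, 35, 1000000000) both give 0, which is the quotient BN_bn2dec needs to see for its loop to terminate.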