Annotation of sys/arch/m68k/fpsp/slogn.sa, Revision 1.1
1.1 ! nbrk 1: * $OpenBSD: slogn.sa,v 1.3 2003/11/07 10:36:10 miod Exp $
! 2: * $NetBSD: slogn.sa,v 1.3 1994/10/26 07:49:54 cgd Exp $
! 3:
! 4: * MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
! 5: * M68000 Hi-Performance Microprocessor Division
! 6: * M68040 Software Package
! 7: *
! 8: * M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
! 9: * All rights reserved.
! 10: *
! 11: * THE SOFTWARE is provided on an "AS IS" basis and without warranty.
! 12: * To the maximum extent permitted by applicable law,
! 13: * MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
! 14: * INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
! 15: * PARTICULAR PURPOSE and any warranty against infringement with
! 16: * regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
! 17: * and any accompanying written materials.
! 18: *
! 19: * To the maximum extent permitted by applicable law,
! 20: * IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
! 21: * (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
! 22: * PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
! 23: * OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
! 24: * SOFTWARE. Motorola assumes no responsibility for the maintenance
! 25: * and support of the SOFTWARE.
! 26: *
! 27: * You are hereby granted a copyright license to use, modify, and
! 28: * distribute the SOFTWARE so long as this entire notice is retained
! 29: * without alteration in any modified and/or redistributed versions,
! 30: * and that such modified versions are clearly identified as such.
! 31: * No licenses are granted by implication, estoppel or otherwise
! 32: * under any patents or trademarks of Motorola, Inc.
! 33:
! 34: *
! 35: * slogn.sa 3.1 12/10/90
! 36: *
! 37: * slogn computes the natural logarithm of an
! 38: * input value. slognd does the same except the input value is a
! 39: * denormalized number. slognp1 computes log(1+X), and slognp1d
! 40: * computes log(1+X) for denormalized X.
! 41: *
! 42: * Input: Double-extended value in memory location pointed to by address
! 43: * register a0.
! 44: *
! 45: * Output: log(X) or log(1+X) returned in floating-point register Fp0.
! 46: *
! 47: * Accuracy and Monotonicity: The returned result is within 2 ulps in
! 48: * 64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
! 49: * result is subsequently rounded to double precision. The
! 50: * result is provably monotonic in double precision.
! 51: *
! 52: * Speed: The program slogn takes approximately 190 cycles for input
! 53: * argument X such that |X-1| >= 1/16, which is the usual
! 54: * situation. For those arguments, slognp1 takes approximately
! 55: * 210 cycles. For the less common arguments, the program will
! 56: * run no worse than 10% slower.
! 57: *
! 58: * Algorithm:
! 59: * LOGN:
! 60: * Step 1. If |X-1| < 1/16, approximate log(X) by an odd polynomial in
! 61: * u, where u = 2(X-1)/(X+1). Otherwise, move on to Step 2.
! 62: *
! 63: * Step 2. X = 2**k * Y where 1 <= Y < 2. Define F to be the first seven
! 64: * significant bits of Y plus 2**(-7), i.e. F = 1.xxxxxx1 in base
! 65: * 2 where the six "x" match those of Y. Note that |Y-F| <= 2**(-7).
! 66: *
! 67: * Step 3. Define u = (Y-F)/F. Approximate log(1+u) by a polynomial in u,
! 68: * log(1+u) = poly.
! 69: *
! 70: * Step 4. Reconstruct log(X) = log( 2**k * Y ) = k*log(2) + log(F) + log(1+u)
! 71: * by k*log(2) + (log(F) + poly). The values of log(F) are calculated
! 72: * beforehand and stored in the program.
! 73: *
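The four LOGN steps above can be sketched in Python as follows. This is a minimal illustration in double precision, not the package's code: the function name `logn_sketch` is hypothetical, `math.log(f)` stands in for the precomputed log(F) table, and plain Taylor coefficients stand in for the tuned polynomial constants.

```python
import math

def logn_sketch(x):
    """Table-driven natural log following Steps 1-4 (positive finite x only)."""
    if x <= 0.0:
        raise ValueError("sketch handles positive finite inputs only")
    # Step 1: near 1, use an odd polynomial in u = 2(X-1)/(X+1).
    if abs(x - 1.0) < 1.0 / 16.0:
        u = 2.0 * (x - 1.0) / (x + 1.0)
        v = u * u
        # Taylor coefficients of 2*atanh(u/2); an assumption of this sketch.
        return u + u * v * (1 / 12 + v * (1 / 80 + v * (1 / 448 + v / 2304)))
    # Step 2: X = 2**k * Y with 1 <= Y < 2.
    m, e = math.frexp(x)              # x = m * 2**e, 0.5 <= m < 1
    y, k = 2.0 * m, e - 1
    # F = 1.xxxxxx1 in binary: six fraction bits of Y, then a forced 1.
    j = int((y - 1.0) * 64.0)         # 0..63, selects one of 64 values of F
    f = 1.0 + j / 64.0 + 1.0 / 128.0
    # Step 3: u = (Y-F)/F, so |u| <= 2**-7 and a short polynomial suffices.
    u = (y - f) / f
    poly = u * (1 + u * (-1 / 2 + u * (1 / 3 + u * (-1 / 4
               + u * (1 / 5 + u * (-1 / 6 + u / 7))))))
    # Step 4: reconstruct as k*log(2) + (log(F) + poly).
    return k * math.log(2.0) + (math.log(f) + poly)
```

Because |u| never exceeds 2**-7 in the main path, the first omitted Taylor term is about 2**-59, which is why a degree-7 polynomial is enough here.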
! 74: * lognp1:
! 75: * Step 1: If |X| < 1/16, approximate log(1+X) by an odd polynomial in
! 76: * u where u = 2X/(2+X). Otherwise, move on to Step 2.
! 77: *
! 78: * Step 2: Let 1+X = 2**k * Y, where 1 <= Y < 2. Define F as done in Step 2
! 79: * of the algorithm for LOGN and compute log(1+X) as
! 80: * k*log(2) + log(F) + poly where poly approximates log(1+u),
! 81: * u = (Y-F)/F.
! 82: *
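A short sketch of lognp1 Step 1 shows why log(1+X) is computed from X directly instead of first forming 1+X. The function name `lognp1_near_zero` and the Taylor coefficients are assumptions made for illustration; the package uses its own tuned constants.

```python
import math

def lognp1_near_zero(x):
    """log(1+x) for |x| < 1/16 via u = 2x/(2+x), without forming 1+x."""
    u = 2.0 * x / (2.0 + x)
    v = u * u
    # log(1+x) = 2*atanh(u/2) = u + u**3/12 + u**5/80 + ...
    return u + u * v * (1 / 12 + v * (1 / 80 + v * (1 / 448 + v / 2304)))

tiny = 1e-17
naive = math.log(1.0 + tiny)      # 1.0 + tiny rounds to 1.0, so x is lost
careful = lognp1_near_zero(tiny)  # keeps the leading term of the series
```

For a tiny argument the naive form returns zero because the addition absorbs X, while the u-based form still returns a value close to X.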
! 83: * Implementation Notes:
! 84: * Note 1. There are 64 different possible values for F, thus 64 log(F)'s
! 85: * need to be tabulated. Moreover, the values of 1/F are also
! 86: * tabulated so that the division in (Y-F)/F can be performed by a
! 87: * multiplication.
! 88: *
! 89: * Note 2. In Step 2 of lognp1, in order to preserve accuracy, the value
! 90: * Y-F has to be calculated carefully when 1/2 <= X < 3/2.
! 91: *
! 92: * Note 3. To fully exploit the pipeline, polynomials are usually separated
! 93: * into two parts evaluated independently before being added up.
! 94: *
! 95:
! 96: slogn IDNT 2,1 Motorola 040 Floating Point Software Package
! 97:
! 98: section 8
! 99:
! 100: include fpsp.h
! 101:
! 102: BOUNDS1 DC.L $3FFEF07D,$3FFF8841
! 103: BOUNDS2 DC.L $3FFE8000,$3FFFC000
! 104:
! 105: LOGOF2 DC.L $3FFE0000,$B17217F7,$D1CF79AC,$00000000
! 106:
! 107: one DC.L $3F800000
! 108: zero DC.L $00000000
! 109: infty DC.L $7F800000
! 110: negone DC.L $BF800000
! 111:
! 112: LOGA6 DC.L $3FC2499A,$B5E4040B
! 113: LOGA5 DC.L $BFC555B5,$848CB7DB
! 114:
! 115: LOGA4 DC.L $3FC99999,$987D8730
! 116: LOGA3 DC.L $BFCFFFFF,$FF6F7E97
! 117:
! 118: LOGA2 DC.L $3FD55555,$555555A4
! 119: LOGA1 DC.L $BFE00000,$00000008
! 120:
! 121: LOGB5 DC.L $3F175496,$ADD7DAD6
! 122: LOGB4 DC.L $3F3C71C2,$FE80C7E0
! 123:
! 124: LOGB3 DC.L $3F624924,$928BCCFF
! 125: LOGB2 DC.L $3F899999,$999995EC
! 126:
! 127: LOGB1 DC.L $3FB55555,$55555555
! 128: TWO DC.L $40000000,$00000000
! 129:
! 130: LTHOLD DC.L $3f990000,$80000000,$00000000,$00000000
! 131:
! 132: LOGTBL:
! 133: DC.L $3FFE0000,$FE03F80F,$E03F80FE,$00000000
! 134: DC.L $3FF70000,$FF015358,$833C47E2,$00000000
! 135: DC.L $3FFE0000,$FA232CF2,$52138AC0,$00000000
! 136: DC.L $3FF90000,$BDC8D83E,$AD88D549,$00000000
! 137: DC.L $3FFE0000,$F6603D98,$0F6603DA,$00000000
! 138: DC.L $3FFA0000,$9CF43DCF,$F5EAFD48,$00000000
! 139: DC.L $3FFE0000,$F2B9D648,$0F2B9D65,$00000000
! 140: DC.L $3FFA0000,$DA16EB88,$CB8DF614,$00000000
! 141: DC.L $3FFE0000,$EF2EB71F,$C4345238,$00000000
! 142: DC.L $3FFB0000,$8B29B775,$1BD70743,$00000000
! 143: DC.L $3FFE0000,$EBBDB2A5,$C1619C8C,$00000000
! 144: DC.L $3FFB0000,$A8D839F8,$30C1FB49,$00000000
! 145: DC.L $3FFE0000,$E865AC7B,$7603A197,$00000000
! 146: DC.L $3FFB0000,$C61A2EB1,$8CD907AD,$00000000
! 147: DC.L $3FFE0000,$E525982A,$F70C880E,$00000000
! 148: DC.L $3FFB0000,$E2F2A47A,$DE3A18AF,$00000000
! 149: DC.L $3FFE0000,$E1FC780E,$1FC780E2,$00000000
! 150: DC.L $3FFB0000,$FF64898E,$DF55D551,$00000000
! 151: DC.L $3FFE0000,$DEE95C4C,$A037BA57,$00000000
! 152: DC.L $3FFC0000,$8DB956A9,$7B3D0148,$00000000
! 153: DC.L $3FFE0000,$DBEB61EE,$D19C5958,$00000000
! 154: DC.L $3FFC0000,$9B8FE100,$F47BA1DE,$00000000
! 155: DC.L $3FFE0000,$D901B203,$6406C80E,$00000000
! 156: DC.L $3FFC0000,$A9372F1D,$0DA1BD17,$00000000
! 157: DC.L $3FFE0000,$D62B80D6,$2B80D62C,$00000000
! 158: DC.L $3FFC0000,$B6B07F38,$CE90E46B,$00000000
! 159: DC.L $3FFE0000,$D3680D36,$80D3680D,$00000000
! 160: DC.L $3FFC0000,$C3FD0329,$06488481,$00000000
! 161: DC.L $3FFE0000,$D0B69FCB,$D2580D0B,$00000000
! 162: DC.L $3FFC0000,$D11DE0FF,$15AB18CA,$00000000
! 163: DC.L $3FFE0000,$CE168A77,$25080CE1,$00000000
! 164: DC.L $3FFC0000,$DE1433A1,$6C66B150,$00000000
! 165: DC.L $3FFE0000,$CB8727C0,$65C393E0,$00000000
! 166: DC.L $3FFC0000,$EAE10B5A,$7DDC8ADD,$00000000
! 167: DC.L $3FFE0000,$C907DA4E,$871146AD,$00000000
! 168: DC.L $3FFC0000,$F7856E5E,$E2C9B291,$00000000
! 169: DC.L $3FFE0000,$C6980C69,$80C6980C,$00000000
! 170: DC.L $3FFD0000,$82012CA5,$A68206D7,$00000000
! 171: DC.L $3FFE0000,$C4372F85,$5D824CA6,$00000000
! 172: DC.L $3FFD0000,$882C5FCD,$7256A8C5,$00000000
! 173: DC.L $3FFE0000,$C1E4BBD5,$95F6E947,$00000000
! 174: DC.L $3FFD0000,$8E44C60B,$4CCFD7DE,$00000000
! 175: DC.L $3FFE0000,$BFA02FE8,$0BFA02FF,$00000000
! 176: DC.L $3FFD0000,$944AD09E,$F4351AF6,$00000000
! 177: DC.L $3FFE0000,$BD691047,$07661AA3,$00000000
! 178: DC.L $3FFD0000,$9A3EECD4,$C3EAA6B2,$00000000
! 179: DC.L $3FFE0000,$BB3EE721,$A54D880C,$00000000
! 180: DC.L $3FFD0000,$A0218434,$353F1DE8,$00000000
! 181: DC.L $3FFE0000,$B92143FA,$36F5E02E,$00000000
! 182: DC.L $3FFD0000,$A5F2FCAB,$BBC506DA,$00000000
! 183: DC.L $3FFE0000,$B70FBB5A,$19BE3659,$00000000
! 184: DC.L $3FFD0000,$ABB3B8BA,$2AD362A5,$00000000
! 185: DC.L $3FFE0000,$B509E68A,$9B94821F,$00000000
! 186: DC.L $3FFD0000,$B1641795,$CE3CA97B,$00000000
! 187: DC.L $3FFE0000,$B30F6352,$8917C80B,$00000000
! 188: DC.L $3FFD0000,$B7047551,$5D0F1C61,$00000000
! 189: DC.L $3FFE0000,$B11FD3B8,$0B11FD3C,$00000000
! 190: DC.L $3FFD0000,$BC952AFE,$EA3D13E1,$00000000
! 191: DC.L $3FFE0000,$AF3ADDC6,$80AF3ADE,$00000000
! 192: DC.L $3FFD0000,$C2168ED0,$F458BA4A,$00000000
! 193: DC.L $3FFE0000,$AD602B58,$0AD602B6,$00000000
! 194: DC.L $3FFD0000,$C788F439,$B3163BF1,$00000000
! 195: DC.L $3FFE0000,$AB8F69E2,$8359CD11,$00000000
! 196: DC.L $3FFD0000,$CCECAC08,$BF04565D,$00000000
! 197: DC.L $3FFE0000,$A9C84A47,$A07F5638,$00000000
! 198: DC.L $3FFD0000,$D2420487,$2DD85160,$00000000
! 199: DC.L $3FFE0000,$A80A80A8,$0A80A80B,$00000000
! 200: DC.L $3FFD0000,$D7894992,$3BC3588A,$00000000
! 201: DC.L $3FFE0000,$A655C439,$2D7B73A8,$00000000
! 202: DC.L $3FFD0000,$DCC2C4B4,$9887DACC,$00000000
! 203: DC.L $3FFE0000,$A4A9CF1D,$96833751,$00000000
! 204: DC.L $3FFD0000,$E1EEBD3E,$6D6A6B9E,$00000000
! 205: DC.L $3FFE0000,$A3065E3F,$AE7CD0E0,$00000000
! 206: DC.L $3FFD0000,$E70D785C,$2F9F5BDC,$00000000
! 207: DC.L $3FFE0000,$A16B312E,$A8FC377D,$00000000
! 208: DC.L $3FFD0000,$EC1F392C,$5179F283,$00000000
! 209: DC.L $3FFE0000,$9FD809FD,$809FD80A,$00000000
! 210: DC.L $3FFD0000,$F12440D3,$E36130E6,$00000000
! 211: DC.L $3FFE0000,$9E4CAD23,$DD5F3A20,$00000000
! 212: DC.L $3FFD0000,$F61CCE92,$346600BB,$00000000
! 213: DC.L $3FFE0000,$9CC8E160,$C3FB19B9,$00000000
! 214: DC.L $3FFD0000,$FB091FD3,$8145630A,$00000000
! 215: DC.L $3FFE0000,$9B4C6F9E,$F03A3CAA,$00000000
! 216: DC.L $3FFD0000,$FFE97042,$BFA4C2AD,$00000000
! 217: DC.L $3FFE0000,$99D722DA,$BDE58F06,$00000000
! 218: DC.L $3FFE0000,$825EFCED,$49369330,$00000000
! 219: DC.L $3FFE0000,$9868C809,$868C8098,$00000000
! 220: DC.L $3FFE0000,$84C37A7A,$B9A905C9,$00000000
! 221: DC.L $3FFE0000,$97012E02,$5C04B809,$00000000
! 222: DC.L $3FFE0000,$87224C2E,$8E645FB7,$00000000
! 223: DC.L $3FFE0000,$95A02568,$095A0257,$00000000
! 224: DC.L $3FFE0000,$897B8CAC,$9F7DE298,$00000000
! 225: DC.L $3FFE0000,$94458094,$45809446,$00000000
! 226: DC.L $3FFE0000,$8BCF55DE,$C4CD05FE,$00000000
! 227: DC.L $3FFE0000,$92F11384,$0497889C,$00000000
! 228: DC.L $3FFE0000,$8E1DC0FB,$89E125E5,$00000000
! 229: DC.L $3FFE0000,$91A2B3C4,$D5E6F809,$00000000
! 230: DC.L $3FFE0000,$9066E68C,$955B6C9B,$00000000
! 231: DC.L $3FFE0000,$905A3863,$3E06C43B,$00000000
! 232: DC.L $3FFE0000,$92AADE74,$C7BE59E0,$00000000
! 233: DC.L $3FFE0000,$8F1779D9,$FDC3A219,$00000000
! 234: DC.L $3FFE0000,$94E9BFF6,$15845643,$00000000
! 235: DC.L $3FFE0000,$8DDA5202,$37694809,$00000000
! 236: DC.L $3FFE0000,$9723A1B7,$20134203,$00000000
! 237: DC.L $3FFE0000,$8CA29C04,$6514E023,$00000000
! 238: DC.L $3FFE0000,$995899C8,$90EB8990,$00000000
! 239: DC.L $3FFE0000,$8B70344A,$139BC75A,$00000000
! 240: DC.L $3FFE0000,$9B88BDAA,$3A3DAE2F,$00000000
! 241: DC.L $3FFE0000,$8A42F870,$5669DB46,$00000000
! 242: DC.L $3FFE0000,$9DB4224F,$FFE1157C,$00000000
! 243: DC.L $3FFE0000,$891AC73A,$E9819B50,$00000000
! 244: DC.L $3FFE0000,$9FDADC26,$8B7A12DA,$00000000
! 245: DC.L $3FFE0000,$87F78087,$F78087F8,$00000000
! 246: DC.L $3FFE0000,$A1FCFF17,$CE733BD4,$00000000
! 247: DC.L $3FFE0000,$86D90544,$7A34ACC6,$00000000
! 248: DC.L $3FFE0000,$A41A9E8F,$5446FB9F,$00000000
! 249: DC.L $3FFE0000,$85BF3761,$2CEE3C9B,$00000000
! 250: DC.L $3FFE0000,$A633CD7E,$6771CD8B,$00000000
! 251: DC.L $3FFE0000,$84A9F9C8,$084A9F9D,$00000000
! 252: DC.L $3FFE0000,$A8489E60,$0B435A5E,$00000000
! 253: DC.L $3FFE0000,$83993052,$3FBE3368,$00000000
! 254: DC.L $3FFE0000,$AA59233C,$CCA4BD49,$00000000
! 255: DC.L $3FFE0000,$828CBFBE,$B9A020A3,$00000000
! 256: DC.L $3FFE0000,$AC656DAE,$6BCC4985,$00000000
! 257: DC.L $3FFE0000,$81848DA8,$FAF0D277,$00000000
! 258: DC.L $3FFE0000,$AE6D8EE3,$60BB2468,$00000000
! 259: DC.L $3FFE0000,$80808080,$80808081,$00000000
! 260: DC.L $3FFE0000,$B07197A2,$3C46C654,$00000000
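The table above alternates 1/F and log(F) entries, each stored as an extended-precision value in three longwords plus a zero pad. The following sketch decodes the first and last pairs and compares them with the values implied by F = 1.xxxxxx1; the helper name `ext_to_fraction` is hypothetical, and the layout assumed is sign and 15-bit biased exponent in the top word followed by a 64-bit mantissa with an explicit integer bit.

```python
import math
from fractions import Fraction

def ext_to_fraction(w0, hi, lo):
    """Decode an extended-precision constant given as three DC.L longwords."""
    sign = -1 if (w0 >> 31) & 1 else 1
    exp = (w0 >> 16) & 0x7FFF          # biased exponent, bias 0x3FFF
    mant = (hi << 32) | lo             # explicit integer bit is bit 63
    return sign * Fraction(mant, 1 << 63) * Fraction(2) ** (exp - 0x3FFF)

# First pair: F = 1.0000001b = 129/128
first_inv_f = ext_to_fraction(0x3FFE0000, 0xFE03F80F, 0xE03F80FE)
first_log_f = ext_to_fraction(0x3FF70000, 0xFF015358, 0x833C47E2)

# Last pair: F = 1.1111111b = 255/128
last_inv_f = ext_to_fraction(0x3FFE0000, 0x80808080, 0x80808081)
last_log_f = ext_to_fraction(0x3FFE0000, 0xB07197A2, 0x3C46C654)
```

Using `Fraction` keeps all 64 mantissa bits, so the 1/F entries can be checked against the exact rationals 128/129 and 128/255 rather than against rounded doubles.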
! 261:
! 262: ADJK equ L_SCR1
! 263:
! 264: X equ FP_SCR1
! 265: XDCARE equ X+2
! 266: XFRAC equ X+4
! 267:
! 268: F equ FP_SCR2
! 269: FFRAC equ F+4
! 270:
! 271: KLOG2 equ FP_SCR3
! 272:
! 273: SAVEU equ FP_SCR4
! 274:
! 275: xref t_frcinx
! 276: xref t_extdnrm
! 277: xref t_operr
! 278: xref t_dz
! 279:
! 280: xdef slognd
! 281: slognd:
! 282: *--ENTRY POINT FOR LOG(X) FOR DENORMALIZED INPUT
! 283:
! 284: MOVE.L #-100,ADJK(a6) ...INPUT = 2^(ADJK) * FP0
! 285:
! 286: *----normalize the input value by left shifting k bits (k to be determined
! 287: *----below), adjusting exponent and storing -k to ADJK
! 288: *----the value TWOTO100 is no longer needed.
! 289: *----Note that this code assumes the denormalized input is NON-ZERO.
! 290:
! 291: MoveM.L D2-D7,-(A7) ...save some registers
! 292: Clr.L D3 ...D3 is exponent of smallest norm. #
! 293: Move.L 4(A0),D4
! 294: Move.L 8(A0),D5 ...(D4,D5) is (Hi_X,Lo_X)
! 295: Clr.L D2 ...D2 used for holding K
! 296:
! 297: Tst.L D4
! 298: BNE.B HiX_not0
! 299:
! 300: HiX_0:
! 301: Move.L D5,D4
! 302: Clr.L D5
! 303: Move.L #32,D2
! 304: Clr.L D6
! 305: BFFFO D4{0:32},D6
! 306: LSL.L D6,D4
! 307: Add.L D6,D2 ...(D3,D4,D5) is normalized
! 308:
! 309: Move.L D3,X(a6)
! 310: Move.L D4,XFRAC(a6)
! 311: Move.L D5,XFRAC+4(a6)
! 312: Neg.L D2
! 313: Move.L D2,ADJK(a6)
! 314: FMove.X X(a6),FP0
! 315: MoveM.L (A7)+,D2-D7 ...restore registers
! 316: LEA X(a6),A0
! 317: Bra.B LOGBGN ...begin regular log(X)
! 318:
! 319:
! 320: HiX_not0:
! 321: Clr.L D6
! 322: BFFFO D4{0:32},D6 ...find first 1
! 323: Move.L D6,D2 ...get k
! 324: LSL.L D6,D4
! 325: Move.L D5,D7 ...a copy of D5
! 326: LSL.L D6,D5
! 327: Neg.L D6
! 328: AddI.L #32,D6
! 329: LSR.L D6,D7
! 330: Or.L D7,D4 ...(D3,D4,D5) normalized
! 331:
! 332: Move.L D3,X(a6)
! 333: Move.L D4,XFRAC(a6)
! 334: Move.L D5,XFRAC+4(a6)
! 335: Neg.L D2
! 336: Move.L D2,ADJK(a6)
! 337: FMove.X X(a6),FP0
! 338: MoveM.L (A7)+,D2-D7 ...restore registers
! 339: LEA X(a6),A0
! 340: Bra.B LOGBGN ...begin regular log(X)
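The two normalization paths above (HiX_0 and HiX_not0) both shift the 64-bit mantissa left until its top bit is set and record the negated shift count in ADJK. A compact sketch of that effect, treating the mantissa as one Python integer, might look like this; the function name `normalize_mantissa` is hypothetical, and `int.bit_length` plays the role that BFFFO plays in the assembly.

```python
def normalize_mantissa(hi, lo):
    """Left-justify a non-zero 64-bit mantissa; return (hi, lo, adjk)."""
    mant = (hi << 32) | lo
    if mant == 0:
        raise ValueError("the routine assumes a non-zero denormalized input")
    k = 64 - mant.bit_length()         # number of leading zero bits
    mant <<= k                         # bit 63 is now set
    return (mant >> 32) & 0xFFFFFFFF, mant & 0xFFFFFFFF, -k
```

The first two return values correspond to what the assembly stores at XFRAC and XFRAC+4, and the third to what it stores in ADJK.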
! 341:
! 342:
! 343: xdef slogn
! 344: slogn:
! 345: *--ENTRY POINT FOR LOG(X) FOR X FINITE, NON-ZERO, NOT NAN'S
! 346:
! 347: FMOVE.X (A0),FP0 ...LOAD INPUT
! 348: CLR.L ADJK(a6)
! 349:
! 350: LOGBGN:
! 351: *--FPCR SAVED AND CLEARED, INPUT IS 2^(ADJK)*FP0, FP0 CONTAINS
! 352: *--A FINITE, NON-ZERO, NORMALIZED NUMBER.
! 353:
! 354: move.l (a0),d0
! 355: move.w 4(a0),d0
! 356:
! 357: move.l (a0),X(a6)
! 358: move.l 4(a0),X+4(a6)
! 359: move.l 8(a0),X+8(a6)
! 360:
! 361: TST.L D0 ...CHECK IF X IS NEGATIVE
! 362: BLT.W LOGNEG ...LOG OF NEGATIVE ARGUMENT IS INVALID
! 363: CMP2.L BOUNDS1,D0 ...X IS POSITIVE, CHECK IF X IS NEAR 1
! 364: BCC.W LOGNEAR1 ...BOUNDS1 IS ROUGHLY [15/16, 17/16]
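The range test above works on a "compact form" of X held in D0: the sign and biased exponent in the upper word and the top 16 mantissa bits in the lower word. The sketch below builds that form from a positive Python float and applies the same inclusive bounds check; `compact` and `in_bounds` are hypothetical helper names, and only positive normal inputs are handled.

```python
import math

BOUNDS1 = (0x3FFEF07D, 0x3FFF8841)   # roughly [15/16, 17/16]
BOUNDS2 = (0x3FFE8000, 0x3FFFC000)   # [1/2, 3/2]

def compact(x):
    """Biased extended exponent in the high word, top 16 mantissa bits low."""
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= m < 1
    return ((e - 1 + 0x3FFF) << 16) | int(m * 65536.0)

def in_bounds(c, bounds):
    """Inclusive range check, the condition under which CMP2 leaves carry clear."""
    return bounds[0] <= c <= bounds[1]
```

Because the exponent sits above the mantissa bits, ordinary integer comparison of two compact forms orders positive values the same way as comparing the numbers themselves.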
! 365:
! 366: LOGMAIN:
! 367: *--THIS SHOULD BE THE USUAL CASE, X NOT VERY CLOSE TO 1
! 368:
! 369: *--X = 2^(K) * Y, 1 <= Y < 2. THUS, Y = 1.XXXXXXXX....XX IN BINARY.
! 370: *--WE DEFINE F = 1.XXXXXX1, I.E. FIRST 7 BITS OF Y AND ATTACH A 1.
! 371: *--THE IDEA IS THAT LOG(X) = K*LOG2 + LOG(Y)
! 372: *-- = K*LOG2 + LOG(F) + LOG(1 + (Y-F)/F).
! 373: *--NOTE THAT U = (Y-F)/F IS VERY SMALL AND THUS APPROXIMATING
! 374: *--LOG(1+U) CAN BE VERY EFFICIENT.
! 375: *--ALSO NOTE THAT THE VALUE 1/F IS STORED IN A TABLE SO THAT NO
! 376: *--DIVISION IS NEEDED TO CALCULATE (Y-F)/F.
! 377:
! 378: *--GET K, Y, F, AND ADDRESS OF 1/F.
! 379: ASR.L #8,D0
! 380: ASR.L #8,D0 ...SHIFTED 16 BITS, BIASED EXPO. OF X
! 381: SUBI.L #$3FFF,D0 ...THIS IS K
! 382: ADD.L ADJK(a6),D0 ...ADJUST K, ORIGINAL INPUT MAY BE DENORM.
! 383: LEA LOGTBL,A0 ...BASE ADDRESS OF 1/F AND LOG(F)
! 384: FMOVE.L D0,FP1 ...CONVERT K TO FLOATING-POINT FORMAT
! 385:
! 386: *--WHILE THE CONVERSION IS GOING ON, WE GET F AND ADDRESS OF 1/F
! 387: MOVE.L #$3FFF0000,X(a6) ...X IS NOW Y, I.E. 2^(-K)*X
! 388: MOVE.L XFRAC(a6),FFRAC(a6)
! 389: ANDI.L #$FE000000,FFRAC(a6) ...FIRST 7 BITS OF Y
! 390: ORI.L #$01000000,FFRAC(a6) ...GET F: ATTACH A 1 AT THE EIGHTH BIT
! 391: MOVE.L FFRAC(a6),D0 ...READY TO GET ADDRESS OF 1/F
! 392: ANDI.L #$7E000000,D0
! 393: ASR.L #8,D0
! 394: ASR.L #8,D0
! 395: ASR.L #4,D0 ...SHIFTED 20, D0 IS THE DISPLACEMENT
! 396: ADDA.L D0,A0 ...A0 IS THE ADDRESS FOR 1/F
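The mask-and-shift sequence above can be mirrored on a plain integer to see how F and the table displacement come out of the high mantissa longword. This is a sketch with a hypothetical function name, `f_and_displacement`; it assumes the input longword has its integer bit (bit 31) set, as it does for a normalized Y.

```python
def f_and_displacement(xfrac_hi):
    """Mirror the ANDI/ORI/ASR sequence on the high mantissa longword of Y."""
    # Keep the integer bit and six fraction bits, then force the next bit to 1.
    ffrac = (xfrac_hi & 0xFE000000) | 0x01000000     # F = 1.xxxxxx1
    # Drop the integer bit; the six x bits end up multiplied by 32, which is
    # the size of one (1/F, log(F)) pair of 16-byte entries in LOGTBL.
    disp = (ffrac & 0x7E000000) >> 20
    return ffrac, disp
```

After the mask clears bit 31, the arithmetic shifts in the assembly behave like logical shifts, so a plain `>>` is an adequate stand-in. The log(F) entry sits 16 bytes past the 1/F entry at the computed displacement.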
! 397:
! 398: FMOVE.X X(a6),FP0
! 399: move.l #$3fff0000,F(a6)
! 400: clr.l F+8(a6)
! 401: FSUB.X F(a6),FP0 ...Y-F
! 402: FMOVEm.X FP2/fp3,-(sp) ...SAVE FP2 WHILE FP0 IS NOT READY
! 403: *--SUMMARY: FP0 IS Y-F, A0 IS ADDRESS OF 1/F, FP1 IS K
! 404: *--REGISTERS SAVED: FPCR, FP1, FP2, FP3
! 405:
! 406: LP1CONT1:
! 407: *--A RE-ENTRY POINT FOR LOGNP1
! 408: FMUL.X (A0),FP0 ...FP0 IS U = (Y-F)/F
! 409: FMUL.X LOGOF2,FP1 ...GET K*LOG2 WHILE FP0 IS NOT READY
! 410: FMOVE.X FP0,FP2
! 411: FMUL.X FP2,FP2 ...FP2 IS V=U*U
! 412: FMOVE.X FP1,KLOG2(a6) ...PUT K*LOG2 IN MEMORY, FREE FP1
! 413:
! 414: *--LOG(1+U) IS APPROXIMATED BY
! 415: *--U + V*(A1+U*(A2+U*(A3+U*(A4+U*(A5+U*A6))))) WHICH IS
! 416: *--[U + V*(A1+V*(A3+V*A5))] + [U*V*(A2+V*(A4+V*A6))]
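The regrouping described in the comment above can be checked numerically. The sketch below reassembles the LOGA1 to LOGA6 doubles from their DC.L longwords and evaluates both the nested form and the two-part form; the helper names (`dbl`, `log1p_nested`, `log1p_split`) are hypothetical, and the arithmetic is ordinary double precision, not the extended precision the FPU uses.

```python
import math
import struct

def dbl(hi, lo):
    """Reassemble an IEEE double from the two longwords of a DC.L pair."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]

A1 = dbl(0xBFE00000, 0x00000008)   # close to -1/2
A2 = dbl(0x3FD55555, 0x555555A4)   # close to  1/3
A3 = dbl(0xBFCFFFFF, 0xFF6F7E97)   # close to -1/4
A4 = dbl(0x3FC99999, 0x987D8730)   # close to  1/5
A5 = dbl(0xBFC555B5, 0x848CB7DB)   # close to -1/6
A6 = dbl(0x3FC2499A, 0xB5E4040B)   # close to  1/7

def log1p_nested(u):
    """U + V*(A1+U*(A2+U*(A3+U*(A4+U*(A5+U*A6))))) with V = U*U."""
    v = u * u
    return u + v * (A1 + u * (A2 + u * (A3 + u * (A4 + u * (A5 + u * A6)))))

def log1p_split(u):
    """Same polynomial as two independent chains in V, then one final add."""
    v = u * u
    first = u + v * (A1 + v * (A3 + v * A5))
    second = u * v * (A2 + v * (A4 + v * A6))
    return first + second
```

The split form has two shorter dependency chains, which is what lets the assembly interleave the FP1 and FP2 computations.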
! 417:
! 418: FMOVE.X FP2,FP3
! 419: FMOVE.X FP2,FP1
! 420:
! 421: FMUL.D LOGA6,FP1 ...V*A6
! 422: FMUL.D LOGA5,FP2 ...V*A5
! 423:
! 424: FADD.D LOGA4,FP1 ...A4+V*A6
! 425: FADD.D LOGA3,FP2 ...A3+V*A5
! 426:
! 427: FMUL.X FP3,FP1 ...V*(A4+V*A6)
! 428: FMUL.X FP3,FP2 ...V*(A3+V*A5)
! 429:
! 430: FADD.D LOGA2,FP1 ...A2+V*(A4+V*A6)
! 431: FADD.D LOGA1,FP2 ...A1+V*(A3+V*A5)
! 432:
! 433: FMUL.X FP3,FP1 ...V*(A2+V*(A4+V*A6))
! 434: ADDA.L #16,A0 ...ADDRESS OF LOG(F)
! 435: FMUL.X FP3,FP2 ...V*(A1+V*(A3+V*A5)), FP3 RELEASED
! 436:
! 437: FMUL.X FP0,FP1 ...U*V*(A2+V*(A4+V*A6))
! 438: FADD.X FP2,FP0 ...U+V*(A1+V*(A3+V*A5)), FP2 RELEASED
! 439:
! 440: FADD.X (A0),FP1 ...LOG(F)+U*V*(A2+V*(A4+V*A6))
! 441: FMOVEm.X (sp)+,FP2/fp3 ...RESTORE FP2
! 442: FADD.X FP1,FP0 ...FP0 IS LOG(F) + LOG(1+U)
! 443:
! 444: fmove.l d1,fpcr
! 445: FADD.X KLOG2(a6),FP0 ...FINAL ADD
! 446: bra t_frcinx
! 447:
! 448:
! 449: LOGNEAR1:
! 450: *--REGISTERS SAVED: FPCR, FP1. FP0 CONTAINS THE INPUT.
! 451: FMOVE.X FP0,FP1
! 452: FSUB.S one,FP1 ...FP1 IS X-1
! 453: FADD.S one,FP0 ...FP0 IS X+1
! 454: FADD.X FP1,FP1 ...FP1 IS 2(X-1)
! 455: *--LOG(X) = LOG(1+U/2)-LOG(1-U/2) WHICH IS AN ODD POLYNOMIAL
! 456: *--IN U, U = 2(X-1)/(X+1) = FP1/FP0
! 457:
! 458: LP1CONT2:
! 459: *--THIS IS A RE-ENTRY POINT FOR LOGNP1
! 460: FDIV.X FP0,FP1 ...FP1 IS U
! 461: FMOVEm.X FP2/fp3,-(sp) ...SAVE FP2
! 462: *--REGISTERS SAVED ARE NOW FPCR,FP1,FP2,FP3
! 463: *--LET V=U*U, W=V*V, CALCULATE
! 464: *--U + U*V*(B1 + V*(B2 + V*(B3 + V*(B4 + V*B5)))) BY
! 465: *--U + U*V*( [B1 + W*(B3 + W*B5)] + [V*(B2 + W*B4)] )
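As a cross-check of the near-1 polynomial described above, the LOGB1 to LOGB5 doubles can be decoded and compared with the Taylor coefficients of 2*atanh(u/2), then evaluated in the bracketed two-part form. This is an illustrative sketch in double precision; `dbl` and `log_near_one` are hypothetical names, and the Taylor list is only a reference point since the stored constants are tuned slightly away from it.

```python
import math
import struct

def dbl(hi, lo):
    """Reassemble an IEEE double from the two longwords of a DC.L pair."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]

B = [
    dbl(0x3FB55555, 0x55555555),   # B1
    dbl(0x3F899999, 0x999995EC),   # B2
    dbl(0x3F624924, 0x928BCCFF),   # B3
    dbl(0x3F3C71C2, 0xFE80C7E0),   # B4
    dbl(0x3F175496, 0xADD7DAD6),   # B5
]
# Series: 2*atanh(u/2) = u + u**3/12 + u**5/80 + u**7/448 + ...
TAYLOR = [1 / 12, 1 / 80, 1 / 448, 1 / 2304, 1 / 11264]

def log_near_one(x):
    """log(x) for x close to 1 using U + U*V*([B1+W*(B3+W*B5)] + [V*(B2+W*B4)])."""
    u = 2.0 * (x - 1.0) / (x + 1.0)
    v = u * u
    w = v * v
    b1, b2, b3, b4, b5 = B
    first = b1 + w * (b3 + w * b5)
    second = v * (b2 + w * b4)
    return u + u * v * (first + second)
```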
! 466: FMOVE.X FP1,FP0
! 467: FMUL.X FP0,FP0 ...FP0 IS V
! 468: FMOVE.X FP1,SAVEU(a6) ...STORE U IN MEMORY, FREE FP1
! 469: FMOVE.X FP0,FP1
! 470: FMUL.X FP1,FP1 ...FP1 IS W
! 471:
! 472: FMOVE.D LOGB5,FP3
! 473: FMOVE.D LOGB4,FP2
! 474:
! 475: FMUL.X FP1,FP3 ...W*B5
! 476: FMUL.X FP1,FP2 ...W*B4
! 477:
! 478: FADD.D LOGB3,FP3 ...B3+W*B5
! 479: FADD.D LOGB2,FP2 ...B2+W*B4
! 480:
! 481: FMUL.X FP3,FP1 ...W*(B3+W*B5), FP3 RELEASED
! 482:
! 483: FMUL.X FP0,FP2 ...V*(B2+W*B4)
! 484:
! 485: FADD.D LOGB1,FP1 ...B1+W*(B3+W*B5)
! 486: FMUL.X SAVEU(a6),FP0 ...FP0 IS U*V
! 487:
! 488: FADD.X FP2,FP1 ...B1+W*(B3+W*B5) + V*(B2+W*B4), FP2 RELEASED
! 489: FMOVEm.X (sp)+,FP2/fp3 ...FP2 RESTORED
! 490:
! 491: FMUL.X FP1,FP0 ...U*V*( [B1+W*(B3+W*B5)] + [V*(B2+W*B4)] )
! 492:
! 493: fmove.l d1,fpcr
! 494: FADD.X SAVEU(a6),FP0
! 495: bra t_frcinx
! 496: rts
! 497:
! 498: LOGNEG:
! 499: *--REGISTERS SAVED FPCR. LOG(-VE) IS INVALID
! 500: bra t_operr
! 501:
! 502: xdef slognp1d
! 503: slognp1d:
! 504: *--ENTRY POINT FOR LOG(1+Z) FOR DENORMALIZED INPUT
! 505: * Simply return the denorm
! 506:
! 507: bra t_extdnrm
! 508:
! 509: xdef slognp1
! 510: slognp1:
! 511: *--ENTRY POINT FOR LOG(1+X) FOR X FINITE, NON-ZERO, NOT NAN'S
! 512:
! 513: FMOVE.X (A0),FP0 ...LOAD INPUT
! 514: fabs.x fp0 ;test magnitude
! 515: fcmp.x LTHOLD,fp0 ;compare with min threshold
! 516: fbgt.w LP1REAL ;if greater, continue
! 517: fmove.l #0,fpsr ;clr N flag from compare
! 518: fmove.l d1,fpcr
! 519: fmove.x (a0),fp0 ;return signed argument
! 520: bra t_frcinx
! 521:
! 522: LP1REAL:
! 523: FMOVE.X (A0),FP0 ...LOAD INPUT
! 524: CLR.L ADJK(a6)
! 525: FMOVE.X FP0,FP1 ...FP1 IS INPUT Z
! 526: FADD.S one,FP0 ...X := ROUND(1+Z)
! 527: FMOVE.X FP0,X(a6)
! 528: MOVE.W XFRAC(a6),XDCARE(a6)
! 529: MOVE.L X(a6),D0
! 530: TST.L D0
! 531: BLE.W LP1NEG0 ...LOG OF ZERO OR -VE
! 532: CMP2.L BOUNDS2,D0
! 533: BCS.W LOGMAIN ...BOUNDS2 IS [1/2,3/2]
! 534: *--IF 1+Z > 3/2 OR 1+Z < 1/2, THEN X, WHICH IS ROUNDING 1+Z,
! 535: *--CONTAINS AT LEAST 63 BITS OF INFORMATION OF Z. IN THAT CASE,
! 536: *--SIMPLY INVOKE LOG(X) FOR LOG(1+Z).
! 537:
! 538: LP1NEAR1:
! 539: *--NEXT SEE IF EXP(-1/16) < X < EXP(1/16)
! 540: CMP2.L BOUNDS1,D0
! 541: BCS.B LP1CARE
! 542:
! 543: LP1ONE16:
! 544: *--EXP(-1/16) < X < EXP(1/16). LOG(1+Z) = LOG(1+U/2) - LOG(1-U/2)
! 545: *--WHERE U = 2Z/(2+Z) = 2Z/(1+X).
! 546: FADD.X FP1,FP1 ...FP1 IS 2Z
! 547: FADD.S one,FP0 ...FP0 IS 1+X
! 548: *--U = FP1/FP0
! 549: BRA.W LP1CONT2
! 550:
! 551: LP1CARE:
! 552: *--HERE WE USE THE USUAL TABLE DRIVEN APPROACH. CARE HAS TO BE
! 553: *--TAKEN BECAUSE 1+Z CAN HAVE 67 BITS OF INFORMATION AND WE MUST
! 554: *--PRESERVE ALL THE INFORMATION. BECAUSE 1+Z IS IN [1/2,3/2],
! 555: *--THERE ARE ONLY TWO CASES.
! 556: *--CASE 1: 1+Z < 1, THEN K = -1 AND Y-F = (2-F) + 2Z
! 557: *--CASE 2: 1+Z > 1, THEN K = 0 AND Y-F = (1-F) + Z
! 558: *--ON RETURNING TO LP1CONT1, WE MUST HAVE K IN FP1, ADDRESS OF
! 559: *--(1/F) IN A0, Y-F IN FP0, AND FP2 SAVED.
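The reason for the two special cases above can be seen with exact rational arithmetic. In the sketch below, Z is chosen so that 1+Z needs 67 significant bits; rounding 1+Z to 64 bits first and then subtracting F loses the low bit, while forming (1-F)+Z keeps it. The helper `round_sig` and the sample value of Z are assumptions made for this illustration only.

```python
from fractions import Fraction

def round_sig(q, bits=64):
    """Round a Fraction to `bits` significant bits, ties to even."""
    if q == 0:
        return q
    sign = -1 if q < 0 else 1
    a = abs(q)
    # Find e with 2**e <= a < 2**(e+1).
    e = a.numerator.bit_length() - a.denominator.bit_length()
    if Fraction(2) ** e > a:
        e -= 1
    scale = Fraction(2) ** (bits - 1 - e)
    return sign * Fraction(round(a * scale)) / scale

z = Fraction(1, 8) + Fraction(1, 2**66)   # fits in 64 bits on its own
f = Fraction(145, 128)                    # F = 1.0010001b for Y = 1 + Z
exact = (1 + z) - f                       # the Y-F we want (case K = 0)

via_rounded_x = round_sig(1 + z) - f      # X = ROUND(1+Z) drops the 2**-66 bit
via_care = round_sig((1 - f) + z)         # (1-F) is exact, and so is the sum
```

Here `via_care` equals the exact difference while `via_rounded_x` does not, which matches the comment's point that 1+Z can carry more information than a single 64-bit register value.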
! 560:
! 561: MOVE.L XFRAC(a6),FFRAC(a6)
! 562: ANDI.L #$FE000000,FFRAC(a6)
! 563: ORI.L #$01000000,FFRAC(a6) ...F OBTAINED
! 564: CMPI.L #$3FFF8000,D0 ...SEE IF 1+Z > 1
! 565: BGE.B KISZERO
! 566:
! 567: KISNEG1:
! 568: FMOVE.S TWO,FP0
! 569: move.l #$3fff0000,F(a6)
! 570: clr.l F+8(a6)
! 571: FSUB.X F(a6),FP0 ...2-F
! 572: MOVE.L FFRAC(a6),D0
! 573: ANDI.L #$7E000000,D0
! 574: ASR.L #8,D0
! 575: ASR.L #8,D0
! 576: ASR.L #4,D0 ...D0 CONTAINS DISPLACEMENT FOR 1/F
! 577: FADD.X FP1,FP1 ...GET 2Z
! 578: FMOVEm.X FP2/fp3,-(sp) ...SAVE FP2
! 579: FADD.X FP1,FP0 ...FP0 IS Y-F = (2-F)+2Z
! 580: LEA LOGTBL,A0 ...A0 IS ADDRESS OF 1/F
! 581: ADDA.L D0,A0
! 582: FMOVE.S negone,FP1 ...FP1 IS K = -1
! 583: BRA.W LP1CONT1
! 584:
! 585: KISZERO:
! 586: FMOVE.S one,FP0
! 587: move.l #$3fff0000,F(a6)
! 588: clr.l F+8(a6)
! 589: FSUB.X F(a6),FP0 ...1-F
! 590: MOVE.L FFRAC(a6),D0
! 591: ANDI.L #$7E000000,D0
! 592: ASR.L #8,D0
! 593: ASR.L #8,D0
! 594: ASR.L #4,D0
! 595: FADD.X FP1,FP0 ...FP0 IS Y-F
! 596: FMOVEm.X FP2/fp3,-(sp) ...FP2 SAVED
! 597: LEA LOGTBL,A0
! 598: ADDA.L D0,A0 ...A0 IS ADDRESS OF 1/F
! 599: FMOVE.S zero,FP1 ...FP1 IS K = 0
! 600: BRA.W LP1CONT1
! 601:
! 602: LP1NEG0:
! 603: *--FPCR SAVED. D0 IS X IN COMPACT FORM.
! 604: TST.L D0
! 605: BLT.B LP1NEG
! 606: LP1ZERO:
! 607: FMOVE.S negone,FP0
! 608:
! 609: fmove.l d1,fpcr
! 610: bra t_dz
! 611:
! 612: LP1NEG:
! 613: FMOVE.S zero,FP0
! 614:
! 615: fmove.l d1,fpcr
! 616: bra t_operr
! 617:
! 618: end