Annotation of sys/arch/m68k/fpsp/bindec.sa, Revision 1.1
1.1 ! nbrk 1: * $OpenBSD: bindec.sa,v 1.3 2001/09/20 17:02:30 mpech Exp $
! 2: * $NetBSD: bindec.sa,v 1.3 1994/10/26 07:48:51 cgd Exp $
! 3:
! 4: * MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
! 5: * M68000 Hi-Performance Microprocessor Division
! 6: * M68040 Software Package
! 7: *
! 8: * M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
! 9: * All rights reserved.
! 10: *
! 11: * THE SOFTWARE is provided on an "AS IS" basis and without warranty.
! 12: * To the maximum extent permitted by applicable law,
! 13: * MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
! 14: * INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
! 15: * PARTICULAR PURPOSE and any warranty against infringement with
! 16: * regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
! 17: * and any accompanying written materials.
! 18: *
! 19: * To the maximum extent permitted by applicable law,
! 20: * IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
! 21: * (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
! 22: * PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
! 23: * OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
! 24: * SOFTWARE. Motorola assumes no responsibility for the maintenance
! 25: * and support of the SOFTWARE.
! 26: *
! 27: * You are hereby granted a copyright license to use, modify, and
! 28: * distribute the SOFTWARE so long as this entire notice is retained
! 29: * without alteration in any modified and/or redistributed versions,
! 30: * and that such modified versions are clearly identified as such.
! 31: * No licenses are granted by implication, estoppel or otherwise
! 32: * under any patents or trademarks of Motorola, Inc.
! 33:
! 34: *
! 35: * bindec.sa 3.4 1/3/91
! 36: *
! 37: * bindec
! 38: *
! 39: * Description:
! 40: * Converts an input in extended precision format
! 41: * to bcd format.
! 42: *
! 43: * Input:
! 44: * a0 points to the input extended precision value
! 45: * value in memory; d0 contains the k-factor sign-extended
! 46: * to 32-bits. The input may be either normalized,
! 47: * unnormalized, or denormalized.
! 48: *
! 49: * Output: result in the FP_SCR1 space on the stack.
! 50: *
! 51: * Saves and Modifies: D2-D7,A2,FP2
! 52: *
! 53: * Algorithm:
! 54: *
! 55: * A1. Set RM and size ext; Set SIGMA = sign of input.
! 56: * The k-factor is saved for use in d7. Clear the
! 57: * BINDEC_FLG for separating normalized/denormalized
! 58: * input. If input is unnormalized or denormalized,
! 59: * normalize it.
! 60: *
! 61: * A2. Set X = abs(input).
! 62: *
! 63: * A3. Compute ILOG.
! 64: * ILOG is the log base 10 of the input value. It is
! 65: * approximated by adding e + 0.f when the original
! 66: * value is viewed as 2^^e * 1.f in extended precision.
! 67: * This value is stored in d6.
! 68: *
! 69: * A4. Clr INEX bit.
! 70: * The operation in A3 above may have set INEX2.
! 71: *
! 72: * A5. Set ICTR = 0;
! 73: * ICTR is a flag used in A13. It must be set before the
! 74: * loop entry A6.
! 75: *
! 76: * A6. Calculate LEN.
! 77: * LEN is the number of digits to be displayed. The
! 78: * k-factor can dictate either the total number of digits,
! 79: * if it is a positive number, or the number of digits
! 80: * after the decimal point which are to be included as
! 81: * significant. See the 68882 manual for examples.
! 82: * If LEN is computed to be greater than 17, set OPERR in
! 83: * USER_FPSR. LEN is stored in d4.
! 84: *
! 85: * A7. Calculate SCALE.
! 86: * SCALE is equal to 10^ISCALE, where ISCALE is the number
! 87: * of decimal places needed to insure LEN integer digits
! 88: * in the output before conversion to bcd. LAMBDA is the
! 89: * sign of ISCALE, used in A9. Fp1 contains
! 90: * 10^^(abs(ISCALE)) using a rounding mode which is a
! 91: * function of the original rounding mode and the signs
! 92: * of ISCALE and X. A table is given in the code.
! 93: *
! 94: * A8. Clr INEX; Force RZ.
! 95: * The operation in A3 above may have set INEX2.
! 96: * RZ mode is forced for the scaling operation to insure
! 97: * only one rounding error. The grs bits are collected in
! 98: * the INEX flag for use in A10.
! 99: *
! 100: * A9. Scale X -> Y.
! 101: * The mantissa is scaled to the desired number of
! 102: * significant digits. The excess digits are collected
! 103: * in INEX2.
! 104: *
! 105: * A10. Or in INEX.
! 106: * If INEX is set, round error occurred. This is
! 107: * compensated for by 'or-ing' in the INEX2 flag to
! 108: * the lsb of Y.
! 109: *
! 110: * A11. Restore original FPCR; set size ext.
! 111: * Perform FINT operation in the user's rounding mode.
! 112: * Keep the size to extended.
! 113: *
! 114: * A12. Calculate YINT = FINT(Y) according to user's rounding
! 115: * mode. The FPSP routine sintd0 is used. The output
! 116: * is in fp0.
! 117: *
! 118: * A13. Check for LEN digits.
! 119: * If the int operation results in more than LEN digits,
! 120: * or less than LEN -1 digits, adjust ILOG and repeat from
! 121: * A6. This test occurs only on the first pass. If the
! 122: * result is exactly 10^LEN, decrement ILOG and divide
! 123: * the mantissa by 10.
! 124: *
! 125: * A14. Convert the mantissa to bcd.
! 126: * The binstr routine is used to convert the LEN digit
! 127: * mantissa to bcd in memory. The input to binstr is
! 128: * to be a fraction; i.e. (mantissa)/10^LEN and adjusted
! 129: * such that the decimal point is to the left of bit 63.
! 130: * The bcd digits are stored in the correct position in
! 131: * the final string area in memory.
! 132: *
! 133: * A15. Convert the exponent to bcd.
! 134: * As in A14 above, the exp is converted to bcd and the
! 135: * digits are stored in the final string.
! 136: * Test the length of the final exponent string. If the
! 137: * length is 4, set operr.
! 138: *
! 139: * A16. Write sign bits to final string.
! 140: *
! 141: * Implementation Notes:
! 142: *
! 143: * The registers are used as follows:
! 144: *
! 145: * d0: scratch; LEN input to binstr
! 146: * d1: scratch
! 147: * d2: upper 32-bits of mantissa for binstr
! 148: * d3: scratch;lower 32-bits of mantissa for binstr
! 149: * d4: LEN
! 150: * d5: LAMBDA/ICTR
! 151: * d6: ILOG
! 152: * d7: k-factor
! 153: * a0: ptr for original operand/final result
! 154: * a1: scratch pointer
! 155: * a2: pointer to FP_X; abs(original value) in ext
! 156: * fp0: scratch
! 157: * fp1: scratch
! 158: * fp2: scratch
! 159: * F_SCR1:
! 160: * F_SCR2:
! 161: * L_SCR1:
! 162: * L_SCR2:
! 163: *
! 164:
! 165: BINDEC IDNT 2,1 Motorola 040 Floating Point Software Package
! 166:
! 167: include fpsp.h
! 168:
! 169: section 8
! 170:
! 171: * Constants in extended precision
! 172: LOG2 dc.l $3FFD0000,$9A209A84,$FBCFF798,$00000000
! 173: LOG2UP1 dc.l $3FFD0000,$9A209A84,$FBCFF799,$00000000
! 174:
! 175: * Constants in single precision
! 176: FONE dc.l $3F800000,$00000000,$00000000,$00000000
! 177: FTWO dc.l $40000000,$00000000,$00000000,$00000000
! 178: FTEN dc.l $41200000,$00000000,$00000000,$00000000
! 179: F4933 dc.l $459A2800,$00000000,$00000000,$00000000
! 180:
! 181: RBDTBL dc.b 0,0,0,0
! 182: dc.b 3,3,2,2
! 183: dc.b 3,2,2,3
! 184: dc.b 2,3,3,2
! 185:
! 186: xref binstr
! 187: xref sintdo
! 188: xref ptenrn,ptenrm,ptenrp
! 189:
! 190: xdef bindec
! 191: xdef sc_mul
! 192: bindec:
! 193: movem.l d2-d7/a2,-(a7)
! 194: fmovem.x fp0-fp2,-(a7)
! 195:
! 196: * A1. Set RM and size ext. Set SIGMA = sign input;
! 197: * The k-factor is saved for use in d7. Clear BINDEC_FLG for
! 198: * separating normalized/denormalized input. If the input
! 199: * is a denormalized number, set the BINDEC_FLG memory word
! 200: * to signal denorm. If the input is unnormalized, normalize
! 201: * the input and test for denormalized result.
! 202: *
! 203: fmove.l #rm_mode,FPCR ;set RM and ext
! 204: move.l (a0),L_SCR2(a6) ;save exponent for sign check
! 205: move.l d0,d7 ;move k-factor to d7
! 206: clr.b BINDEC_FLG(a6) ;clr norm/denorm flag
! 207: move.w STAG(a6),d0 ;get stag
! 208: andi.w #$e000,d0 ;isolate stag bits
! 209: beq A2_str ;if zero, input is norm
! 210: *
! 211: * Normalize the denorm
! 212: *
! 213: un_de_norm:
! 214: move.w (a0),d0
! 215: andi.w #$7fff,d0 ;strip sign of normalized exp
! 216: move.l 4(a0),d1
! 217: move.l 8(a0),d2
! 218: norm_loop:
! 219: sub.w #1,d0
! 220: add.l d2,d2
! 221: addx.l d1,d1
! 222: tst.l d1
! 223: bge.b norm_loop
! 224: *
! 225: * Test if the normalized input is denormalized
! 226: *
! 227: tst.w d0
! 228: bgt.b pos_exp ;if greater than zero, it is a norm
! 229: st BINDEC_FLG(a6) ;set flag for denorm
! 230: pos_exp:
! 231: andi.w #$7fff,d0 ;strip sign of normalized exp
! 232: move.w d0,(a0)
! 233: move.l d1,4(a0)
! 234: move.l d2,8(a0)
! 235:
! 236: * A2. Set X = abs(input).
! 237: *
! 238: A2_str:
! 239: move.l (a0),FP_SCR2(a6) ; move input to work space
! 240: move.l 4(a0),FP_SCR2+4(a6) ; move input to work space
! 241: move.l 8(a0),FP_SCR2+8(a6) ; move input to work space
! 242: andi.l #$7fffffff,FP_SCR2(a6) ;create abs(X)
! 243:
! 244: * A3. Compute ILOG.
! 245: * ILOG is the log base 10 of the input value. It is approx-
! 246: * imated by adding e + 0.f when the original value is viewed
! 247: * as 2^^e * 1.f in extended precision. This value is stored
! 248: * in d6.
! 249: *
! 250: * Register usage:
! 251: * Input/Output
! 252: * d0: k-factor/exponent
! 253: * d2: x/x
! 254: * d3: x/x
! 255: * d4: x/x
! 256: * d5: x/x
! 257: * d6: x/ILOG
! 258: * d7: k-factor/Unchanged
! 259: * a0: ptr for original operand/final result
! 260: * a1: x/x
! 261: * a2: x/x
! 262: * fp0: x/float(ILOG)
! 263: * fp1: x/x
! 264: * fp2: x/x
! 265: * F_SCR1:x/x
! 266: * F_SCR2:Abs(X)/Abs(X) with $3fff exponent
! 267: * L_SCR1:x/x
! 268: * L_SCR2:first word of X packed/Unchanged
! 269:
! 270: tst.b BINDEC_FLG(a6) ;check for denorm
! 271: beq.b A3_cont ;if clr, continue with norm
! 272: move.l #-4933,d6 ;force ILOG = -4933
! 273: bra.b A4_str
! 274: A3_cont:
! 275: move.w FP_SCR2(a6),d0 ;move exp to d0
! 276: move.w #$3fff,FP_SCR2(a6) ;replace exponent with 0x3fff
! 277: fmove.x FP_SCR2(a6),fp0 ;now fp0 has 1.f
! 278: sub.w #$3fff,d0 ;strip off bias
! 279: fadd.w d0,fp0 ;add in exp
! 280: fsub.s FONE,fp0 ;subtract off 1.0
! 281: fbge.w pos_res ;if pos, branch
! 282: fmul.x LOG2UP1,fp0 ;if neg, mul by LOG2UP1
! 283: fmove.l fp0,d6 ;put ILOG in d6 as a lword
! 284: bra.b A4_str ;go move out ILOG
! 285: pos_res:
! 286: fmul.x LOG2,fp0 ;if pos, mul by LOG2
! 287: fmove.l fp0,d6 ;put ILOG in d6 as a lword
! 288:
! 289:
! 290: * A4. Clr INEX bit.
! 291: * The operation in A3 above may have set INEX2.
! 292:
! 293: A4_str:
! 294: fmove.l #0,FPSR ;zero all of fpsr - nothing needed
! 295:
! 296:
! 297: * A5. Set ICTR = 0;
! 298: * ICTR is a flag used in A13. It must be set before the
! 299: * loop entry A6. The lower word of d5 is used for ICTR.
! 300:
! 301: clr.w d5 ;clear ICTR
! 302:
! 303:
! 304: * A6. Calculate LEN.
! 305: * LEN is the number of digits to be displayed. The k-factor
! 306: * can dictate either the total number of digits, if it is
! 307: * a positive number, or the number of digits after the
! 308: * original decimal point which are to be included as
! 309: * significant. See the 68882 manual for examples.
! 310: * If LEN is computed to be greater than 17, set OPERR in
! 311: * USER_FPSR. LEN is stored in d4.
! 312: *
! 313: * Register usage:
! 314: * Input/Output
! 315: * d0: exponent/Unchanged
! 316: * d2: x/x/scratch
! 317: * d3: x/x
! 318: * d4: exc picture/LEN
! 319: * d5: ICTR/Unchanged
! 320: * d6: ILOG/Unchanged
! 321: * d7: k-factor/Unchanged
! 322: * a0: ptr for original operand/final result
! 323: * a1: x/x
! 324: * a2: x/x
! 325: * fp0: float(ILOG)/Unchanged
! 326: * fp1: x/x
! 327: * fp2: x/x
! 328: * F_SCR1:x/x
! 329: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
! 330: * L_SCR1:x/x
! 331: * L_SCR2:first word of X packed/Unchanged
! 332:
! 333: A6_str:
! 334: tst.l d7 ;branch on sign of k
! 335: ble.b k_neg ;if k <= 0, LEN = ILOG + 1 - k
! 336: move.l d7,d4 ;if k > 0, LEN = k
! 337: bra.b len_ck ;skip to LEN check
! 338: k_neg:
! 339: move.l d6,d4 ;first load ILOG to d4
! 340: sub.l d7,d4 ;subtract off k
! 341: addq.l #1,d4 ;add in the 1
! 342: len_ck:
! 343: tst.l d4 ;LEN check: branch on sign of LEN
! 344: ble.b LEN_ng ;if neg, set LEN = 1
! 345: cmp.l #17,d4 ;test if LEN > 17
! 346: ble.b A7_str ;if not, forget it
! 347: move.l #17,d4 ;set max LEN = 17
! 348: tst.l d7 ;if negative, never set OPERR
! 349: ble.b A7_str ;if positive, continue
! 350: or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
! 351: bra.b A7_str ;finished here
! 352: LEN_ng:
! 353: moveq.l #1,d4 ;min LEN is 1
! 354:
! 355:
! 356: * A7. Calculate SCALE.
! 357: * SCALE is equal to 10^ISCALE, where ISCALE is the number
! 358: * of decimal places needed to insure LEN integer digits
! 359: * in the output before conversion to bcd. LAMBDA is the sign
! 360: * of ISCALE, used in A9. Fp1 contains 10^^(abs(ISCALE)) using
! 361: * the rounding mode as given in the following table (see
! 362: * Coonen, p. 7.23 as ref.; however, the SCALE variable is
! 363: * of opposite sign in bindec.sa from Coonen).
! 364: *
! 365: * Initial USE
! 366: * FPCR[6:5] LAMBDA SIGN(X) FPCR[6:5]
! 367: * ----------------------------------------------
! 368: * RN 00 0 0 00/0 RN
! 369: * RN 00 0 1 00/0 RN
! 370: * RN 00 1 0 00/0 RN
! 371: * RN 00 1 1 00/0 RN
! 372: * RZ 01 0 0 11/3 RP
! 373: * RZ 01 0 1 11/3 RP
! 374: * RZ 01 1 0 10/2 RM
! 375: * RZ 01 1 1 10/2 RM
! 376: * RM 10 0 0 11/3 RP
! 377: * RM 10 0 1 10/2 RM
! 378: * RM 10 1 0 10/2 RM
! 379: * RM 10 1 1 11/3 RP
! 380: * RP 11 0 0 10/2 RM
! 381: * RP 11 0 1 11/3 RP
! 382: * RP 11 1 0 11/3 RP
! 383: * RP 11 1 1 10/2 RM
! 384: *
! 385: * Register usage:
! 386: * Input/Output
! 387: * d0: exponent/scratch - final is 0
! 388: * d2: x/0 or 24 for A9
! 389: * d3: x/scratch - offset ptr into PTENRM array
! 390: * d4: LEN/Unchanged
! 391: * d5: 0/ICTR:LAMBDA
! 392: * d6: ILOG/ILOG or k if ((k<=0)&(ILOG<k))
! 393: * d7: k-factor/Unchanged
! 394: * a0: ptr for original operand/final result
! 395: * a1: x/ptr to PTENRM array
! 396: * a2: x/x
! 397: * fp0: float(ILOG)/Unchanged
! 398: * fp1: x/10^ISCALE
! 399: * fp2: x/x
! 400: * F_SCR1:x/x
! 401: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
! 402: * L_SCR1:x/x
! 403: * L_SCR2:first word of X packed/Unchanged
! 404:
! 405: A7_str:
! 406: tst.l d7 ;test sign of k
! 407: bgt.b k_pos ;if pos and > 0, skip this
! 408: cmp.l d6,d7 ;test k - ILOG
! 409: blt.b k_pos ;if ILOG >= k, skip this
! 410: move.l d7,d6 ;if ((k<0) & (ILOG < k)) ILOG = k
! 411: k_pos:
! 412: move.l d6,d0 ;calc ILOG + 1 - LEN in d0
! 413: addq.l #1,d0 ;add the 1
! 414: sub.l d4,d0 ;sub off LEN
! 415: swap d5 ;use upper word of d5 for LAMBDA
! 416: clr.w d5 ;set it zero initially
! 417: clr.w d2 ;set up d2 for very small case
! 418: tst.l d0 ;test sign of ISCALE
! 419: bge.b iscale ;if pos, skip next inst
! 420: addq.w #1,d5 ;if neg, set LAMBDA true
! 421: cmp.l #$ffffecd4,d0 ;test iscale <= -4908
! 422: bgt.b no_inf ;if false, skip rest
! 423: addi.l #24,d0 ;add in 24 to iscale
! 424: move.l #24,d2 ;put 24 in d2 for A9
! 425: no_inf:
! 426: neg.l d0 ;and take abs of ISCALE
! 427: iscale:
! 428: fmove.s FONE,fp1 ;init fp1 to 1
! 429: bfextu USER_FPCR(a6){26:2},d1 ;get initial rmode bits
! 430: add.w d1,d1 ;put them in bits 2:1
! 431: add.w d5,d1 ;add in LAMBDA
! 432: add.w d1,d1 ;put them in bits 3:1
! 433: tst.l L_SCR2(a6) ;test sign of original x
! 434: bge.b x_pos ;if pos, don't set bit 0
! 435: addq.l #1,d1 ;if neg, set bit 0
! 436: x_pos:
! 437: lea.l RBDTBL,a2 ;load rbdtbl base
! 438: move.b (a2,d1),d3 ;load d3 with new rmode
! 439: lsl.l #4,d3 ;put bits in proper position
! 440: fmove.l d3,fpcr ;load bits into fpu
! 441: lsr.l #4,d3 ;put bits in proper position
! 442: tst.b d3 ;decode new rmode for pten table
! 443: bne.b not_rn ;if zero, it is RN
! 444: lea.l PTENRN,a1 ;load a1 with RN table base
! 445: bra.b rmode ;exit decode
! 446: not_rn:
! 447: lsr.b #1,d3 ;get lsb in carry
! 448: bcc.b not_rp ;if carry clear, it is RM
! 449: lea.l PTENRP,a1 ;load a1 with RP table base
! 450: bra.b rmode ;exit decode
! 451: not_rp:
! 452: lea.l PTENRM,a1 ;load a1 with RM table base
! 453: rmode:
! 454: clr.l d3 ;clr table index
! 455: e_loop:
! 456: lsr.l #1,d0 ;shift next bit into carry
! 457: bcc.b e_next ;if zero, skip the mul
! 458: fmul.x (a1,d3),fp1 ;mul by 10**(d3_bit_no)
! 459: e_next:
! 460: add.l #12,d3 ;inc d3 to next pwrten table entry
! 461: tst.l d0 ;test if ISCALE is zero
! 462: bne.b e_loop ;if not, loop
! 463:
! 464:
! 465: * A8. Clr INEX; Force RZ.
! 466: * The operation in A3 above may have set INEX2.
! 467: * RZ mode is forced for the scaling operation to insure
! 468: * only one rounding error. The grs bits are collected in
! 469: * the INEX flag for use in A10.
! 470: *
! 471: * Register usage:
! 472: * Input/Output
! 473:
! 474: fmove.l #0,FPSR ;clr INEX
! 475: fmove.l #rz_mode,FPCR ;set RZ rounding mode
! 476:
! 477:
! 478: * A9. Scale X -> Y.
! 479: * The mantissa is scaled to the desired number of significant
! 480: * digits. The excess digits are collected in INEX2. If mul,
! 481: * Check d2 for excess 10 exponential value. If not zero,
! 482: * the iscale value would have caused the pwrten calculation
! 483: * to overflow. Only a negative iscale can cause this, so
! 484: * multiply by 10^(d2), which is now only allowed to be 24,
! 485: * with a multiply by 10^8 and 10^16, which is exact since
! 486: * 10^24 is exact. If the input was denormalized, we must
! 487: * create a busy stack frame with the mul command and the
! 488: * two operands, and allow the fpu to complete the multiply.
! 489: *
! 490: * Register usage:
! 491: * Input/Output
! 492: * d0: FPCR with RZ mode/Unchanged
! 493: * d2: 0 or 24/unchanged
! 494: * d3: x/x
! 495: * d4: LEN/Unchanged
! 496: * d5: ICTR:LAMBDA
! 497: * d6: ILOG/Unchanged
! 498: * d7: k-factor/Unchanged
! 499: * a0: ptr for original operand/final result
! 500: * a1: ptr to PTENRM array/Unchanged
! 501: * a2: x/x
! 502: * fp0: float(ILOG)/X adjusted for SCALE (Y)
! 503: * fp1: 10^ISCALE/Unchanged
! 504: * fp2: x/x
! 505: * F_SCR1:x/x
! 506: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
! 507: * L_SCR1:x/x
! 508: * L_SCR2:first word of X packed/Unchanged
! 509:
! 510: A9_str:
! 511: fmove.x (a0),fp0 ;load X from memory
! 512: fabs.x fp0 ;use abs(X)
! 513: tst.w d5 ;LAMBDA is in lower word of d5
! 514: bne.b sc_mul ;if neg (LAMBDA = 1), scale by mul
! 515: fdiv.x fp1,fp0 ;calculate X / SCALE -> Y to fp0
! 516: bra.b A10_st ;branch to A10
! 517:
! 518: sc_mul:
! 519: tst.b BINDEC_FLG(a6) ;check for denorm
! 520: beq.b A9_norm ;if norm, continue with mul
! 521: fmovem.x fp1,-(a7) ;load ETEMP with 10^ISCALE
! 522: move.l 8(a0),-(a7) ;load FPTEMP with input arg
! 523: move.l 4(a0),-(a7)
! 524: move.l (a0),-(a7)
! 525: move.l #18,d3 ;load count for busy stack
! 526: A9_loop:
! 527: clr.l -(a7) ;clear lword on stack
! 528: dbf.w d3,A9_loop
! 529: move.b VER_TMP(a6),(a7) ;write current version number
! 530: move.b #BUSY_SIZE-4,1(a7) ;write current busy size
! 531: move.b #$10,$44(a7) ;set fcefpte[15] bit
! 532: move.w #$0023,$40(a7) ;load cmdreg1b with mul command
! 533: move.b #$fe,$8(a7) ;load all 1s to cu savepc
! 534: frestore (a7)+ ;restore frame to fpu for completion
! 535: fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
! 536: fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
! 537: bra.b A10_st
! 538: A9_norm:
! 539: tst.w d2 ;test for small exp case
! 540: beq.b A9_con ;if zero, continue as normal
! 541: fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
! 542: fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
! 543: A9_con:
! 544: fmul.x fp1,fp0 ;calculate X * SCALE -> Y to fp0
! 545:
! 546:
! 547: * A10. Or in INEX.
! 548: * If INEX is set, round error occurred. This is compensated
! 549: * for by 'or-ing' in the INEX2 flag to the lsb of Y.
! 550: *
! 551: * Register usage:
! 552: * Input/Output
! 553: * d0: FPCR with RZ mode/FPSR with INEX2 isolated
! 554: * d2: x/x
! 555: * d3: x/x
! 556: * d4: LEN/Unchanged
! 557: * d5: ICTR:LAMBDA
! 558: * d6: ILOG/Unchanged
! 559: * d7: k-factor/Unchanged
! 560: * a0: ptr for original operand/final result
! 561: * a1: ptr to PTENxx array/Unchanged
! 562: * a2: x/ptr to FP_SCR2(a6)
! 563: * fp0: Y/Y with lsb adjusted
! 564: * fp1: 10^ISCALE/Unchanged
! 565: * fp2: x/x
! 566:
! 567: A10_st:
! 568: fmove.l FPSR,d0 ;get FPSR
! 569: fmove.x fp0,FP_SCR2(a6) ;move Y to memory
! 570: lea.l FP_SCR2(a6),a2 ;load a2 with ptr to FP_SCR2
! 571: btst.l #9,d0 ;check if INEX2 set
! 572: beq.b A11_st ;if clear, skip rest
! 573: ori.l #1,8(a2) ;or in 1 to lsb of mantissa
! 574: fmove.x FP_SCR2(a6),fp0 ;write adjusted Y back to fpu
! 575:
! 576:
! 577: * A11. Restore original FPCR; set size ext.
! 578: * Perform FINT operation in the user's rounding mode. Keep
! 579: * the size to extended. The sintdo entry point in the sint
! 580: * routine expects the FPCR value to be in USER_FPCR for
! 581: * mode and precision. The original FPCR is saved in L_SCR1.
! 582:
! 583: A11_st:
! 584: move.l USER_FPCR(a6),L_SCR1(a6) ;save it for later
! 585: andi.l #$00000030,USER_FPCR(a6) ;set size to ext,
! 586: * ;block exceptions
! 587:
! 588:
! 589: * A12. Calculate YINT = FINT(Y) according to user's rounding mode.
! 590: * The FPSP routine sintd0 is used. The output is in fp0.
! 591: *
! 592: * Register usage:
! 593: * Input/Output
! 594: * d0: FPSR with AINEX cleared/FPCR with size set to ext
! 595: * d2: x/x/scratch
! 596: * d3: x/x
! 597: * d4: LEN/Unchanged
! 598: * d5: ICTR:LAMBDA/Unchanged
! 599: * d6: ILOG/Unchanged
! 600: * d7: k-factor/Unchanged
! 601: * a0: ptr for original operand/src ptr for sintdo
! 602: * a1: ptr to PTENxx array/Unchanged
! 603: * a2: ptr to FP_SCR2(a6)/Unchanged
! 604: * a6: temp pointer to FP_SCR2(a6) - orig value saved and restored
! 605: * fp0: Y/YINT
! 606: * fp1: 10^ISCALE/Unchanged
! 607: * fp2: x/x
! 608: * F_SCR1:x/x
! 609: * F_SCR2:Y adjusted for inex/Y with original exponent
! 610: * L_SCR1:x/original USER_FPCR
! 611: * L_SCR2:first word of X packed/Unchanged
! 612:
! 613: A12_st:
! 614: movem.l d0-d1/a0-a1,-(a7) ;save regs used by sintd0
! 615: move.l L_SCR1(a6),-(a7)
! 616: move.l L_SCR2(a6),-(a7)
! 617: lea.l FP_SCR2(a6),a0 ;a0 is ptr to F_SCR2(a6)
! 618: fmove.x fp0,(a0) ;move Y to memory at FP_SCR2(a6)
! 619: tst.l L_SCR2(a6) ;test sign of original operand
! 620: bge.b do_fint ;if pos, use Y
! 621: or.l #$80000000,(a0) ;if neg, use -Y
! 622: do_fint:
! 623: move.l USER_FPSR(a6),-(a7)
! 624: bsr sintdo ;sint routine returns int in fp0
! 625: move.b (a7),USER_FPSR(a6)
! 626: add.l #4,a7
! 627: move.l (a7)+,L_SCR2(a6)
! 628: move.l (a7)+,L_SCR1(a6)
! 629: movem.l (a7)+,d0-d1/a0-a1 ;restore regs used by sint
! 630: move.l L_SCR2(a6),FP_SCR2(a6) ;restore original exponent
! 631: move.l L_SCR1(a6),USER_FPCR(a6) ;restore user's FPCR
! 632:
! 633:
! 634: * A13. Check for LEN digits.
! 635: * If the int operation results in more than LEN digits,
! 636: * or less than LEN -1 digits, adjust ILOG and repeat from
! 637: * A6. This test occurs only on the first pass. If the
! 638: * result is exactly 10^LEN, decrement ILOG and divide
! 639: * the mantissa by 10. The calculation of 10^LEN cannot
! 640: * be inexact, since all powers of ten upto 10^27 are exact
! 641: * in extended precision, so the use of a previous power-of-ten
! 642: * table will introduce no error.
! 643: *
! 644: *
! 645: * Register usage:
! 646: * Input/Output
! 647: * d0: FPCR with size set to ext/scratch final = 0
! 648: * d2: x/x
! 649: * d3: x/scratch final = x
! 650: * d4: LEN/LEN adjusted
! 651: * d5: ICTR:LAMBDA/LAMBDA:ICTR
! 652: * d6: ILOG/ILOG adjusted
! 653: * d7: k-factor/Unchanged
! 654: * a0: pointer into memory for packed bcd string formation
! 655: * a1: ptr to PTENxx array/Unchanged
! 656: * a2: ptr to FP_SCR2(a6)/Unchanged
! 657: * fp0: int portion of Y/abs(YINT) adjusted
! 658: * fp1: 10^ISCALE/Unchanged
! 659: * fp2: x/10^LEN
! 660: * F_SCR1:x/x
! 661: * F_SCR2:Y with original exponent/Unchanged
! 662: * L_SCR1:original USER_FPCR/Unchanged
! 663: * L_SCR2:first word of X packed/Unchanged
! 664:
! 665: A13_st:
! 666: swap d5 ;put ICTR in lower word of d5
! 667: tst.w d5 ;check if ICTR = 0
! 668: bne not_zr ;if non-zero, go to second test
! 669: *
! 670: * Compute 10^(LEN-1)
! 671: *
! 672: fmove.s FONE,fp2 ;init fp2 to 1.0
! 673: move.l d4,d0 ;put LEN in d0
! 674: subq.l #1,d0 ;d0 = LEN -1
! 675: clr.l d3 ;clr table index
! 676: l_loop:
! 677: lsr.l #1,d0 ;shift next bit into carry
! 678: bcc.b l_next ;if zero, skip the mul
! 679: fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
! 680: l_next:
! 681: add.l #12,d3 ;inc d3 to next pwrten table entry
! 682: tst.l d0 ;test if LEN is zero
! 683: bne.b l_loop ;if not, loop
! 684: *
! 685: * 10^LEN-1 is computed for this test and A14. If the input was
! 686: * denormalized, check only the case in which YINT > 10^LEN.
! 687: *
! 688: tst.b BINDEC_FLG(a6) ;check if input was norm
! 689: beq.b A13_con ;if norm, continue with checking
! 690: fabs.x fp0 ;take abs of YINT
! 691: bra test_2
! 692: *
! 693: * Compare abs(YINT) to 10^(LEN-1) and 10^LEN
! 694: *
! 695: A13_con:
! 696: fabs.x fp0 ;take abs of YINT
! 697: fcmp.x fp2,fp0 ;compare abs(YINT) with 10^(LEN-1)
! 698: fbge.w test_2 ;if greater, do next test
! 699: subq.l #1,d6 ;subtract 1 from ILOG
! 700: move.w #1,d5 ;set ICTR
! 701: fmove.l #rm_mode,FPCR ;set rmode to RM
! 702: fmul.s FTEN,fp2 ;compute 10^LEN
! 703: bra.w A6_str ;return to A6 and recompute YINT
! 704: test_2:
! 705: fmul.s FTEN,fp2 ;compute 10^LEN
! 706: fcmp.x fp2,fp0 ;compare abs(YINT) with 10^LEN
! 707: fblt.w A14_st ;if less, all is ok, go to A14
! 708: fbgt.w fix_ex ;if greater, fix and redo
! 709: fdiv.s FTEN,fp0 ;if equal, divide by 10
! 710: addq.l #1,d6 ; and inc ILOG
! 711: bra.b A14_st ; and continue elsewhere
! 712: fix_ex:
! 713: addq.l #1,d6 ;increment ILOG by 1
! 714: move.w #1,d5 ;set ICTR
! 715: fmove.l #rm_mode,FPCR ;set rmode to RM
! 716: bra.w A6_str ;return to A6 and recompute YINT
! 717: *
! 718: * Since ICTR <> 0, we have already been through one adjustment,
! 719: * and shouldn't have another; this is to check if abs(YINT) = 10^LEN
! 720: * 10^LEN is again computed using whatever table is in a1 since the
! 721: * value calculated cannot be inexact.
! 722: *
! 723: not_zr:
! 724: fmove.s FONE,fp2 ;init fp2 to 1.0
! 725: move.l d4,d0 ;put LEN in d0
! 726: clr.l d3 ;clr table index
! 727: z_loop:
! 728: lsr.l #1,d0 ;shift next bit into carry
! 729: bcc.b z_next ;if zero, skip the mul
! 730: fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
! 731: z_next:
! 732: add.l #12,d3 ;inc d3 to next pwrten table entry
! 733: tst.l d0 ;test if LEN is zero
! 734: bne.b z_loop ;if not, loop
! 735: fabs.x fp0 ;get abs(YINT)
! 736: fcmp.x fp2,fp0 ;check if abs(YINT) = 10^LEN
! 737: fbne.w A14_st ;if not, skip this
! 738: fdiv.s FTEN,fp0 ;divide abs(YINT) by 10
! 739: addq.l #1,d6 ;and inc ILOG by 1
! 740: addq.l #1,d4 ; and inc LEN
! 741: fmul.s FTEN,fp2 ; if LEN++, the get 10^^LEN
! 742:
! 743:
! 744: * A14. Convert the mantissa to bcd.
! 745: * The binstr routine is used to convert the LEN digit
! 746: * mantissa to bcd in memory. The input to binstr is
! 747: * to be a fraction; i.e. (mantissa)/10^LEN and adjusted
! 748: * such that the decimal point is to the left of bit 63.
! 749: * The bcd digits are stored in the correct position in
! 750: * the final string area in memory.
! 751: *
! 752: *
! 753: * Register usage:
! 754: * Input/Output
! 755: * d0: x/LEN call to binstr - final is 0
! 756: * d1: x/0
! 757: * d2: x/ms 32-bits of mant of abs(YINT)
! 758: * d3: x/ls 32-bits of mant of abs(YINT)
! 759: * d4: LEN/Unchanged
! 760: * d5: ICTR:LAMBDA/LAMBDA:ICTR
! 761: * d6: ILOG
! 762: * d7: k-factor/Unchanged
! 763: * a0: pointer into memory for packed bcd string formation
! 764: * /ptr to first mantissa byte in result string
! 765: * a1: ptr to PTENxx array/Unchanged
! 766: * a2: ptr to FP_SCR2(a6)/Unchanged
! 767: * fp0: int portion of Y/abs(YINT) adjusted
! 768: * fp1: 10^ISCALE/Unchanged
! 769: * fp2: 10^LEN/Unchanged
! 770: * F_SCR1:x/Work area for final result
! 771: * F_SCR2:Y with original exponent/Unchanged
! 772: * L_SCR1:original USER_FPCR/Unchanged
! 773: * L_SCR2:first word of X packed/Unchanged
! 774:
! 775: A14_st:
! 776: fmove.l #rz_mode,FPCR ;force rz for conversion
! 777: fdiv.x fp2,fp0 ;divide abs(YINT) by 10^LEN
! 778: lea.l FP_SCR1(a6),a0
! 779: fmove.x fp0,(a0) ;move abs(YINT)/10^LEN to memory
! 780: move.l 4(a0),d2 ;move 2nd word of FP_RES to d2
! 781: move.l 8(a0),d3 ;move 3rd word of FP_RES to d3
! 782: clr.l 4(a0) ;zero word 2 of FP_RES
! 783: clr.l 8(a0) ;zero word 3 of FP_RES
! 784: move.l (a0),d0 ;move exponent to d0
! 785: swap d0 ;put exponent in lower word
! 786: beq.b no_sft ;if zero, don't shift
! 787: subi.l #$3ffd,d0 ;sub bias less 2 to make fract
! 788: tst.l d0 ;check if > 1
! 789: bgt.b no_sft ;if so, don't shift
! 790: neg.l d0 ;make exp positive
! 791: m_loop:
! 792: lsr.l #1,d2 ;shift d2:d3 right, add 0s
! 793: roxr.l #1,d3 ;the number of places
! 794: dbf.w d0,m_loop ;given in d0
! 795: no_sft:
! 796: tst.l d2 ;check for mantissa of zero
! 797: bne.b no_zr ;if not, go on
! 798: tst.l d3 ;continue zero check
! 799: beq.b zer_m ;if zero, go directly to binstr
! 800: no_zr:
! 801: clr.l d1 ;put zero in d1 for addx
! 802: addi.l #$00000080,d3 ;inc at bit 7
! 803: addx.l d1,d2 ;continue inc
! 804: andi.l #$ffffff80,d3 ;strip off lsb not used by 882
! 805: zer_m:
! 806: move.l d4,d0 ;put LEN in d0 for binstr call
! 807: addq.l #3,a0 ;a0 points to M16 byte in result
! 808: bsr binstr ;call binstr to convert mant
! 809:
! 810:
! 811: * A15. Convert the exponent to bcd.
! 812: * As in A14 above, the exp is converted to bcd and the
! 813: * digits are stored in the final string.
! 814: *
! 815: * Digits are stored in L_SCR1(a6) on return from BINDEC as:
! 816: *
! 817: * 32 16 15 0
! 818: * -----------------------------------------
! 819: * | 0 | e3 | e2 | e1 | e4 | X | X | X |
! 820: * -----------------------------------------
! 821: *
! 822: * And are moved into their proper places in FP_SCR1. If digit e4
! 823: * is non-zero, OPERR is signaled. In all cases, all 4 digits are
! 824: * written as specified in the 881/882 manual for packed decimal.
! 825: *
! 826: * Register usage:
! 827: * Input/Output
! 828: * d0: x/LEN call to binstr - final is 0
! 829: * d1: x/scratch (0);shift count for final exponent packing
! 830: * d2: x/ms 32-bits of exp fraction/scratch
! 831: * d3: x/ls 32-bits of exp fraction
! 832: * d4: LEN/Unchanged
! 833: * d5: ICTR:LAMBDA/LAMBDA:ICTR
! 834: * d6: ILOG
! 835: * d7: k-factor/Unchanged
! 836: * a0: ptr to result string/ptr to L_SCR1(a6)
! 837: * a1: ptr to PTENxx array/Unchanged
! 838: * a2: ptr to FP_SCR2(a6)/Unchanged
! 839: * fp0: abs(YINT) adjusted/float(ILOG)
! 840: * fp1: 10^ISCALE/Unchanged
! 841: * fp2: 10^LEN/Unchanged
! 842: * F_SCR1:Work area for final result/BCD result
! 843: * F_SCR2:Y with original exponent/ILOG/10^4
! 844: * L_SCR1:original USER_FPCR/Exponent digits on return from binstr
! 845: * L_SCR2:first word of X packed/Unchanged
! 846:
! 847: A15_st:
! 848: tst.b BINDEC_FLG(a6) ;check for denorm
! 849: beq.b not_denorm
! 850: ftst.x fp0 ;test for zero
! 851: fbeq.w den_zero ;if zero, use k-factor or 4933
! 852: fmove.l d6,fp0 ;float ILOG
! 853: fabs.x fp0 ;get abs of ILOG
! 854: bra.b convrt
! 855: den_zero:
! 856: tst.l d7 ;check sign of the k-factor
! 857: blt.b use_ilog ;if negative, use ILOG
! 858: fmove.s F4933,fp0 ;force exponent to 4933
! 859: bra.b convrt ;do it
! 860: use_ilog:
! 861: fmove.l d6,fp0 ;float ILOG
! 862: fabs.x fp0 ;get abs of ILOG
! 863: bra.b convrt
! 864: not_denorm:
! 865: ftst.x fp0 ;test for zero
! 866: fbne.w not_zero ;if zero, force exponent
! 867: fmove.s FONE,fp0 ;force exponent to 1
! 868: bra.b convrt ;do it
! 869: not_zero:
! 870: fmove.l d6,fp0 ;float ILOG
! 871: fabs.x fp0 ;get abs of ILOG
! 872: convrt:
! 873: fdiv.x 24(a1),fp0 ;compute ILOG/10^4
! 874: fmove.x fp0,FP_SCR2(a6) ;store fp0 in memory
! 875: move.l 4(a2),d2 ;move word 2 to d2
! 876: move.l 8(a2),d3 ;move word 3 to d3
! 877: move.w (a2),d0 ;move exp to d0
! 878: beq.b x_loop_fin ;if zero, skip the shift
! 879: subi.w #$3ffd,d0 ;subtract off bias
! 880: neg.w d0 ;make exp positive
! 881: x_loop:
! 882: lsr.l #1,d2 ;shift d2:d3 right
! 883: roxr.l #1,d3 ;the number of places
! 884: dbf.w d0,x_loop ;given in d0
! 885: x_loop_fin:
! 886: clr.l d1 ;put zero in d1 for addx
! 887: addi.l #$00000080,d3 ;inc at bit 6
! 888: addx.l d1,d2 ;continue inc
! 889: andi.l #$ffffff80,d3 ;strip off lsb not used by 882
! 890: move.l #4,d0 ;put 4 in d0 for binstr call
! 891: lea.l L_SCR1(a6),a0 ;a0 is ptr to L_SCR1 for exp digits
! 892: bsr binstr ;call binstr to convert exp
! 893: move.l L_SCR1(a6),d0 ;load L_SCR1 lword to d0
! 894: move.l #12,d1 ;use d1 for shift count
! 895: lsr.l d1,d0 ;shift d0 right by 12
! 896: bfins d0,FP_SCR1(a6){4:12} ;put e3:e2:e1 in FP_SCR1
! 897: lsr.l d1,d0 ;shift d0 right by 12
! 898: bfins d0,FP_SCR1(a6){16:4} ;put e4 in FP_SCR1
! 899: tst.b d0 ;check if e4 is zero
! 900: beq.b A16_st ;if zero, skip rest
! 901: or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
! 902:
! 903:
! 904: * A16. Write sign bits to final string.
! 905: * Sigma is bit 31 of initial value; RHO is bit 31 of d6 (ILOG).
! 906: *
! 907: * Register usage:
! 908: * Input/Output
! 909: * d0: x/scratch - final is x
! 910: * d2: x/x
! 911: * d3: x/x
! 912: * d4: LEN/Unchanged
! 913: * d5: ICTR:LAMBDA/LAMBDA:ICTR
! 914: * d6: ILOG/ILOG adjusted
! 915: * d7: k-factor/Unchanged
! 916: * a0: ptr to L_SCR1(a6)/Unchanged
! 917: * a1: ptr to PTENxx array/Unchanged
! 918: * a2: ptr to FP_SCR2(a6)/Unchanged
! 919: * fp0: float(ILOG)/Unchanged
! 920: * fp1: 10^ISCALE/Unchanged
! 921: * fp2: 10^LEN/Unchanged
! 922: * F_SCR1:BCD result with correct signs
! 923: * F_SCR2:ILOG/10^4
! 924: * L_SCR1:Exponent digits on return from binstr
! 925: * L_SCR2:first word of X packed/Unchanged
! 926:
! 927: A16_st:
! 928: clr.l d0 ;clr d0 for collection of signs
! 929: andi.b #$0f,FP_SCR1(a6) ;clear first nibble of FP_SCR1
! 930: tst.l L_SCR2(a6) ;check sign of original mantissa
! 931: bge.b mant_p ;if pos, don't set SM
! 932: moveq.l #2,d0 ;move 2 in to d0 for SM
! 933: mant_p:
! 934: tst.l d6 ;check sign of ILOG
! 935: bge.b wr_sgn ;if pos, don't set SE
! 936: addq.l #1,d0 ;set bit 0 in d0 for SE
! 937: wr_sgn:
! 938: bfins d0,FP_SCR1(a6){0:2} ;insert SM and SE into FP_SCR1
! 939:
! 940: * Clean up and restore all registers used.
! 941:
! 942: fmove.l #0,FPSR ;clear possible inex2/ainex bits
! 943: fmovem.x (a7)+,fp0-fp2
! 944: movem.l (a7)+,d2-d7/a2
! 945: rts
! 946:
! 947: end
CVSweb