📄 s_erfcl.s

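📁 The glibc library: useful not only for learning how to call the library functions but also for studying how they are actually implemented; good material for sharpening your skills.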
📖 Page 1 of 5
//         ~=~ 2^K * ( T + T*[exp(delta + r) - 1]         )
//         ~=~ 2^K * ( T + T*[(exp(delta)-1)
//               + exp(delta)*(exp(r)-1)]   )
//             ~=~ 2^K * ( T + T*( W + (1+W)*poly(r) ) )
//             ~=~ 2^K * ( Y_hi  +  Y_lo )
//
//   where Y_hi = T  and Y_lo = T*(W + (1+W)*poly(r))
//
//   For exp(X)-1, we have
//
//  exp(X)-1 ~=~ 2^K * ( Y_hi + Y_lo ) - 1
//       ~=~ 2^K * ( Y_hi + Y_lo - 2^(-K) )
//
//   and we combine Y_hi + Y_lo - 2^(-N)  into the form of two
//   numbers  Y_hi + Y_lo carefully.
//
//   **** Algorithm Details ****
//
//   A careful algorithm must be used to realize the mathematical ideas
//   accurately. We describe each of the three cases. We assume SAFE
//   is preset to be TRUE.
//
//   Case exp_tiny:
//
//   The important points are to ensure an accurate result under
//   different rounding directions and a correct setting of the SAFE
//   flag.
//
//   If expm1 is 1, then
//      SAFE  := False  ...possibility of underflow
//      Scale := 1.0
//      Y_hi  := X
//      Y_lo  := 2^(-17000)
//   Else
//      Scale := 1.0
//      Y_hi  := 1.0
//      Y_lo  := X  ...for different rounding modes
//   Endif
//
//   Case exp_small:
//
//   Here we compute a simple polynomial. To exploit parallelism, we split
//   the polynomial into several portions.
//
//   Let r = X
//
//   If exp     ...i.e. exp( argument )
//
//      rsq := r * r;
//      r4  := rsq*rsq
//      poly_lo := P_3 + r*(P_4 + r*(P_5 + r*P_6))
//      poly_hi := r + rsq*(P_1 + r*P_2)
//      Y_lo    := poly_hi + r4 * poly_lo
//      Y_hi    := 1.0
//      Scale   := 1.0
//
//   Else           ...i.e. exp( argument ) - 1
//
//      rsq := r * r
//      r4  := rsq * rsq
//      r6  := rsq * r4
//      poly_lo := r6*(Q_5 + r*(Q_6 + r*Q_7))
//      poly_hi := Q_1 + r*(Q_2 + r*(Q_3 + r*Q_4))
//      Y_lo    := rsq*poly_hi +  poly_lo
//      Y_hi    := X
//      Scale   := 1.0
//
//   Endif
//
//  Case exp_regular:
//
//  The previous description contain enough information except the
//  computation of poly and the final Y_hi and Y_lo in the case for
//  exp(X)-1.
//
//  The computation of poly for Step 2:
//
//   rsq := r*r
//   poly := r + rsq*(A_1 + r*(A_2 + r*A_3))
//
//  For the case exp(X) - 1, we need to incorporate 2^(-K) into
//  Y_hi and Y_lo at the end of Step 4.
//
//   If K > 10 then
//      Y_lo := Y_lo - 2^(-K)
//   Else
//      If K < -10 then
//         Y_lo := Y_hi + Y_lo
//         Y_hi := -2^(-K)
//      Else
//         Y_hi := Y_hi - 2^(-K)
//      End If
//   End If
//
// Overview of operation
//==============================================================
// Registers used
//==============================================================
// Floating Point registers used:
// f8, input
// f9 -> f14,  f36 -> f126
// General registers used:
// r32 -> r71
// Predicate registers used:
// p6 -> p15
// Assembly macros
//==============================================================
// GR for exp(X)
GR_ad_Arg           = r33
GR_ad_C             = r34
GR_ERFC_S_TB        = r35
GR_signexp_x        = r36
GR_exp_x            = r36
GR_exp_mask         = r37
GR_ad_W1            = r38
GR_ad_W2            = r39
GR_M2               = r40
GR_M1               = r41
GR_K                = r42
GR_exp_2_k          = r43
GR_ad_T1            = r44
GR_ad_T2            = r45
GR_N_fix            = r46
GR_ad_P             = r47
GR_exp_bias         = r48
GR_BIAS             = r48
GR_exp_half         = r49
GR_sig_inv_ln2      = r50
GR_rshf_2to51       = r51
GR_exp_2tom51       = r52
GR_rshf             = r53
// GR for erfcl(x)
//==============================================================
GR_ERFC_XC_TB       = r54
GR_ERFC_P_TB        = r55
GR_IndxPlusBias     = r56
GR_P_POINT_1        = r57
GR_P_POINT_2        = r58
GR_AbsArg           = r59
GR_ShftXBi          = r60
GR_ShftPi           = r61
GR_mBIAS            = r62
GR_ShftPi_bias      = r63
GR_ShftXBi_bias     = r64
GR_ShftA14          = r65
GR_ShftA15          = r66
GR_EpsNorm          = r67
GR_0x1              = r68
GR_ShftPi_8         = r69
GR_26PlusBias       = r70
GR_27PlusBias       = r71
// GR for __libm_support call
//==============================================================
GR_SAVE_B0          = r64
GR_SAVE_PFS         = r65
GR_SAVE_GP          = r66
GR_SAVE_SP          = r67
GR_Parameter_X      = r68
GR_Parameter_Y      = r69
GR_Parameter_RESULT = r70
GR_Parameter_TAG    = r71
//==============================================================
// Floating Point Registers
//
FR_RSHF_2TO51       = f10
FR_INV_LN2_2TO63    = f11
FR_W_2TO51_RSH      = f12
FR_2TOM51           = f13
FR_RSHF             = f14
FR_scale            = f36
FR_float_N          = f37
FR_N_signif         = f38
FR_L_hi             = f39
FR_L_lo             = f40
FR_r                = f41
FR_W1               = f42
FR_T1               = f43
FR_W2               = f44
FR_T2               = f45
FR_rsq              = f46
FR_C2               = f47
FR_C3               = f48
FR_poly             = f49
FR_P6               = f49
FR_T                = f50
FR_P5               = f50
FR_P4               = f51
FR_W                = f51
FR_P3               = f52
FR_Wp1              = f52
FR_P2               = f53
FR_P1               = f54
FR_Q7               = f56
FR_Q6               = f57
FR_Q5               = f58
FR_Q4               = f59
FR_Q3               = f60
FR_Q2               = f61
FR_Q1               = f62
FR_C1               = f63
FR_A15              = f64
FR_ch_dx            = f65
FR_T_scale          = f66
FR_norm_x           = f67
FR_AbsArg           = f68
FR_POS_ARG_ASYMP    = f69
FR_NEG_ARG_ASYMP    = f70
FR_Tmp              = f71
FR_Xc               = f72
FR_A0               = f73
FR_A1               = f74
FR_A2               = f75
FR_A3               = f76
FR_A4               = f77
FR_A5               = f78
FR_A6               = f79
FR_A7               = f80
FR_A8               = f81
FR_A9               = f82
FR_A10              = f83
FR_A11              = f84
FR_A12              = f85
FR_A13              = f86
FR_A14              = f87
FR_P15_0_1          = f88
FR_P15_8_1          = f88
FR_P15_1_1          = f89
FR_P15_8_2          = f89
FR_P15_1_2          = f90
FR_P15_2_1          = f91
FR_P15_2_2          = f92
FR_P15_3_1          = f93
FR_P15_3_2          = f94
FR_P15_4_2          = f95
FR_P15_7_1          = f96
FR_P15_7_2          = f97
FR_P15_9_1          = f98
FR_P15_9_2          = f99
FR_P15_13_1         = f100
FR_P15_14_1         = f101
FR_P15_14_2         = f102
FR_Tmp2             = f103
FR_Xpdx_lo          = f104
FR_2                = f105
FR_xsq_lo           = f106
FR_LocArg           = f107
FR_Tmpf             = f108
FR_Tmp1             = f109
FR_EpsNorm          = f110
FR_UnfBound         = f111
FR_NormX            = f112
FR_Xpdx_hi          = f113
FR_dU               = f114
FR_H                = f115
FR_G                = f116
FR_V                = f117
FR_M                = f118
FR_U                = f119
FR_Q                = f120
FR_S                = f121
FR_R                = f122
FR_res_pos_x_hi     = f123
FR_res_pos_x_lo     = f124
FR_dx               = f125
FR_dx1              = f126
// for error handler routine
FR_X                = f9
FR_Y                = f0
FR_RESULT           = f8
// Data tables
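
Reader's note on the `exp_small` case in the header comment of this listing: the split into `poly_hi` and `poly_lo` can be tried out in portable C. The following is only a rough sketch, not the glibc code. The real coefficients `P_1..P_6` and `Q_1..Q_7` live in data tables that are not on this page, so the sketch assumes plain Taylor coefficients `1/(k+1)!` in their place. It also uses `double` rather than the extended precision of the assembly, and the names `exp_parts`, `exp_small`, `exp_small_value` and the range check are hypothetical.

```c
#include <assert.h>

/* ASSUMPTION: TAYLOR[k] = 1/(k+1)! stands in for both P_k and Q_k.
 * The real tables are not shown on this page and probably differ. */
static const double TAYLOR[8] = {
    0.0, 1.0 / 2, 1.0 / 6, 1.0 / 24, 1.0 / 120,
    1.0 / 720, 1.0 / 5040, 1.0 / 40320
};

/* Hypothetical container for the (Scale, Y_hi, Y_lo) triple in the comments. */
typedef struct {
    double scale;
    double y_hi;
    double y_lo;
} exp_parts;

/* Case exp_small: want_expm1 == 0 computes exp(x), otherwise exp(x) - 1. */
exp_parts exp_small(double x, int want_expm1)
{
    /* ASSUMPTION: illustrative bound only; the real threshold is not shown. */
    assert(x > -0.125 && x < 0.125);

    const double r = x;
    const double rsq = r * r;
    const double r4 = rsq * rsq;
    exp_parts out;

    out.scale = 1.0;
    if (!want_expm1) {
        /* poly_lo and poly_hi are independent, so they can run in parallel. */
        double poly_lo = TAYLOR[3] + r * (TAYLOR[4] + r * (TAYLOR[5] + r * TAYLOR[6]));
        double poly_hi = r + rsq * (TAYLOR[1] + r * TAYLOR[2]);
        out.y_lo = poly_hi + r4 * poly_lo;
        out.y_hi = 1.0;
    } else {
        double r6 = rsq * r4;
        double poly_lo = r6 * (TAYLOR[5] + r * (TAYLOR[6] + r * TAYLOR[7]));
        double poly_hi = TAYLOR[1] + r * (TAYLOR[2] + r * (TAYLOR[3] + r * TAYLOR[4]));
        out.y_lo = rsq * poly_hi + poly_lo;
        out.y_hi = x;
    }
    return out;
}

/* Recombine the parts as Scale * (Y_hi + Y_lo). */
double exp_small_value(double x, int want_expm1)
{
    exp_parts p = exp_small(x, want_expm1);
    return p.scale * (p.y_hi + p.y_lo);
}
```

Keeping `Y_hi` exact (1.0 for exp, X for exp(X)-1) and putting the whole polynomial correction into `Y_lo` means the final addition is the only place where the large and small parts meet, which is the point of the hi/lo representation described in the comments.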
