⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 w_tgamma.s

📁 glibc 2.9,最新版的C语言库函数
💻 S
📖 第 1 页 / 共 4 页
字号:
.file "tgamma.s"// Copyright (c) 2001 - 2005, Intel Corporation// All rights reserved.//// Contributed 2001 by the Intel Numerics Group, Intel Corporation//// Redistribution and use in source and binary forms, with or without// modification, are permitted provided that the following conditions are// met://// * Redistributions of source code must retain the above copyright// notice, this list of conditions and the following disclaimer.//// * Redistributions in binary form must reproduce the above copyright// notice, this list of conditions and the following disclaimer in the// documentation and/or other materials provided with the distribution.//// * The name of Intel Corporation may not be used to endorse or promote// products derived from this software without specific prior written// permission.// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,INCLUDING,BUT NOT // LIMITED TO,THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS // CONTRIBUTORS BE LIABLE FOR ANY DIRECT,INDIRECT,INCIDENTAL,SPECIAL,// EXEMPLARY,OR CONSEQUENTIAL DAMAGES (INCLUDING,BUT NOT LIMITED TO,// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,DATA,OR // PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY // OF LIABILITY,WHETHER IN CONTRACT,STRICT LIABILITY OR TORT (INCLUDING// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // SOFTWARE,EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // // Intel Corporation is the author of this code,and requests that all// problem reports or change requests be submitted to it directly at // http://www.intel.com/software/products/opensource/libraries/num.htm.////*********************************************************************//// History: // 10/12/01  Initial version// 05/20/02  Cleaned up namespace and sf0 syntax// 02/10/03  Reordered header: .section, .global, .proc, .align// 04/04/03  Changed error codes for overflow and negative integers// 04/10/03  Changed code for overflow near zero handling// 03/31/05  Reformatted delimiters between data tables////*********************************************************************////*********************************************************************//// Function: tgamma(x) computes the principle value of the GAMMA// function of x.////*********************************************************************//// Resources Used:////    Floating-Point Registers: f8-f15//                              f33-f87////    General Purpose Registers://      r8-r11//      r14-r28//      r32-r36//      r37-r40 (Used to pass arguments to error handling routine)////    Predicate Registers:      p6-p15////*********************************************************************//// IEEE Special Conditions:////    tgamma(+inf) = +inf//    tgamma(-inf) = QNaN //    tgamma(+/-0) = +/-inf //    tgamma(x<0, x - integer) = QNaN//    tgamma(SNaN) = QNaN//    tgamma(QNaN) = QNaN////*********************************************************************//// Overview//// The method consists of three cases.// // If       2 <= x < OVERFLOW_BOUNDARY      use case tgamma_regular;// else if    0 < x < 2                     use case tgamma_from_0_to_2;// else    if  -(i+1) <  x < -i, i = 0...184 use case tgamma_negatives;//// Case 2 <= x < OVERFLOW_BOUNDARY// -------------------------------//   Here we use algorithm based on the recursive formula//   GAMMA(x+1) = x*GAMMA(x). For that we subdivide interval//   [2; OVERFLOW_BOUNDARY] into intervals [16*n; 16*(n+1)] and//   approximate GAMMA(x) by polynomial of 22th degree on each//   [16*n; 16*n+1], recursive formula is used to expand GAMMA(x)//   to [16*n; 16*n+1]. In other words we need to find n, i and r//   such that x = 16 * n + i + r where n and i are integer numbers//   and r is fractional part of x. So GAMMA(x) = GAMMA(16*n+i+r) =//   = (x-1)*(x-2)*...*(x-i)*GAMMA(x-i) =//   = (x-1)*(x-2)*...*(x-i)*GAMMA(16*n+r) ~//   ~ (x-1)*(x-2)*...*(x-i)*P22n(r).////   Step 1: Reduction//   -----------------//    N = [x] with truncate//    r = x - N, note 0 <= r < 1////    n = N & ~0xF - index of table that contains coefficient of//                   polynomial approximation //    i = N & 0xF  - is used in recursive formula//   ////   Step 2: Approximation//   ---------------------//    We use factorized minimax approximation polynomials//    P22n(r) = A22*(r^2+C01(n)*R+C00(n))*//              *(r^2+C11(n)*R+C10(n))*...*(r^2+CA1(n)*R+CA0(n))////   Step 3: Recursion//   -----------------//    In case when i > 0 we need to multiply P22n(r) by product//    R(i)=(x-1)*(x-2)*...*(x-i). To reduce number of fp-instructions//    we can calculate R as follow:  //    R(i) = ((x-1)*(x-2))*((x-3)*(x-4))*...*((x-(i-1))*(x-i)) if i is//    even or R = ((x-1)*(x-2))*((x-3)*(x-4))*...*((x-(i-2))*(x-(i-1)))*//    *(i-1) if i is odd. In both cases we need to calculate//    R2(i) = (x^2-3*x+2)*(x^2-7*x+12)*...*(x^2+x+2*j*(2*j-1)) =//    = (x^2-3*x+2)*(x^2-7*x+12)*...*((x^2+x)+2*j*(2*(j-1)+(1-2*x))) =//    = (RA+2*(2-RB))*(RA+4*(4-RB))*...*(RA+2*j*(2*(j-1)+RB))//    where j = 1..[i/2], RA = x^2+x, RB = 1-2*x.////   Step 4: Reconstruction//   ----------------------//    Reconstruction is just simple multiplication i.e.//    GAMMA(x) = P22n(r)*R(i)//// Case 0 < x < 2// --------------//    To calculate GAMMA(x) on this interval we do following//        if 1 <= x < 1.25   than  GAMMA(x) = P15(x-1)//        if 1.25 <= x < 1.5 than  GAMMA(x) = P15(x-x_min) where//        x_min is point of local minimum on [1; 2] interval.//        if 1.5  <= x < 2.0 than  GAMMA(x) = P15(x-1.5)//    and      //        if 0 < x < 1 than GAMMA(x) = GAMMA(x+1)/x//// Case -(i+1) <  x < -i, i = 0...184// ----------------------------------//    Here we use the fact that GAMMA(-x) = PI/(x*GAMMA(x)*sin(PI*x)) and//    so we need to calculate GAMMA(x), sin(PI*x)/PI. Calculation of//    GAMMA(x) is described above.////   Step 1: Reduction//   -----------------//    Note that period of sin(PI*x) is 2 and range reduction for //    sin(PI*x) is like to range reduction for GAMMA(x) //    i.e r = x - [x] with exception of cases//    when r > 0.5 (in such cases r = 1 - (x - [x])).////   Step 2: Approximation//   ---------------------//    To approximate sin(PI*x)/PI = sin(PI*(2*n+r))/PI = //    = (-1)^n*sin(PI*r)/PI Taylor series is used.//    sin(PI*r)/PI ~ S21(r).////   Step 3: Division//   ----------------//    To calculate 1/(x*GAMMA(x)*S21(r)) we use frcpa instruction//    with following Newton-Raphson interations.//  ////*********************************************************************GR_Sig                  = r8GR_TAG                  = r8GR_ad_Data              = r9GR_SigRqLin             = r10GR_iSig                 = r11GR_ExpOf1               = r11GR_ExpOf8               = r11GR_Sig2                 = r14GR_Addr_Mask1           = r15GR_Sign_Exp             = r16GR_Tbl_Offs             = r17GR_Addr_Mask2           = r18GR_ad_Co                = r19GR_Bit2                 = r19GR_ad_Ce                = r20GR_ad_Co7               = r21GR_NzOvfBound           = r21GR_ad_Ce7               = r22GR_Tbl_Ind              = r23GR_Tbl_16xInd           = r24GR_ExpOf025             = r24GR_ExpOf05              = r25GR_0x30033              = r26GR_10                   = r26GR_12                   = r27GR_185                  = r27GR_14                   = r28GR_2                    = r28GR_fpsr                 = r28GR_SAVE_B0              = r33GR_SAVE_PFS             = r34GR_SAVE_GP              = r35GR_SAVE_SP              = r36GR_Parameter_X          = r37GR_Parameter_Y          = r38GR_Parameter_RESULT     = r39GR_Parameter_TAG        = r40FR_X                    = f10FR_Y                    = f1 // tgamma is single argument functionFR_RESULT               = f8FR_AbsX                 = f9FR_NormX                = f9FR_r02                  = f11FR_AbsXp1               = f12FR_X2pX                 = f13FR_1m2X                 = f14FR_Rq1                  = f14FR_Xt                   = f15FR_r                    = f33FR_OvfBound             = f34FR_Xmin                 = f35FR_2                    = f36FR_Rcp1                 = f36FR_Rcp3                 = f36FR_4                    = f37FR_5                    = f38FR_6                    = f39FR_8                    = f40FR_10                   = f41FR_12                   = f42FR_14                   = f43FR_GAMMA                = f43FR_05                   = f44FR_Rq2                  = f45FR_Rq3                  = f46FR_Rq4                  = f47FR_Rq5                  = f48FR_Rq6                  = f49FR_Rq7                  = f50FR_RqLin                = f51FR_InvAn                = f52FR_C01                  = f53FR_A15                  = f53FR_C11                  = f54FR_A14                  = f54FR_C21                  = f55FR_A13                  = f55FR_C31                  = f56FR_A12                  = f56FR_C41                  = f57FR_A11                  = f57FR_C51                  = f58FR_A10                  = f58FR_C61                  = f59FR_A9                   = f59FR_C71                  = f60FR_A8                   = f60FR_C81                  = f61FR_A7                   = f61FR_C91                  = f62FR_A6                   = f62FR_CA1                  = f63FR_A5                   = f63FR_C00                  = f64FR_A4                   = f64FR_rs2                  = f64FR_C10                  = f65FR_A3                   = f65FR_rs3                  = f65FR_C20                  = f66FR_A2                   = f66FR_rs4                  = f66FR_C30                  = f67FR_A1                   = f67FR_rs7                  = f67FR_C40                  = f68FR_A0                   = f68FR_rs8                  = f68FR_C50                  = f69FR_r2                   = f69FR_C60                  = f70FR_r3                   = f70FR_C70                  = f71FR_r4                   = f71FR_C80                  = f72FR_r7                   = f72FR_C90                  = f73FR_r8                   = f73FR_CA0                  = f74FR_An                   = f75FR_S21                  = f76FR_S19                  = f77FR_Rcp0                 = f77FR_Rcp2                 = f77FR_S17                  = f78FR_S15                  = f79FR_S13                  = f80FR_S11                  = f81FR_S9                   = f82FR_S7                   = f83FR_S5                   = f84FR_S3                   = f85FR_iXt                  = f86FR_rs                   = f87// Data tables//==============================================================RODATA.align 16LOCAL_OBJECT_START(tgamma_data)data8 0x406573FAE561F648 // overflow boundary (171.624376956302739927196)data8 0x3FDD8B618D5AF8FE // point of local minium (0.461632144968362356785)////[2; 3]data8 0xEF0E85C9AE40ABE2,0x00004000 // C01data8 0xCA2049DDB4096DD8,0x00004000 // C11data8 0x99A203B4DC2D1A8C,0x00004000 // C21data8 0xBF5D9D9C0C295570,0x00003FFF // C31data8 0xE8DD037DEB833BAB,0x00003FFD // C41data8 0xB6AE39A2A36AA03A,0x0000BFFE // C51data8 0x804960DC2850277B,0x0000C000 // C61data8 0xD9F3973841C09F80,0x0000C000 // C71data8 0x9C198A676F8A2239,0x0000C001 // C81data8 0xC98B7DAE02BE3226,0x0000C001 // C91data8 0xE9CAF31AC69301BA,0x0000C001 // CA1data8 0xFBBDD58608A0D172,0x00004000 // C00data8 0xFDD0316D1E078301,0x00004000 // C10data8 0x8630B760468C15E4,0x00004001 // C20data8 0x93EDE20E47D9152E,0x00004001 // C30data8 0xA86F3A38C77D6B19,0x00004001 // C40//[16; 17]data8 0xF87F757F365EE813,0x00004000 // C01data8 0xECA84FBA92759DA4,0x00004000 // C11data8 0xD4E0A55E07A8E913,0x00004000 // C21data8 0xB0EB45E94C8A5F7B,0x00004000 // C31data8 0x8050D6B4F7C8617D,0x00004000 // C41data8 0x8471B111AA691E5A,0x00003FFF // C51data8 0xADAF462AF96585C9,0x0000BFFC // C61data8 0xD327C7A587A8C32B,0x0000BFFF // C71data8 0xDEF5192B4CF5E0F1,0x0000C000 // C81data8 0xBADD64BB205AEF02,0x0000C001 // C91data8 0x9330A24AA67D6860,0x0000C002 // CA1data8 0xF57EEAF36D8C47BE,0x00004000 // C00data8 0x807092E12A251B38,0x00004001 // C10data8 0x8C458F80DEE7ED1C,0x00004001 // C20data8 0x9F30C731DC77F1A6,0x00004001 // C30data8 0xBAC4E7E099C3A373,0x00004001 // C40//[32; 33]data8 0xC3059A415F142DEF,0x00004000 // C01data8 0xB9C1DAC24664587A,0x00004000 // C11data8 0xA7101D910992FFB2,0x00004000 // C21data8 0x8A9522B8E4AA0AB4,0x00004000 // C31data8 0xC76A271E4BA95DCC,0x00003FFF // C41data8 0xC5D6DE2A38DB7FF2,0x00003FFE // C51data8 0xDBA42086997818B2,0x0000BFFC // C61data8 0xB8EDDB1424C1C996,0x0000BFFF // C71data8 0xBF7372FB45524B5D,0x0000C000 // C81data8 0xA03DDE759131580A,0x0000C001 // C91data8 0xFDA6FC4022C1FFE3,0x0000C001 // CA1data8 0x9759ABF797B2533D,0x00004000 // C00data8 0x9FA160C6CF18CEC5,0x00004000 // C10data8 0xB0EFF1E3530E0FCD,0x00004000 // C20data8 0xCCD60D5C470165D1,0x00004000 // C30data8 0xF5E53F6307B0B1C1,0x00004000 // C40//[48; 49]data8 0xAABE577FBCE37F5E,0x00004000 // C01data8 0xA274CAEEB5DF7172,0x00004000 // C11data8 0x91B90B6646C1B924,0x00004000 // C21data8 0xF06718519CA256D9,0x00003FFF // C31data8 0xAA9EE181C0E30263,0x00003FFF // C41data8 0xA07BDB5325CB28D2,0x00003FFE // C51data8 0x86C8B873204F9219,0x0000BFFD // C61data8 0xB0192C5D3E4787D6,0x0000BFFF // C71data8 0xB1E0A6263D4C19EF,0x0000C000 // C81data8 0x93BA32A118EAC9AE,0x0000C001 // C91data8 0xE942A39CD9BEE887,0x0000C001 // CA1data8 0xE838B0957B0D3D0D,0x00003FFF // C00data8 0xF60E0F00074FCF34,0x00003FFF // C10data8 0x89869936AE00C2A5,0x00004000 // C20data8 0xA0FE4E8AA611207F,0x00004000 // C30data8 0xC3B1229CFF1DDAFE,0x00004000 // C40//[64; 65]data8 0x9C00DDF75CDC6183,0x00004000 // C01data8 0x9446AE9C0F6A833E,0x00004000 // C11data8 0x84ABC5083310B774,0x00004000 // C21data8 0xD9BA3A0977B1ED83,0x00003FFF // C31data8 0x989B18C99411D300,0x00003FFF // C41data8 0x886E66402318CE6F,0x00003FFE // C51data8 0x99028C2468F18F38,0x0000BFFD // C61data8 0xAB72D17DCD40CCE1,0x0000BFFF // C71data8 0xA9D9AC9BE42C2EF9,0x0000C000 // C81data8 0x8C11D983AA177AD2,0x0000C001 // C91data8 0xDC779E981C1F0F06,0x0000C001 // CA1data8 0xC1FD4AC85965E8D6,0x00003FFF // C00data8 0xCE3D2D909D389EC2,0x00003FFF // C10data8 0xE7F79980AD06F5D8,0x00003FFF // C20data8 0x88DD9F73C8680B5D,0x00004000 // C30data8 0xA7D6CB2CB2D46F9D,0x00004000 // C40//[80; 81]data8 0x91C7FF4E993430D0,0x00004000 // C01data8 0x8A6E7AB83E45A7E9,0x00004000 // C11data8 0xF72D6382E427BEA9,0x00003FFF // C21data8 0xC9E2E4F9B3B23ED6,0x00003FFF // C31data8 0x8BEFEF56AE05D775,0x00003FFF // C41data8 0xEE9666AB6A185560,0x00003FFD // C51data8 0xA6AFAF5CEFAEE04D,0x0000BFFD // C61data8 0xA877EAFEF1F9C880,0x0000BFFF // C71data8 0xA45BD433048ECA15,0x0000C000 // C81data8 0x86BD1636B774CC2E,0x0000C001 // C91data8 0xD3721BE006E10823,0x0000C001 // CA1data8 0xA97EFABA91854208,0x00003FFF // C00data8 0xB4AF0AEBB3F97737,0x00003FFF // C10data8 0xCC38241936851B0B,0x00003FFF // C20data8 0xF282A6261006EA84,0x00003FFF // C30data8 0x95B8E9DB1BD45BAF,0x00004000 // C40//[96; 97]data8 0x8A1FA3171B35A106,0x00004000 // C01data8 0x830D5B8843890F21,0x00004000 // C11data8 0xE98B0F1616677A23,0x00003FFF // C21data8 0xBDF8347F5F67D4EC,0x00003FFF // C31data8 0x825F15DE34EC055D,0x00003FFF // C41data8 0xD4846186B8AAC7BE,0x00003FFD // C51data8 0xB161093AB14919B1,0x0000BFFD // C61data8 0xA65758EEA4800EF4,0x0000BFFF // C71data8 0xA046B67536FA329C,0x0000C000 // C81data8 0x82BBEC1BCB9E9068,0x0000C001 // C91data8 0xCC9DE2B23BA91B0B,0x0000C001 // CA1data8 0x983B16148AF77F94,0x00003FFF // C00data8 0xA2A4D8EE90FEE5DD,0x00003FFF // C10data8 0xB89446FA37FF481C,0x00003FFF // C20data8 0xDC5572648485FB01,0x00003FFF // C30data8 0x88CD5D7DB976129A,0x00004000 // C40//[112; 113]data8 0x8417098FD62AC5E3,0x00004000 // C01data8 0xFA7896486B779CBB,0x00003FFF // C11data8 0xDEC98B14AF5EEBD1,0x00003FFF // C21data8 0xB48E153C6BF0B5A3,0x00003FFF // C31

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -