📄 s_tanh.s

📁 glibc 2.9, the latest version of the C library functions
📖 Page 1 of 3
.file "tanh.s"

// Copyright (c) 2001 - 2005, Intel Corporation
// All rights reserved.
//
// Contributed 2001 by the Intel Numerics Group, Intel Corporation
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote
// products derived from this software without specific prior written
// permission.

// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
// OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Intel Corporation is the author of this code, and requests that all
// problem reports or change requests be submitted to it directly at
// http://www.intel.com/software/products/opensource/libraries/num.htm.
//
// History
//==============================================================================
// 05/30/01  Initial version
// 12/04/01  Rewritten version with erf-like algorithm.
//           Performance improved.
// 05/20/02  Cleaned up namespace and sf0 syntax
// 08/14/02  Changed mli templates to mlx
// 02/10/03  Reordered header: .section, .global, .proc, .align
// 03/31/05  Reformatted delimiters between data tables
//
// API
//==============================================================================
// double tanh(double)
//
// Overview of operation
//==============================================================================
//
// Algorithm description
// ---------------------
//
// There are 4 paths:
//
// 1. Special path: x = 0, Inf, NaNs, denormals
//    Return tanh(x) = +/-0.0 for zeros
//    Return tanh(x) = QNaN for NaNs
//    Return tanh(x) = sign(x)*1.0 for Inf
//    Return tanh(x) = x + x^2   for - denormals
//    Return tanh(x) = x - x^2   for + denormals
//
// 2. Near zero path: 0.0 < |x| < 0.25
//    Return tanh(x) = x + x^3*A3 + ... + x^19*A19
//
// 3. Main path: 0.25 <= |x| < 19.0625
//    For several ranges of 0.25 <= |x| < 19.0625
//    Return tanh(x) = sign(x)*(A0 + y*A1 + y^2*A2 +
//                                       + y^3*A3 + ... + y^19*A19)
//    where y = (|x|/a) - b
//
//    For each range there is particular set of coefficients.
//    Below is the list of ranges:
//    1/4  <= |x| < 1/2     a = 0.25, b = 1.0
//    1/2  <= |x| < 1.0     a = 0.5,  b = 1.0
//    1.0  <= |x| < 2.0     a = 1.0,  b = 1.0
//    2.0  <= |x| < 3.25    a = 2.0,  b = 1.0
//    3.25 <= |x| < 4.0     a = 2.0,  b = 2.0
//    4.0  <= |x| < 6.5     a = 4.0,  b = 1.0
//    6.5  <= |x| < 8.0     a = 4.0,  b = 2.0
//    8.0  <= |x| < 13.0    a = 8.0,  b = 1.0
//    13.0 <= |x| < 16.0    a = 8.0,  b = 2.0
//    16.0 <= |x| < 19.0625 a = 16.0, b = 1.0
//    ( [3.25;4.0], [6.5;8.0], [13.0;16.0] subranges separated
//                               for monotonicity issues resolve )
//
// 4. Saturation path: 19.0625 <= |x| < +INF
//    Return tanh(x) = sign(x)*(1.0 - tiny_value)
//    (tiny_value ~ 2^(-63))
//
// Registers used
//==============================================================================
// Floating Point registers used:
// f8 = input, output
// f32 -> f64
//
// General registers used:
// r32 -> r51, r2, r3
//
// Predicate registers used:
// p6, p8, p10, p11, p12, p14, p15
// p6           arg is zero, denormal or special IEEE
// p8           to filter out case when signd(x) > 1.625
// p10          to filter out case when |x| < 0.25
// p11          to filter out case when signd(x) <= 1.625
// p12          to filter out case when |x| >= 19.0625
// p14          set to 1 for positive x
// p15          set to 1 for negative x

// Assembly macros
//==============================================================================
rDataPtr           = r2
rDataPtr1          = r3
rBias              = r33
rCoeffAddr3        = r34
rThreeAndQ         = r35
rCoeffAddr2        = r36
rMask              = r37
rArg               = r38
rSignBit           = r39
rAbsArg            = r40
rSaturation        = r41
rIndex             = r42
rCoeffAddr1        = r43
rCoeffAddr4        = r44
rShiftedArg        = r45
rShiftedArgMasked  = r46
rBiasedExpOf4      = r47
rShiftedAbsArg     = r48
rArgSgnd           = r49
r1625Sgnd          = r50
rTwo               = r51

//==============================================================================
fA0                = f32
fA1                = f33
fA2                = f34
fA3                = f35
fA4                = f36
fA5                = f37
fA6                = f38
fA7                = f39
fA8                = f40
fA9                = f41
fA10               = f42
fA11               = f43
fA12               = f44
fA13               = f45
fA14               = f46
fA15               = f47
fA16               = f48
fA17               = f49
fA18               = f50
fA19               = f51
fArgSqr            = f52
fArgAbsNorm        = f53
fSignumX           = f54
fRes               = f55
fThreeAndQ         = f56
fArgAbs            = f57
fTSqr              = f58
fTQuadr            = f59
fTDeg3             = f60
fTDeg7             = f61
fArgAbsNormSgn     = f62
fTQuadrSgn         = f63
fTwo               = f64

// Data tables
//==============================================================================
RODATA

.align 16

LOCAL_OBJECT_START(tanh_data)

// CAUTION: The order of these table coefficients shouldn't be changed!

// Main path coefficients:
// Coefficients ##0..15 ("main" coefficient tables)
// Polynomial coefficients for the tanh(x), 0.25 <= |x| < 0.5
data8 0xE9D218BC9A3FB55A, 0x00003FC7 //A19
data8 0xC8C0D38687F36EBA, 0x00003FCE //A18
data8 0xA2663E519FAC8A43, 0x0000BFD2 //A17
data8 0xD913F0490674B0DF, 0x00003FD3 //A16
data8 0xF75D84789DE0AE52, 0x00003FD6 //A15
data8 0xACB3C40EEF3A06F0, 0x0000BFD9 //A14
data8 0xEBD7F5DC02CFD5BA, 0x0000BFDB //A13
data8 0x8B52CDF66D709E2A, 0x00003FDF //A12
data8 0x9EC21F28E05C4A3E, 0x00003FE0 //A11
data8 0xC412B44D0176F3ED, 0x0000BFE4 //A10
data8 0x97BF35A34DD1EA4C, 0x0000BFE0 //A9
data8 0xF89F5B39E3A3AA36, 0x00003FE9 //A8
data8 0xF2BA654BCEEBA433, 0x0000BFEA //A7
data8 0x8E1C15876AA589AD, 0x0000BFEF //A6
data8 0x942226246A8C2A86, 0x00003FF1 //A5
data8 0x8F06D9FF7DB47261, 0x00003FF4 //A4
//
// Polynomial coefficients for the tanh(x), 0.5 <= |x| < 1.0
data8 0xC4A7B8FB672A8520, 0x00003FDC //A19
data8 0xA20724B847E13499, 0x0000BFE0 //A18
data8 0xE17DB53F02E4D340, 0x00003FE2 //A17
data8 0x90264A1012F4CA6F, 0x0000BFE4 //A16
data8 0xEBEC9F776F0BF415, 0x0000BFE0 //A15
data8 0x89AF912B305B45A4, 0x00003FE7 //A14
data8 0xB4A960B81F5EC36A, 0x0000BFE7 //A13
data8 0x969A4E95B2DA86B5, 0x0000BFEA //A12
data8 0x8A3FC0EC082305CB, 0x00003FEC //A11
data8 0x83D7795BCBE24373, 0x00003FEC //A10
data8 0xDCBF42AEB82932EC, 0x0000BFEF //A9
data8 0x83318E61ECAFD804, 0x00003FF0 //A8
data8 0xEA4DE5746975A914, 0x00003FF2 //A7
data8 0xCE63E8FA6B96480B, 0x0000BFF4 //A6
data8 0xDF017BE0D4FE45D8, 0x0000BFF4 //A5
data8 0xA8A0C6E2226DF3CD, 0x00003FF8 //A4
//
// Polynomial coefficients for the tanh(x), 1.0 <= |x| < 2.0
data8 0x8E89D2EBFDAA160B, 0x00003FE9 //A19
data8 0xDD9226310A272046, 0x0000BFEC //A18
data8 0xA038042D28B0D665, 0x00003FEF //A17
data8 0x8C04796F03516306, 0x0000BFF1 //A16
data8 0x9CD6A9CB4E90A2FD, 0x00003FF2 //A15
data8 0xC8980E166F5A84FD, 0x0000BFF2 //A14
data8 0x9ADFE65F56B7BCFD, 0x00003FED //A13
data8 0x8B11FDFB5D0A7B96, 0x00003FF4 //A12
data8 0x8209A125E829CBFA, 0x0000BFF5 //A11
data8 0xCF38AAC17B85BD76, 0x00003FF1 //A10
data8 0xD5C2E248D8AB99AB, 0x00003FF6 //A9
data8 0xE12BE2785727F2D6, 0x0000BFF7 //A8
data8 0x9FC9EF90F87BF1E2, 0x00003FF6 //A7
data8 0x9B02FE0DAF42C08F, 0x00003FF9 //A6
data8 0xBDACE06F531D9491, 0x0000BFFA //A5
data8 0xE3048AD1DB2F648C, 0x00003FF9 //A4
//
// Polynomial coefficients for the tanh(x), 2.0 <= |x| < 3.25
data8 0x856EC3B0330A385A, 0x00003FEB //A19
data8 0xC641D69DAE2D429C, 0x0000BFF2 //A18
data8 0xC683EB0BE1343FFF, 0x00003FF5 //A17
data8 0xC358954224E4E823, 0x0000BFF7 //A16
data8 0xF813A8D6D396BC5F, 0x00003FF8 //A15
data8 0xE0ECDFED078D37D6, 0x0000BFF9 //A14
data8 0x950E4E619855E316, 0x00003FFA //A13
data8 0x8453B8F93370FB58, 0x0000BFFA //A12
data8 0xFDBA28430AEC95BA, 0x00003FF7 //A11
data8 0x9371AAC1FDB1E664, 0x00003FFA //A10
data8 0xAC972DA97782D88A, 0x0000BFFB //A9
data8 0xE18F47B10B9CE1BC, 0x00003FFB //A8
data8 0xAB7C81230BF13BC6, 0x0000BFFB //A7
data8 0xA6CAAD4A3E31A7D5, 0x0000BFF8 //A6
data8 0x9CABD76D1D5C3878, 0x00003FFC //A5
data8 0x92906D077941CAA9, 0x0000BFFD //A4
//
// Polynomial coefficients for the tanh(x), 4.0 <= |x| < 6.5
data8 0x9232D19F71709AC9, 0x0000BFF5 //A19
data8 0x819E31323F5DD3F8, 0x00003FF8 //A18
data8 0xDA8E1CDB8D23DC29, 0x0000BFF9 //A17
data8 0xE97C7CD8FC0486D8, 0x00003FFA //A16
data8 0xB0C4AD234D88C9F2, 0x0000BFFB //A15
data8 0xC5989BFB28FDE267, 0x00003FFB //A14
data8 0x9B26520EC4EFEE8E, 0x0000BFFB //A13
data8 0xC4B6F758AD21E574, 0x00003FF9 //A12
data8 0xCC36E3FFA10D2CFF, 0x00003FFA //A11
data8 0x8738696FB06A5CED, 0x0000BFFC //A10
data8 0xD31981825BF39228, 0x00003FFC //A9
data8 0x82C58FB9BEE43992, 0x0000BFFD //A8
data8 0x88D5AAE49164B6F3, 0x00003FFD //A7
data8 0xF4CA0B968AF2DDE2, 0x0000BFFC //A6
data8 0xB99874B482BD17EE, 0x00003FFC //A5
data8 0xE93FB2F99431DC1D, 0x0000BFFB //A4
//
// Polynomial coefficients for the tanh(x), 8.0 <= |x| < 13.0
data8 0xAAA9EB7EADA85CEC, 0x00003FF5 //A19
data8 0x980C80EE05A6BE78, 0x0000BFF8 //A18
data8 0x818DA9F5396390A5, 0x00003FFA //A17
data8 0x8D8CC21E23D8A6A2, 0x0000BFFB //A16
data8 0xE0EC19E55A886765, 0x00003FFB //A15
data8 0x8C11197A7E6244C5, 0x0000BFFC //A14
data8 0x901D2BF203C2F7F3, 0x00003FFC //A13
data8 0xFEACAEE66EE803E5, 0x0000BFFB //A12
data8 0xC684E4925E318C3F, 0x00003FFB //A11
data8 0x8A9D8A970565F28D, 0x0000BFFB //A10
data8 0xAE34C61DE5CEA4D4, 0x00003FFA //A9
data8 0xC44C5714BD6208A0, 0x0000BFF9 //A8
data8 0xC4612F7D6C8BDB79, 0x00003FF8 //A7
data8 0xABD91DCE40D5EECB, 0x0000BFF7 //A6
data8 0x80E375C1B847B72F, 0x00003FF6 //A5
data8 0xA11C7DD978CF700A, 0x0000BFF4 //A4
//
// Polynomial coefficients for the tanh(x), 16.0 <= |x| < 19.0625
data8 0xE29D17C510F86F6B, 0x00003FF3 //A19
data8 0x88FE52EB39A3A98C, 0x0000BFF5 //A18
data8 0xA406547E50360693, 0x00003FF5 //A17
data8 0x83E6260B71C6D7DE, 0x0000BFF5 //A16
data8 0xA36AB5B0CBC97B85, 0x00003FF4 //A15
data8 0xA94931E0B7BA6C14, 0x0000BFF3 //A14
data8 0x9A4596DAF350AD63, 0x00003FF2 //A13
data8 0xFE47643F375AECA5, 0x0000BFF0 //A12
data8 0xBF8433C5ABEE63B1, 0x00003FEF //A11
data8 0x83CEE05D7AE90A0A, 0x0000BFEE //A10
data8 0xA4CC45480BCEB02D, 0x00003FEC //A9
data8 0xB967CBDCBC16CB10, 0x0000BFEA //A8
data8 0xB9681B214EDC098D, 0x00003FE8 //A7
data8 0xA23B20D87B80DFA8, 0x0000BFE6 //A6
data8 0xF358B2C46F10CBAF, 0x00003FE3 //A5
data8 0x98176FD06229A385, 0x0000BFE1 //A4
//
// Binary subranges
// Polynomial coefficients for the tanh(x), 3.25 <= |x| < 4.0
data8 0xEF2EE841288F6706, 0x00003FE9 //A19
data8 0xE65D5B74B85F82A6, 0x00003FEB //A18
data8 0xE495FC21E42A79FF, 0x00003FEA //A17
data8 0xF99B267A913CF3E5, 0x00003FEC //A16
data8 0xFE3D700F4A0A0FDE, 0x0000BFEC //A15
data8 0x8F91BB4EE4E4EA52, 0x00003FEE //A14
data8 0xBCA9F41A5C6EF8BA, 0x0000BFEE //A13
data8 0xF93E00884027A9CF, 0x00003FED //A12
data8 0xC4D4036A61BABC2F, 0x00003FEF //A11
data8 0x86CC2AD1AD47C7D5, 0x0000BFF2 //A10
data8 0xD3065DEF4CE9AD32, 0x00003FF3 //A9
data8 0x82C44125F568D54E, 0x0000BFF5 //A8
data8 0x88D588729BAF14CA, 0x00003FF6 //A7
data8 0xF4CA0661307243C7, 0x0000BFF6 //A6
data8 0xB998746D57061F74, 0x00003FF7 //A5
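
The four paths and the range list in the header comment can be read as a small dispatch table. The C sketch below is only an illustration of that description, not a translation of the assembly: the names `tanh_range`, `tanh_path` and `tanh_classify` are hypothetical, and it uses a plain linear search where the real code selects the coefficient set with integer and predicate logic. For an argument on the main path it also computes the reduced argument y = (|x|/a) - b.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* One row of the range list in the header comment: lo <= |x| < hi. */
struct tanh_range { double lo, hi, a, b; };

static const struct tanh_range tanh_ranges[] = {
    {  0.25,  0.5,      0.25, 1.0 },
    {  0.5,   1.0,      0.5,  1.0 },
    {  1.0,   2.0,      1.0,  1.0 },
    {  2.0,   3.25,     2.0,  1.0 },
    {  3.25,  4.0,      2.0,  2.0 },
    {  4.0,   6.5,      4.0,  1.0 },
    {  6.5,   8.0,      4.0,  2.0 },
    {  8.0,  13.0,      8.0,  1.0 },
    { 13.0,  16.0,      8.0,  2.0 },
    { 16.0,  19.0625,  16.0,  1.0 },
};

/* Hypothetical labels for the four paths listed in the comment. */
enum tanh_path { TANH_SPECIAL, TANH_NEAR_ZERO, TANH_MAIN, TANH_SATURATION };

/* Classify x; on the main path, store y = (|x|/a) - b in *y. */
enum tanh_path tanh_classify(double x, double *y)
{
    assert(y != NULL);

    /* Path 1: zeros, infinities, NaNs and denormals. */
    switch (fpclassify(x)) {
    case FP_ZERO: case FP_INFINITE: case FP_NAN: case FP_SUBNORMAL:
        return TANH_SPECIAL;
    default:
        break;
    }

    double ax = fabs(x);
    if (ax < 0.25)
        return TANH_NEAR_ZERO;      /* path 2 */
    if (ax >= 19.0625)
        return TANH_SATURATION;     /* path 4 */

    /* Path 3: find the range that contains |x| and reduce the argument. */
    for (size_t i = 0; i < sizeof tanh_ranges / sizeof tanh_ranges[0]; i++) {
        if (ax >= tanh_ranges[i].lo && ax < tanh_ranges[i].hi) {
            *y = ax / tanh_ranges[i].a - tanh_ranges[i].b;
            return TANH_MAIN;
        }
    }
    return TANH_SATURATION;         /* not reached: the ranges cover [0.25, 19.0625) */
}
```

With this table, y lies in [0, 1) for the rows with b = 1.0 and in [-0.375, 0) for the three rows with b = 2.0, which matches the split at 1.625 mentioned for predicates p8 and p11.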

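
Each coefficient in the tables above is stored as a pair of `data8` words. The loads that consume them are not on this page, so the following is an assumption: the pair looks like an 80-bit double-extended value, with a 64-bit significand (explicit integer bit in bit 63) followed by a word whose low 16 bits hold the sign and a 15-bit exponent biased by 16383. Under that assumption, a hypothetical helper `decode_ext80` can convert a pair to an approximate `double` for inspection; the result is rounded to 53 bits, so it is not the exact table value.

```c
#include <math.h>
#include <stdint.h>

/* Assumed layout: value = (-1)^sign * significand * 2^(exponent - 16383 - 63). */
double decode_ext80(uint64_t significand, uint64_t sign_exp)
{
    int sign = (int)((sign_exp >> 15) & 1);          /* bit 15 of the second word */
    int exponent = (int)(sign_exp & 0x7FFF) - 16383; /* remove the assumed bias */

    /* The cast rounds the 64-bit significand to double precision. */
    double magnitude = ldexp((double)significand, exponent - 63);
    return sign ? -magnitude : magnitude;
}
```

For example, `decode_ext80(0x8F06D9FF7DB47261ULL, 0x00003FF4)` decodes the A4 entry of the first table, and an exponent word beginning with `0xBF` instead of `0x3F` yields a negative coefficient.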