e_powf.s
.file "powf.s"

// Copyright (C) 2000, 2001, Intel Corporation
// All rights reserved.
//
// Contributed 2/2/2000 by John Harrison, Ted Kubaska, Bob Norin, Shane Story,
// and Ping Tak Peter Tang of the Computational Software Lab, Intel Corporation.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote
// products derived from this software without specific prior written
// permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
// OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Intel Corporation is the author of this code, and requests that all
// problem reports or change requests be submitted to it directly at
// http://developer.intel.com/opensource.
//
// History
//==============================================================
// 2/02/00  Initial version
// 2/03/00  Added p12 to definite over/under path.
//          With odd power we did not maintain the sign of x in this path.
// 4/04/00  Unwind support added
// 4/19/00  pow(+-1,inf) now returns NaN
//          pow(+-val, +-inf) returns 0 or inf, but now does not call error support
//          Added s1 to fcvt.fx because invalid flag was incorrectly set.
// 8/15/00  Bundle added after call to __libm_error_support to properly
//          set [the previously overwritten] GR_Parameter_RESULT.
// 9/07/00  Improved performance by eliminating bank conflicts and other stalls,
//          and tweaking the critical path
// 9/08/00  Per c99, pow(+-1,inf) now returns 1, and pow(+1,nan) returns 1
// 9/28/00  Updated NaN**0 path
// 1/20/01  Fixed denormal flag settings.
// 2/12/01  Improved speed.
//
// API
//==============================================================
// double pow(double, double)
// float  powf(float, float)
//
// Overview of operation
//==============================================================
//
// Three steps...
// 1. Log(x)
// 2. y Log(x)
// 3. exp(y log(x))
//
// This means we work with the absolute value of x and merge in the sign later.
//      Log(x) = G + delta + r - rsq/2 + p
// G and delta depend on the exponent of x and table entries. The table entries are
// indexed by the exponent of x, called K.
//
// The G and delta come out of the reduction; r is the reduced x.
//
//      B = frcpa(x)
// xB-1 is small, which means that B is an approximate inverse of x.
//
//      Log(x) = Log( (1/B)(Bx) )
//             = Log(1/B) + Log(Bx)
//             = Log(1/B) + Log( 1 + (Bx-1))
//
//      x = 2^K 1.x_1x_2.....x_52
//      B = frcpa(x) = 2^-K Cm
//      Log(1/B) = Log(1/(2^-K Cm))
//      Log(1/B) = Log(2^K / Cm)
//      Log(1/B) = K Log(2) + Log(1/Cm)
//
//      Log(x) = K Log(2) + Log(1/Cm) + Log( 1 + (Bx-1))
//
// If you take the significand of x, set the exponent to true 0, then Cm is
// the frcpa. We tabulate the Log(1/Cm) values.
// There are 256 of them.
// The frcpa table is indexed by 8 bits, the x_1 thru x_8.
// m = x_1x_2...x_8 is an 8-bit index.
//
//      Log(1/Cm) = log(1/frcpa(1+m/256)) where m goes from 0 to 255.
//
// We tabulate as two doubles, T and t, where T + t is the value itself.
//
//      Log(x) = (K Log(2)_hi + T) + (K Log(2)_lo + t) + Log( 1 + (Bx-1))
//      Log(x) = G + delta + Log( 1 + (Bx-1))
//
// The Log( 1 + (Bx-1)) can be calculated as a series in r = Bx-1.
//
//      Log( 1 + (Bx-1)) = r - rsq/2 + p
//
// Then,
//
//      yLog(x) = yG + y delta + y(r-rsq/2) + yp
//      yLog(x) = Z1 + e3      + Z2         + Z3 + (e1 + e2)
//
//
//      exp(yLog(x)) = exp(Z1 + Z2 + Z3) exp(e1 + e2 + e3)
//
//
// exp(Z3) is another series.
// exp(e1 + e2 + e3) is approximated as f3 = 1 + (e1 + e2 + e3)
//
//      Z1 (128/log2) = number of log2/128 in Z1 is N1
//      Z2 (128/log2) = number of log2/128 in Z2 is N2
//
//      s1 = Z1 - N1 log2/128
//      s2 = Z2 - N2 log2/128
//
//      s = s1 + s2
//      N = N1 + N2
//
//      exp(Z1 + Z2) = exp(Z)
//      exp(Z) = exp(s) exp(N log2/128)
//
//      exp(r) = exp(Z - N log2/128)
//
//      r = s + d = (Z - N (log2/128)_hi) - N (log2/128)_lo
//                =  Z - N (log2/128)
//
//      Z = s + d + N (log2/128)
//
//      exp(Z) = exp(s) (1+d) exp(N log2/128)
//
//      N = M 128 + n
//
//      N log2/128 = M log2 + n log2/128
//
//      n is 7 binary digits = n_7n_6...n_1
//
//      n log2/128 = n_7n_6n_5 16 log2/128 + n_4n_3n_2n_1 log2/128
//      n log2/128 = n_7n_6n_5 log2/8 + n_4n_3n_2n_1 log2/128
//      n log2/128 = I2 log2/8 + I1 log2/128
//
//      N log2/128 = M log2 + I2 log2/8 + I1 log2/128
//
//      exp(Z) = exp(s) (1+d) exp(log(2^M) + log(2^(I2/8)) + log(2^(I1/128)))
//      exp(Z) = exp(s) (1+d1) (1+d2) (2^M) 2^(I2/8) 2^(I1/128)
//      exp(Z) = exp(s) f1 f2 (2^M) 2^(I2/8) 2^(I1/128)
//
// I1, I2 are table indices.
// Use a series for exp(s).
// Then get exp(Z)
//
//      exp(yLog(x)) = exp(Z1 + Z2 + Z3) exp(e1 + e2 + e3)
//      exp(yLog(x)) = exp(Z) exp(Z3) f3
//      exp(yLog(x)) = exp(Z) f3 exp(Z3)
//      exp(yLog(x)) = A exp(Z3)
//
// We actually calculate exp(Z3) - 1.
// Then,
//      exp(yLog(x)) = A + A( exp(Z3) - 1)
//
// Table Generation
//==============================================================
// The log values
// ==============
// The operation (K*log2_hi) must be exact. K is the true exponent of x.
// If we allow gradual underflow (denormals), K can be represented in 12 bits
// (as a two's complement number). We assume 13 bits as an engineering precaution.
//
//      +------------+----------------+-+
//      |  13 bits   |    50 bits     | |
//      +------------+----------------+-+
//      0            1                 66
//                   2                 34
//
// So we want the lsb(log2_hi) to be 2^-50
// We get log2 as a quad-extended (15-bit exponent, 128-bit significand)
//
//      0 fffe b17217f7d1cf79ab c9e3b39803f2f6af (4...)
//
// Consider numbering the bits left to right, starting at 0 thru 127.
// Bit 0 is the 2^-1 bit; bit 49 is the 2^-50 bit.
//
//      ...79ab
//      0111 1001 1010 1011
//      44
//      89
//
// So if we shift off the rightmost 14 bits, then (shift back only
// the top half) we get
//
//      0 fffe b17217f7d1cf4000 e6af278ece600fcb dabc000000000000
//
// Put the right 64-bit significand in an FR register, convert to double;
// it is exact.
// Put the next 128 bits into a quad register and round to double.
// The true exponent of the low part is -51.
//
//      hi is 0 fffe b17217f7d1cf4000
//      lo is 0 ffcc e6af278ece601000
//
// Convert to double memory format and get
//
//      hi is 0x3fe62e42fefa39e8
//      lo is 0x3cccd5e4f1d9cc02
//
// log2_hi + log2_lo is an accurate value for log2.
//
//
// The T and t values
// ==================
// A similar method is used to generate the T and t values.
//
// K * log2_hi + T must be exact.
//
// Smallest T,t
// ----------
// The smallest T,t is
//             T                   t
// data8 0x3f60040155d58800, 0x3c93bce0ce3ddd81  log(1/frcpa(1+0/256))= +1.95503e-003
//
// The exponent is 0x3f6 (biased) or -9 (true).
// For the smallest T value, what we want is to clip the significand such that
// when it is shifted right by 9, its lsb is in the bit for 2^-51. The 9 is specific
// to the first entry. In general, it is 0xffff - (biased 15-bit exponent).
// Independently, what we have calculated is the table value as a quad precision number.
// Table entry 1 is
//      0 fff6 80200aaeac44ef38 338f77605fdf8000
//
// We store this quad precision number in a data structure that is
//      sign:           1
//      exponent:       15
//      significand_hi: 64 (includes explicit bit)
//      significand_lo: 49
// Because the explicit bit is included, the significand is 113 bits.
//
// Consider significand_hi for table entry 1.
//
//
//      +-+--- ... -------+--------------------+
//      | |
//      +-+--- ... -------+--------------------+
//      0 1               4444444455555555556666
//                        2345678901234567890123
//
// Labeled as above, bit 0 is 2^0, bit 1 is 2^-1, etc.
// Bit 42 is 2^-42. If we shift to the right by 9, the bit in
// bit 42 goes in 51.
//
// So what we want to do is shift bits 43 thru 63 into significand_lo.
// This is shifting bit 42 into bit 63, taking care to retain the shifted-off bits.
// Then shifting (just with significand_hi) back into bit 42.
//
// The shift_value is 63-42 = 21.
// In general, this is
//      63 - (51 -(0xffff - 0xfff6))
// For this example, it is
//      63 - (51 - 9) = 63 - 42 = 21
//
// This means we are shifting 21 bits into significand_lo. We must maintain more
// than a 128-bit significand not to lose bits. So before the shift we put the 128-bit
// significand into a 256-bit significand and then shift.
// The 256-bit significand has four parts: hh, hl, lh, and ll.
//
// Start off with
//      hh              hl              lh              ll
//      <64>            <49><15_0>      <64_0>          <64_0>
//
// After shift by 21 (then return for significand_hi),
//      <43><21_0>      <21><43>        <6><58_0>       <64_0>
//
// Take the hh part and convert to a double. There is no rounding here.
// The conversion is exact. The true exponent of the high part is the same as the
// true exponent of the input quad.
//
// We have some 64 plus significand bits for the low part. In this example, we have
// 70 bits. We want to round this to a double. Put them in a quad and then do a quad fnorm.
// For this example the true exponent of the low part is
//      true_exponent_of_high - 43 = true_exponent_of_high - (64-21)
// In general, this is
//      true_exponent_of_high - (64 - shift_value)
//
//
// Largest T,t
// ----------
// The largest T,t is
// data8 0x3fe62643fecf9742, 0x3c9e3147684bd37d  log(1/frcpa(1+255/256))= +6.92171e-001
//
// Table entry 256 is
//      0 fffe b1321ff67cba178c 51da12f4df5a0000
//
// The shift value is
//      63 - (51 -(0xffff - 0xfffe)) = 13
//
// The true exponent of the low part is
//      true_exponent_of_high - (64 - shift_value)
//      -1 - (64-13) = -52
// Biased as a double, this is 0x3cb
//
//
//
// So then lsb(T) must be >= 2^-51
// msb(Klog2_hi) <= 2^12
//
//                   +--------+---------+
//                   |     51 bits      | <== largest T
//                   +--------+---------+
//                   | 9 bits | 42 bits | <== smallest T
//      +------------+----------------+-+
//      |  13 bits   |    50 bits     | |
//      +------------+----------------+-+
//
// Special Cases
//==============================================================
//                                              double     float
// overflow                                     error 24   30
// underflow                                    error 25   31
// X zero  Y zero
//   +0    +0                     +1            error 26   32
//   -0    +0                     +1            error 26   32
//   +0    -0                     +1            error 26   32
//   -0    -0                     +1            error 26   32
// X zero  Y negative
//   +0    -odd integer           +inf          error 27   33   divide-by-zero
//   -0    -odd integer           -inf          error 27   33   divide-by-zero
//   +0    !-odd integer          +inf          error 27   33   divide-by-zero
//   -0    !-odd integer          +inf          error 27   33   divide-by-zero
//   +0    -inf                   +inf          error 27   33   divide-by-zero
//   -0    -inf                   +inf          error 27   33   divide-by-zero
// X zero  Y positive
//   +0    +odd integer           +0
//   -0    +odd integer           -0
//   +0    !+odd integer          +0
//   -0    !+odd integer          +0
//   +0    +inf                   +0
//   -0    +inf                   +0
//   +0    Y NaN                  quiet Y                       invalid if Y SNaN
//   -0    Y NaN                  quiet Y                       invalid if Y SNaN
// X one
//   -1    Y inf                  +1
//   -1    Y NaN                  quiet Y                       invalid if Y SNaN
//   +1    Y NaN                  +1                            invalid if Y SNaN
//   +1    Y any else             +1
// X -     Y not integer          QNAN          error 28   34   invalid
// X NaN   Y 0                    +1            error 29   35
// X NaN   Y NaN                  quiet X                       invalid if X or Y SNaN
// X NaN   Y any else             quiet X                       invalid if X SNaN
// X !+1   Y NaN                  quiet Y                       invalid if Y SNaN
// X +inf  Y >0                   +inf
// X -inf  Y >0, !odd integer     +inf
// X -inf  Y >0, odd integer      -inf
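//
// The hi/lo split described under "Table Generation" can be checked numerically.
// What follows is a minimal Python sketch, not part of the Intel source. The hex
// constants are the log2_hi, log2_lo and smallest-T values quoted above; the
// helper names (double_from_bits, lsb_exponent, significand_bits) are
// hypothetical, and the range |K| <= 2^12 is an assumption taken from the 13-bit
// budget in the diagram. Exact rational arithmetic stands in for the 64-bit
// significand of an FR register.

```python
import math
import struct
from fractions import Fraction


def double_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE-754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def lsb_exponent(x: float) -> int:
    """Return e such that 2**e is the lowest set bit of nonzero x."""
    f = Fraction(x)                      # exact, always in lowest terms
    if f.denominator > 1:                # denominator is a power of two
        return -(f.denominator.bit_length() - 1)
    n = abs(f.numerator)
    return (n & -n).bit_length() - 1


def significand_bits(v: Fraction) -> int:
    """Number of bits from the highest to the lowest set bit of v."""
    n = abs(v.numerator)
    if n == 0:
        return 0
    n >>= (n & -n).bit_length() - 1      # drop trailing zero bits
    return n.bit_length()


# Constants quoted in the comment block above.
LOG2_HI = double_from_bits(0x3FE62E42FEFA39E8)
LOG2_LO = double_from_bits(0x3CCCD5E4F1D9CC02)
T_SMALLEST = double_from_bits(0x3F60040155D58800)

# Widest significand needed by K*log2_hi + T over the assumed range of K.
widest = max(
    significand_bits(K * Fraction(LOG2_HI) + Fraction(T_SMALLEST))
    for K in range(-(1 << 12), 1 << 12)
)

# The same product computed in plain double arithmetic is rounded.
inexact_in_double = Fraction(1023 * LOG2_HI) != 1023 * Fraction(LOG2_HI)

if __name__ == "__main__":
    print("lsb(log2_hi) = 2^%d" % lsb_exponent(LOG2_HI))
    print("lsb(T)       = 2^%d" % lsb_exponent(T_SMALLEST))
    print("hi + lo - log(2) =", LOG2_HI + LOG2_LO - math.log(2))
    print("widest significand of K*log2_hi + T:", widest, "bits")
    print("K*log2_hi rounded in double arithmetic:", inexact_in_double)
```

// Under these assumptions the sum K*log2_hi + T never needs more than 64
// significand bits, which is why the text can require it to be exact in register
// format even though the same product does not fit in a 53-bit double.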