📄 e_atanhf.s
.file "atanhf.s"

// Copyright (c) 2000 - 2003, Intel Corporation
// All rights reserved.
//
// Contributed 2000 by the Intel Numerics Group, Intel Corporation
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote
// products derived from this software without specific prior written
// permission.

// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
// OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Intel Corporation is the author of this code, and requests that all
// problem reports or change requests be submitted to it directly at
// http://www.intel.com/software/products/opensource/libraries/num.htm.
//
// History
//==============================================================
// 05/22/01 Initial version
// 05/20/02 Cleaned up namespace and sf0 syntax
// 08/06/02 Improved Itanium 2 performance
// 02/06/03 Reordered header: .section, .global, .proc, .align
// 05/26/03 Improved performance, fixed to handle unorms
//
// API
//==============================================================
// float atanhf(float)
//
// Overview of operation
//==============================================================
// Background
//
//
// There are 7 paths:
// 1. x = +/-0.0
//    Return atanhf(x) = +/-0.0
//
// 2. 0.0 < |x| <= MAX_DENORMAL_ABS
//    Return atanhf(x) = x + sign(x)*x^2
//
// 3. MAX_DENORMAL_ABS < |x| < 2^(-20)
//    Return atanhf(x) = Pol3(x), where Pol3(x) = x + x^3
//
// 4. 2^(-20) <= |x| < 1
//    Return atanhf(x) = 0.5 * (log(1 + x) - log(1 - x))
//    Algorithm description for log function see below.
//
// 5. |x| = 1
//    Return atanhf(x) = sign(x) * +INF
//
// 6. 1 < |x| <= +INF
//    Return atanhf(x) = QNaN
//
// 7. x = [S,Q]NaN
//    Return atanhf(x) = QNaN
//
//==============================================================
// Algorithm Description for log(x) function
//
// Consider x = 2^N * 1.f1 f2 f3 f4...f63
// log(x) = log(x * frcpa(x) / frcpa(x))
//        = log(x * frcpa(x)) + log(1/frcpa(x))
//        = log(x * frcpa(x)) - log(frcpa(x))
//
// frcpa(x) = 2^(-N) * frcpa(1.f1 f2 ... f63)
//
// -log(frcpa(x)) = -log(C)
//                = -log(2^(-N)) - log(frcpa(1.f1 f2 ... f63))
//
// -log(frcpa(x)) = -log(C)
//                = N*log2 - log(frcpa(1.f1 f2 ... f63))
//
//
// log(x) = log(1/frcpa(x)) + log(frcpa(x) x)
//
// log(x) = N*log2 + log(1./frcpa(1.f1 f2 ... f63)) + log(x * frcpa(x))
// log(x) = N*log2 + T + log(frcpa(x) x)
//
// log(x) = N*log2 + T + log(C * x)
//
// C * x = 1 + r
//
// log(x) = N*log2 + T + log(1 + r)
// log(x) = N*log2 + T + Series(r)
//
// 1.f1 f2 ... f8 has 256 entries.
// They are 1 + k/2^8, k = 0 ... 255
// These 256 values are the table entries.
//
// Implementation
//==============================================================
// C = frcpa(x)
// r = C * x - 1
//
// Form rseries = r + P1*r^2 + P2*r^3 + P3*r^4
//
// x = f * 2^N where f is 1.f_1f_2f_3...f_63
// Nfloat = float(n) where n is the true unbiased exponent
// pre-index = f_1f_2....f_8
// index = pre_index * 16
// get the dxt table entry at index + offset = T
//
// result = (T + Nfloat * log(2)) + rseries
//
// The T table is calculated as follows
// Form x_k = 1 + k/2^8 where k goes from 0... 255
// y_k = frcpa(x_k)
// log(1/y_k) in quad and round to double-extended

// Registers used
//==============================================================
// Floating Point registers used:
// f8, input
// f32 -> f59

// General registers used:
// r14 -> r29, r32 -> r39

// Predicate registers used:
// p6 -> p9

// p6 to filter out case when |x| >= 1
// p7 to filter out case when x = [Q,S]NaN or +/-0
// p8 to filter out case when |x| < 2^(-20)
// p9 to filter out case when x = denormal

// Assembly macros
//==============================================================
DataPtr = r14
RcpTablePtrM = r15
RcpTablePtrP = r16
rExpbMask = r17
rBias = r18
rNearZeroBound = r19
rArgSExpb = r20
rArgExpb = r21
rExpbm = r22
rExpbp = r23
rSigm = r24
rSigp = r25
rNm = r26
rNp = r27
rIndm = r28
rIndp = r29

GR_SAVE_B0 = r33
GR_SAVE_GP = r34
GR_SAVE_PFS = r35

GR_Parameter_X = r36
GR_Parameter_Y = r37
GR_Parameter_RESULT = r38
atanh_GR_tag = r39

//==============================================================
fOneMx = f33
fOnePx = f34
fRm2 = f35
fRm3 = f36
fRp2 = f37
fRp3 = f38
fRcpM = f39
fRcpP = f40
fRp = f41
fRm = f42
fN4CvtM = f43
fN4CvtP = f44
fNm = f45
fNp = f46
fLogTm = f47
fLogTp = f48
fLog2 = f49
fArgAbs = f50
fNormX = f50
fP32m = f51
fP32p = f52
fP10m = f53
fP10p = f54
fX2 = f55
fP3 = f56
fP2 = f57
fP1 = f58
fHalf = f59

// Data tables
//==============================================================

RODATA

.align 16

LOCAL_OBJECT_START(atanhf_data)
data8 0xbfc0001008f39d59 // P3*0.5
data8 0x3fc5556073e0c45a // P2*0.5
data8 0xbfcffffffffaea15 // P1*0.5
data8 0x3fe0000000000000 // 0.5
data8 0x3fd62e42fefa39ef // 0.5*ln(2)
data8 0x0000000000000000 // pad
LOCAL_OBJECT_END(atanhf_data)

LOCAL_OBJECT_START(atanhf_data2)
data8 0x3f50040155d5889e //log(1/frcpa(1+0/256))/2
data8 0x3f68121214586b54 //log(1/frcpa(1+1/256))/2
data8 0x3f741929f96832f0 //log(1/frcpa(1+2/256))/2
data8 0x3f7c317384c75f06 //log(1/frcpa(1+3/256))/2
data8 0x3f81a6b91ac73386 //log(1/frcpa(1+4/256))/2
data8 0x3f85ba9a5d9ac039 //log(1/frcpa(1+5/256))/2
data8 0x3f89d2a8074325f4 //log(1/frcpa(1+6/256))/2
data8 0x3f8d6b2725979802 //log(1/frcpa(1+7/256))/2
data8 0x3f90c58fa19dfaaa //log(1/frcpa(1+8/256))/2
data8 0x3f92954c78cbce1b //log(1/frcpa(1+9/256))/2
data8 0x3f94a94d2da96c56 //log(1/frcpa(1+10/256))/2
data8 0x3f967c94f2d4bb58 //log(1/frcpa(1+11/256))/2
data8 0x3f985188b630f068 //log(1/frcpa(1+12/256))/2
data8 0x3f9a6b8abe73af4c //log(1/frcpa(1+13/256))/2
data8 0x3f9c441e06f72a9e //log(1/frcpa(1+14/256))/2
data8 0x3f9e1e6713606d07 //log(1/frcpa(1+15/256))/2
data8 0x3f9ffa6911ab9301 //log(1/frcpa(1+16/256))/2
data8 0x3fa0ec139c5da601 //log(1/frcpa(1+17/256))/2
data8 0x3fa1dbd2643d190b //log(1/frcpa(1+18/256))/2
data8 0x3fa2cc7284fe5f1c //log(1/frcpa(1+19/256))/2
data8 0x3fa3bdf5a7d1ee64 //log(1/frcpa(1+20/256))/2
data8 0x3fa4b05d7aa012e0 //log(1/frcpa(1+21/256))/2
data8 0x3fa580db7ceb5702 //log(1/frcpa(1+22/256))/2
data8 0x3fa674f089365a7a //log(1/frcpa(1+23/256))/2
data8 0x3fa769ef2c6b568d //log(1/frcpa(1+24/256))/2
data8 0x3fa85fd927506a48 //log(1/frcpa(1+25/256))/2
data8 0x3fa9335e5d594989 //log(1/frcpa(1+26/256))/2
data8 0x3faa2b0220c8e5f5 //log(1/frcpa(1+27/256))/2
data8 0x3fab0004ac1a86ac //log(1/frcpa(1+28/256))/2
data8 0x3fabf968769fca11 //log(1/frcpa(1+29/256))/2
data8 0x3faccfedbfee13a8 //log(1/frcpa(1+30/256))/2
data8 0x3fada727638446a2 //log(1/frcpa(1+31/256))/2
data8 0x3faea3257fe10f7a //log(1/frcpa(1+32/256))/2
data8 0x3faf7be9fedbfde6 //log(1/frcpa(1+33/256))/2
data8 0x3fb02ab352ff25f4 //log(1/frcpa(1+34/256))/2
data8 0x3fb097ce579d204d //log(1/frcpa(1+35/256))/2
data8 0x3fb1178e8227e47c //log(1/frcpa(1+36/256))/2
data8 0x3fb185747dbecf34 //log(1/frcpa(1+37/256))/2
data8 0x3fb1f3b925f25d41 //log(1/frcpa(1+38/256))/2
data8 0x3fb2625d1e6ddf57 //log(1/frcpa(1+39/256))/2
data8 0x3fb2d1610c86813a //log(1/frcpa(1+40/256))/2
data8 0x3fb340c59741142e //log(1/frcpa(1+41/256))/2
data8 0x3fb3b08b6757f2a9 //log(1/frcpa(1+42/256))/2
data8 0x3fb40dfb08378003 //log(1/frcpa(1+43/256))/2
data8 0x3fb47e74e8ca5f7c //log(1/frcpa(1+44/256))/2
data8 0x3fb4ef51f6466de4 //log(1/frcpa(1+45/256))/2
data8 0x3fb56092e02ba516 //log(1/frcpa(1+46/256))/2
data8 0x3fb5d23857cd74d5 //log(1/frcpa(1+47/256))/2
data8 0x3fb6313a37335d76 //log(1/frcpa(1+48/256))/2
data8 0x3fb6a399dabbd383 //log(1/frcpa(1+49/256))/2
data8 0x3fb70337dd3ce41b //log(1/frcpa(1+50/256))/2
data8 0x3fb77654128f6127 //log(1/frcpa(1+51/256))/2
data8 0x3fb7e9d82a0b022d //log(1/frcpa(1+52/256))/2
data8 0x3fb84a6b759f512f //log(1/frcpa(1+53/256))/2
data8 0x3fb8ab47d5f5a310 //log(1/frcpa(1+54/256))/2
data8 0x3fb91fe49096581b //log(1/frcpa(1+55/256))/2
data8 0x3fb981634011aa75 //log(1/frcpa(1+56/256))/2
data8 0x3fb9f6c407089664 //log(1/frcpa(1+57/256))/2
data8 0x3fba58e729348f43 //log(1/frcpa(1+58/256))/2
data8 0x3fbabb55c31693ad //log(1/frcpa(1+59/256))/2
data8 0x3fbb1e104919efd0 //log(1/frcpa(1+60/256))/2
data8 0x3fbb94ee93e367cb //log(1/frcpa(1+61/256))/2
data8 0x3fbbf851c067555f //log(1/frcpa(1+62/256))/2
data8 0x3fbc5c0254bf23a6 //log(1/frcpa(1+63/256))/2
data8 0x3fbcc000c9db3c52 //log(1/frcpa(1+64/256))/2
data8 0x3fbd244d99c85674 //log(1/frcpa(1+65/256))/2
data8 0x3fbd88e93fb2f450 //log(1/frcpa(1+66/256))/2
data8 0x3fbdedd437eaef01 //log(1/frcpa(1+67/256))/2
data8 0x3fbe530effe71012 //log(1/frcpa(1+68/256))/2
data8 0x3fbeb89a1648b971 //log(1/frcpa(1+69/256))/2
data8 0x3fbf1e75fadf9bde //log(1/frcpa(1+70/256))/2
data8 0x3fbf84a32ead7c35 //log(1/frcpa(1+71/256))/2
data8 0x3fbfeb2233ea07cd //log(1/frcpa(1+72/256))/2
data8 0x3fc028f9c7035c1c //log(1/frcpa(1+73/256))/2
data8 0x3fc05c8be0d9635a //log(1/frcpa(1+74/256))/2
data8 0x3fc085eb8f8ae797 //log(1/frcpa(1+75/256))/2
data8 0x3fc0b9c8e32d1911 //log(1/frcpa(1+76/256))/2
data8 0x3fc0edd060b78081 //log(1/frcpa(1+77/256))/2
data8 0x3fc122024cf0063f //log(1/frcpa(1+78/256))/2
data8 0x3fc14be2927aecd4 //log(1/frcpa(1+79/256))/2
data8 0x3fc180618ef18adf //log(1/frcpa(1+80/256))/2
data8 0x3fc1b50bbe2fc63b //log(1/frcpa(1+81/256))/2
data8 0x3fc1df4cc7cf242d //log(1/frcpa(1+82/256))/2
data8 0x3fc214456d0eb8d4 //log(1/frcpa(1+83/256))/2
data8 0x3fc23ec5991eba49 //log(1/frcpa(1+84/256))/2
data8 0x3fc2740d9f870afb //log(1/frcpa(1+85/256))/2
data8 0x3fc29ecdabcdfa04 //log(1/frcpa(1+86/256))/2
data8 0x3fc2d46602adccee //log(1/frcpa(1+87/256))/2
data8 0x3fc2ff66b04ea9d4 //log(1/frcpa(1+88/256))/2
data8 0x3fc335504b355a37 //log(1/frcpa(1+89/256))/2
data8 0x3fc360925ec44f5d //log(1/frcpa(1+90/256))/2
data8 0x3fc38bf1c3337e75 //log(1/frcpa(1+91/256))/2
data8 0x3fc3c25277333184 //log(1/frcpa(1+92/256))/2
data8 0x3fc3edf463c1683e //log(1/frcpa(1+93/256))/2
data8 0x3fc419b423d5e8c7 //log(1/frcpa(1+94/256))/2
data8 0x3fc44591e0539f49 //log(1/frcpa(1+95/256))/2
data8 0x3fc47c9175b6f0ad //log(1/frcpa(1+96/256))/2
data8 0x3fc4a8b341552b09 //log(1/frcpa(1+97/256))/2
data8 0x3fc4d4f3908901a0 //log(1/frcpa(1+98/256))/2
data8 0x3fc501528da1f968 //log(1/frcpa(1+99/256))/2
data8 0x3fc52dd06347d4f6 //log(1/frcpa(1+100/256))/2
data8 0x3fc55a6d3c7b8a8a //log(1/frcpa(1+101/256))/2
data8 0x3fc5925d2b112a59 //log(1/frcpa(1+102/256))/2
data8 0x3fc5bf406b543db2 //log(1/frcpa(1+103/256))/2
data8 0x3fc5ec433d5c35ae //log(1/frcpa(1+104/256))/2
data8 0x3fc61965cdb02c1f //log(1/frcpa(1+105/256))/2
data8 0x3fc646a84935b2a2 //log(1/frcpa(1+106/256))/2
data8 0x3fc6740add31de94 //log(1/frcpa(1+107/256))/2
data8 0x3fc6a18db74a58c5 //log(1/frcpa(1+108/256))/2
data8 0x3fc6cf31058670ec //log(1/frcpa(1+109/256))/2
data8 0x3fc6f180e852f0ba //log(1/frcpa(1+110/256))/2
data8 0x3fc71f5d71b894f0 //log(1/frcpa(1+111/256))/2
data8 0x3fc74d5aefd66d5c //log(1/frcpa(1+112/256))/2
data8 0x3fc77b79922bd37e //log(1/frcpa(1+113/256))/2
data8 0x3fc7a9b9889f19e2 //log(1/frcpa(1+114/256))/2
data8 0x3fc7d81b037eb6a6 //log(1/frcpa(1+115/256))/2
data8 0x3fc8069e33827231 //log(1/frcpa(1+116/256))/2
data8 0x3fc82996d3ef8bcb //log(1/frcpa(1+117/256))/2
data8 0x3fc85855776dcbfb //log(1/frcpa(1+118/256))/2
data8 0x3fc8873658327ccf //log(1/frcpa(1+119/256))/2
data8 0x3fc8aa75973ab8cf //log(1/frcpa(1+120/256))/2
data8 0x3fc8d992dc8824e5 //log(1/frcpa(1+121/256))/2
data8 0x3fc908d2ea7d9512 //log(1/frcpa(1+122/256))/2
data8 0x3fc92c59e79c0e56 //log(1/frcpa(1+123/256))/2
data8 0x3fc95bd750ee3ed3 //log(1/frcpa(1+124/256))/2
data8 0x3fc98b7811a3ee5b //log(1/frcpa(1+125/256))/2
data8 0x3fc9af47f33d406c //log(1/frcpa(1+126/256))/2
data8 0x3fc9df270c1914a8 //log(1/frcpa(1+127/256))/2
data8 0x3fca0325ed14fda4 //log(1/frcpa(1+128/256))/2
data8 0x3fca33440224fa79 //log(1/frcpa(1+129/256))/2
data8 0x3fca57725e80c383 //log(1/frcpa(1+130/256))/2
data8 0x3fca87d0165dd199 //log(1/frcpa(1+131/256))/2
data8 0x3fcaac2e6c03f896 //log(1/frcpa(1+132/256))/2
data8 0x3fcadccc6fdf6a81 //log(1/frcpa(1+133/256))/2
data8 0x3fcb015b3eb1e790 //log(1/frcpa(1+134/256))/2
data8 0x3fcb323a3a635948 //log(1/frcpa(1+135/256))/2
data8 0x3fcb56fa04462909 //log(1/frcpa(1+136/256))/2
data8 0x3fcb881aa659bc93 //log(1/frcpa(1+137/256))/2
data8 0x3fcbad0bef3db165 //log(1/frcpa(1+138/256))/2
data8 0x3fcbd21297781c2f //log(1/frcpa(1+139/256))/2
data8 0x3fcc039236f08819 //log(1/frcpa(1+140/256))/2
data8 0x3fcc28cb1e4d32fd //log(1/frcpa(1+141/256))/2
data8 0x3fcc4e19b84723c2 //log(1/frcpa(1+142/256))/2
data8 0x3fcc7ff9c74554c9 //log(1/frcpa(1+143/256))/2
data8 0x3fcca57b64e9db05 //log(1/frcpa(1+144/256))/2
data8 0x3fcccb130a5cebb0 //log(1/frcpa(1+145/256))/2
data8 0x3fccf0c0d18f326f //log(1/frcpa(1+146/256))/2
data8 0x3fcd232075b5a201 //log(1/frcpa(1+147/256))/2
data8 0x3fcd490246defa6b //log(1/frcpa(1+148/256))/2
data8 0x3fcd6efa918d25cd //log(1/frcpa(1+149/256))/2
data8 0x3fcd9509707ae52f //log(1/frcpa(1+150/256))/2
data8 0x3fcdbb2efe92c554 //log(1/frcpa(1+151/256))/2
data8 0x3fcdee2f3445e4af //log(1/frcpa(1+152/256))/2
data8 0x3fce148a1a2726ce //log(1/frcpa(1+153/256))/2
data8 0x3fce3afc0a49ff40 //log(1/frcpa(1+154/256))/2
data8 0x3fce6185206d516e //log(1/frcpa(1+155/256))/2
data8 0x3fce882578823d52 //log(1/frcpa(1+156/256))/2
data8 0x3fceaedd2eac990c //log(1/frcpa(1+157/256))/2
data8 0x3fced5ac5f436be3 //log(1/frcpa(1+158/256))/2
data8 0x3fcefc9326d16ab9 //log(1/frcpa(1+159/256))/2
data8 0x3fcf2391a2157600 //log(1/frcpa(1+160/256))/2
data8 0x3fcf4aa7ee03192d //log(1/frcpa(1+161/256))/2
data8 0x3fcf71d627c30bb0 //log(1/frcpa(1+162/256))/2
data8 0x3fcf991c6cb3b379 //log(1/frcpa(1+163/256))/2
data8 0x3fcfc07ada69a910 //log(1/frcpa(1+164/256))/2
data8 0x3fcfe7f18eb03d3e //log(1/frcpa(1+165/256))/2
data8 0x3fd007c053c5002e //log(1/frcpa(1+166/256))/2
data8 0x3fd01b942198a5a1 //log(1/frcpa(1+167/256))/2
data8 0x3fd02f74400c64eb //log(1/frcpa(1+168/256))/2
data8 0x3fd04360be7603ad //log(1/frcpa(1+169/256))/2
data8 0x3fd05759ac47fe34 //log(1/frcpa(1+170/256))/2
data8 0x3fd06b5f1911cf52 //log(1/frcpa(1+171/256))/2
data8 0x3fd078bf0533c568 //log(1/frcpa(1+172/256))/2
data8 0x3fd08cd9687e7b0e //log(1/frcpa(1+173/256))/2
data8 0x3fd0a10074cf9019 //log(1/frcpa(1+174/256))/2
data8 0x3fd0b5343a234477 //log(1/frcpa(1+175/256))/2
data8 0x3fd0c974c89431ce //log(1/frcpa(1+176/256))/2
data8 0x3fd0ddc2305b9886 //log(1/frcpa(1+177/256))/2
data8 0x3fd0eb524bafc918 //log(1/frcpa(1+178/256))/2
data8 0x3fd0ffb54213a476 //log(1/frcpa(1+179/256))/2
data8 0x3fd114253da97d9f //log(1/frcpa(1+180/256))/2
data8 0x3fd128a24f1d9aff //log(1/frcpa(1+181/256))/2
data8 0x3fd1365252bf0865 //log(1/frcpa(1+182/256))/2
data8 0x3fd14ae558b4a92d //log(1/frcpa(1+183/256))/2
data8 0x3fd15f85a19c765b //log(1/frcpa(1+184/256))/2
data8 0x3fd16d4d38c119fa //log(1/frcpa(1+185/256))/2
data8 0x3fd18203c20dd133 //log(1/frcpa(1+186/256))/2
data8 0x3fd196c7bc4b1f3b //log(1/frcpa(1+187/256))/2
data8 0x3fd1a4a738b7a33c //log(1/frcpa(1+188/256))/2
data8 0x3fd1b981c0c9653d //log(1/frcpa(1+189/256))/2
data8 0x3fd1ce69e8bb106b //log(1/frcpa(1+190/256))/2
data8 0x3fd1dc619de06944 //log(1/frcpa(1+191/256))/2
data8 0x3fd1f160a2ad0da4 //log(1/frcpa(1+192/256))/2
data8 0x3fd2066d7740737e //log(1/frcpa(1+193/256))/2
data8 0x3fd2147dba47a394 //log(1/frcpa(1+194/256))/2
data8 0x3fd229a1bc5ebac3 //log(1/frcpa(1+195/256))/2
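The seven paths and the table-driven log reduction described in the header comments of the listing can be followed more easily in a high-level language. The block below is a minimal Python sketch in double precision, not a port of the assembly. Several things in it are assumptions rather than facts from the source: `RCP` is a hypothetical stand-in for `frcpa` (the reciprocal of each bucket midpoint, not bit-identical to the hardware instruction), the series uses plain Taylor coefficients in place of the tuned `P1`, `P2`, `P3`, `MAX_DENORMAL_ABS` is taken to be the largest single-precision subnormal, and the names `log_sketch` and `atanhf_sketch` are invented for illustration.

```python
import math

LN2 = math.log(2.0)

# Assumption: largest single-precision subnormal. The listing names
# MAX_DENORMAL_ABS but does not show its value.
MAX_DENORMAL_ABS = math.ldexp(1.0, -126) - math.ldexp(1.0, -149)
NEAR_ZERO_BOUND = math.ldexp(1.0, -20)   # 2^(-20), from paths 3 and 4

# Hypothetical stand-in for frcpa(1 + k/256): reciprocal of the bucket
# midpoint. Not the values the hardware instruction would return.
RCP = [1.0 / (1.0 + (k + 0.5) / 256.0) for k in range(256)]
# T[k] = log(1 / C_k); the data8 table in the listing stores this halved.
T = [-math.log(c) for c in RCP]


def log_sketch(x):
    """log(x) = N*log2 + T + Series(r), for finite x > 0."""
    assert x > 0.0 and math.isfinite(x)
    frac, exp = math.frexp(x)       # x = frac * 2**exp, frac in [0.5, 1)
    f, n = 2.0 * frac, exp - 1      # x = f * 2**n,      f in [1, 2)
    k = int((f - 1.0) * 256.0)      # index from fraction bits f_1..f_8
    r = RCP[k] * f - 1.0            # C * x = 1 + r, with |r| < 2**-8
    # Taylor coefficients (assumption) instead of the tuned P1, P2, P3.
    series = r - r**2 / 2.0 + r**3 / 3.0 - r**4 / 4.0
    return n * LN2 + T[k] + series


def atanhf_sketch(x):
    """Follow the seven paths of the header comment in double precision."""
    if math.isnan(x):               # path 7: NaN in, NaN out
        return math.nan
    ax = abs(x)
    if ax == 0.0:                   # path 1: keeps the sign of zero
        return x
    if ax > 1.0:                    # path 6: outside the domain (incl. INF)
        return math.nan
    if ax == 1.0:                   # path 5: signed infinity
        return math.copysign(math.inf, x)
    if ax <= MAX_DENORMAL_ABS:      # path 2: x + sign(x)*x^2
        return x + math.copysign(x * x, x)
    if ax < NEAR_ZERO_BOUND:        # path 3: Pol3(x) = x + x^3
        return x + x * x * x
    # path 4: 0.5 * (log(1 + x) - log(1 - x))
    return 0.5 * (log_sketch(1.0 + x) - log_sketch(1.0 - x))
```

Calling `atanhf_sketch(0.5)` should agree with `math.atanh(0.5)` to well within single precision. The sketch applies the factor 0.5 explicitly at the end of path 4, whereas the constants in the listing are stored already halved (`P1*0.5`, `0.5*ln(2)`, `log(1/frcpa(...))/2`), which suggests the assembly folds that factor into its tables.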