[template perf[name value] [value]]
[template para[text] '''<para>'''[text]'''</para>''']

[section:perf Performance]

[section:perf_over Performance Overview]
[performance_overview]
[endsect]

[section:interp Interpreting these Results]

In all of the following tables, the best performing result in each row is
assigned a relative value of "1" and shown in bold, so a score of "2" means
['"twice as slow as the best performing result".]  Actual timings in seconds
per function call are also shown in parentheses.  Results were obtained on a
system with an Intel 2.8GHz Pentium 4 processor with 2GB of RAM, running
either Windows XP or Mandriva Linux.

[caution As usual with performance results these should be taken with a large
pinch of salt: relative performance is known to shift quite a bit depending
upon the architecture of the particular test system used.  Furthermore, our
performance results were obtained using our own test data: these test values
are designed to provide good coverage of our code and to test all the
appropriate corner cases.  They do not necessarily represent "typical" usage:
whatever that may be!]

[endsect]

[section:getting_best Getting the Best Performance from this Library]

By far the most important thing you can do when using this library is to turn
on your compiler's optimisation options.  As the following table shows, the
penalty for using the library in debug mode can be quite large.

[table Performance Comparison of Release and Debug Settings
[[Function] [Microsoft Visual C++ 8.0 Debug Settings: /Od /ZI ] [Microsoft Visual C++ 8.0 Release settings: /Ox /arch:SSE2 ]]
[[__erf][[perf msvc-debug-erf..[para 16.65][para (1.028e-006s)]]][[perf msvc-erf..[para *1.00*][para (6.173e-008s)]]]]
[[__erf_inv][[perf msvc-debug-erf_inv..[para 19.28][para (1.215e-006s)]]][[perf msvc-erf_inv..[para *1.00*][para (6.302e-008s)]]]]
[[__ibeta and __ibetac][[perf msvc-debug-ibeta..[para 8.32][para (1.540e-005s)]]][[perf msvc-ibeta..[para *1.00*][para (1.852e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf msvc-debug-ibeta_inv..[para 10.25][para (7.492e-005s)]]][[perf msvc-ibeta_inv..[para *1.00*][para (7.311e-006s)]]]]
[[__ibeta_inva, __ibetac_inva, __ibeta_invb and __ibetac_invb][[perf msvc-debug-ibeta_invab..[para 8.57][para (2.441e-004s)]]][[perf msvc-ibeta_invab..[para *1.00*][para (2.847e-005s)]]]]
[[__gamma_p and __gamma_q][[perf msvc-debug-igamma..[para 10.98][para (1.044e-005s)]]][[perf msvc-igamma..[para *1.00*][para (9.504e-007s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf msvc-debug-igamma_inv..[para 10.25][para (3.721e-005s)]]][[perf msvc-igamma_inv..[para *1.00*][para (3.631e-006s)]]]]
[[__gamma_p_inva and __gamma_q_inva][[perf msvc-debug-igamma_inva..[para 11.26][para (1.124e-004s)]]][[perf msvc-igamma_inva..[para *1.00*][para (9.982e-006s)]]]]
]
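
If you want to see how large the release-versus-debug difference is on your
own system, a tight loop around one of the functions is enough to get a rough
figure.  The following is a minimal sketch only (the iteration count is
arbitrary and it uses std::chrono rather than the library's own performance
test application); build it once with your debug settings and once with your
release settings and compare the reported cost per call of `boost::math::erf`:

    #include <boost/math/special_functions/erf.hpp>
    #include <chrono>
    #include <iostream>

    int main()
    {
       typedef std::chrono::steady_clock clock_type;
       const int calls = 1000000;            // illustrative iteration count
       double sum = 0;                       // accumulate results so the calls are not optimised away
       clock_type::time_point start = clock_type::now();
       for(int i = 0; i < calls; ++i)
          sum += boost::math::erf(i * 1e-6); // arguments in [0, 1)
       std::chrono::duration<double> elapsed = clock_type::now() - start;
       std::cout << "seconds per call: " << elapsed.count() / calls
                 << "  (checksum " << sum << ")\n";
    }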
[endsect]

[section:comp_compilers Comparing Compilers]

After a good choice of build settings the next most important thing you can do
is choose your compiler - and the standard C library it sits on top of - very
carefully.  GCC-3.x in particular has been found to be poor at inlining code
and at performing the kinds of high level transformations that good C++
performance demands (thankfully GCC-4.x is somewhat better in this respect).

[table Performance Comparison of Various Windows Compilers
[[Function] [Intel C++ 10.0 ( /Ox /Qipo /QxN ) ] [Microsoft Visual C++ 8.0 ( /Ox /arch:SSE2 ) ] [Cygwin G++ 3.4 ( -O3 ) ]]
[[__erf][[perf intel-erf..[para *1.00*][para (4.118e-008s)]]][[perf msvc-erf..[para 1.50][para (6.173e-008s)]]][[perf gcc-erf..[para 3.24][para (1.336e-007s)]]]]
[[__erf_inv][[perf intel-erf_inv..[para *1.00*][para (4.439e-008s)]]][[perf msvc-erf_inv..[para 1.42][para (6.302e-008s)]]][[perf gcc-erf_inv..[para 7.88][para (3.500e-007s)]]]]
[[__ibeta and __ibetac][[perf intel-ibeta..[para *1.00*][para (1.631e-006s)]]][[perf msvc-ibeta..[para 1.14][para (1.852e-006s)]]][[perf gcc-ibeta..[para 3.05][para (4.975e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf intel-ibeta_inv..[para *1.00*][para (6.133e-006s)]]][[perf msvc-ibeta_inv..[para 1.19][para (7.311e-006s)]]][[perf gcc-ibeta_inv..[para 2.60][para (1.597e-005s)]]]]
[[__ibeta_inva, __ibetac_inva, __ibeta_invb and __ibetac_invb][[perf intel-ibeta_invab..[para *1.00*][para (2.453e-005s)]]][[perf msvc-ibeta_invab..[para 1.16][para (2.847e-005s)]]][[perf gcc-ibeta_invab..[para 2.83][para (6.947e-005s)]]]]
[[__gamma_p and __gamma_q][[perf intel-igamma..[para *1.00*][para (6.735e-007s)]]][[perf msvc-igamma..[para 1.41][para (9.504e-007s)]]][[perf gcc-igamma..[para 2.78][para (1.872e-006s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf intel-igamma_inv..[para *1.00*][para (2.637e-006s)]]][[perf msvc-igamma_inv..[para 1.38][para (3.631e-006s)]]][[perf gcc-igamma_inv..[para 3.31][para (8.736e-006s)]]]]
[[__gamma_p_inva and __gamma_q_inva][[perf intel-igamma_inva..[para *1.00*][para (7.716e-006s)]]][[perf msvc-igamma_inva..[para 1.29][para (9.982e-006s)]]][[perf gcc-igamma_inva..[para 2.56][para (1.974e-005s)]]]]
]

[endsect]

[section:tuning Performance Tuning Macros]

There are a small number of performance tuning options that are determined by
configuration macros.  These should be set in boost/math/tools/user.hpp, or
else reported to the Boost development mailing list so that the appropriate
option for a given compiler and OS platform can be set automatically in our
configuration setup.

[table
[[Macro][Meaning]]
[[BOOST_MATH_POLY_METHOD]
[Determines how polynomials and most rational functions are evaluated.
Define to one of the values 0, 1, 2 or 3: see below for the meaning of these values.]]
[[BOOST_MATH_RATIONAL_METHOD]
[Determines how symmetrical rational functions are evaluated: mostly this only
affects how the Lanczos approximation is evaluated, and how the `evaluate_rational`
function behaves.  Define to one of the values 0, 1, 2 or 3: see below for the
meaning of these values.]]
[[BOOST_MATH_MAX_POLY_ORDER]
[The maximum order of polynomial or rational function that will be evaluated
by a method other than 0 (a simple "for" loop).]]
[[BOOST_MATH_INT_TABLE_TYPE(RT, IT)]
[Many of the coefficients of the polynomials and rational functions used by
this library are integers.  Normally these are stored as tables of integers,
but if mixed integer / floating point arithmetic is much slower than regular
floating point arithmetic, then they can be stored as tables of floating point
values instead.  If mixed arithmetic is slow then add
`#define BOOST_MATH_INT_TABLE_TYPE(RT, IT) RT` to boost/math/tools/user.hpp;
otherwise the default of `#define BOOST_MATH_INT_TABLE_TYPE(RT, IT) IT`
(set in boost/math/config.hpp) is fine, and may well result in smaller code.]]
]

The values to which `BOOST_MATH_POLY_METHOD` and `BOOST_MATH_RATIONAL_METHOD`
may be set are as follows:

[table
[[Value][Effect]]
[[0][The polynomial or rational function is evaluated using Horner's method and
a simple for-loop.  Note that if the order of the polynomial or rational
function is a runtime parameter, or the order is greater than the value of
`BOOST_MATH_MAX_POLY_ORDER`, then this method is always used, irrespective of
the value of `BOOST_MATH_POLY_METHOD` or `BOOST_MATH_RATIONAL_METHOD`.]]
[[1][The polynomial or rational function is evaluated without the use of a
loop, using Horner's method.  This only occurs if the order of the polynomial
is known at compile time and is less than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
[[2][The polynomial or rational function is evaluated without the use of a
loop, using a second-order Horner's method.  In theory this permits two
operations to occur in parallel for polynomials, and four in parallel for
rational functions.  This only occurs if the order of the polynomial is known
at compile time and is less than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
[[3][The polynomial or rational function is evaluated without the use of a
loop, using a second-order Horner's method.  This differs from method "2" in
that the code is carefully ordered to make the parallelisation more obvious to
the compiler, rather than relying on the compiler's optimiser to spot the
parallelisation opportunities.  This only occurs if the order of the polynomial
is known at compile time and is less than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
]
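
To make the difference between the looped and unrolled schemes concrete, the
sketch below shows method 0 (Horner's rule with a runtime loop) alongside a
hand-unrolled second-order Horner evaluation of an order-4 polynomial in the
style of methods 2 and 3.  This is an illustration only, not the library's
actual evaluation code, and the exact unrolling the library generates may differ:

    #include <iostream>

    // Method 0: Horner's rule with a runtime loop over the coefficients.
    double horner(const double* c, unsigned order, double x)
    {
       double result = c[order];
       for(unsigned i = order; i > 0; --i)
          result = result * x + c[i - 1];
       return result;
    }

    // Methods 2 and 3: second-order Horner, unrolled for order 4.  The even
    // and odd coefficient chains are independent, so in theory a superscalar
    // CPU can work on the two chains in parallel (four chains in the case of
    // a rational function, which has both a numerator and a denominator).
    double horner2_order4(const double* c, double x)
    {
       double x2 = x * x;
       double even = c[4] * x2 + c[2];   // becomes c4*x^4 + c2*x^2 + c0
       double odd  = c[3] * x2 + c[1];   // becomes c3*x^3 + c1*x
       even = even * x2 + c[0];
       return odd * x + even;
    }

    int main()
    {
       const double c[5] = { 1, 2, 3, 4, 5 };   // 1 + 2x + 3x^2 + 4x^3 + 5x^4
       std::cout << horner(c, 4, 0.5) << " == " << horner2_order4(c, 0.5) << "\n";
    }

To select one of the methods, define the macros in boost/math/tools/user.hpp,
for example as follows (the values shown are purely illustrative, not the
library defaults):

    // boost/math/tools/user.hpp
    #define BOOST_MATH_POLY_METHOD 3        // unrolled second-order Horner for polynomials
    #define BOOST_MATH_RATIONAL_METHOD 3    // likewise for symmetrical rational functions
    #define BOOST_MATH_MAX_POLY_ORDER 20    // fall back to a simple loop above this order
    // Uncomment if mixed integer / floating point arithmetic is slow on your platform:
    // #define BOOST_MATH_INT_TABLE_TYPE(RT, IT) RT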

To determine which of these options is best for your particular compiler and
platform, build the performance test application with your usual release
settings and run the program with the `--tune` command line option.  In
practice the difference between the methods is rather small at present, as the
following table shows.  However, parallelisation / vectorisation