readme.carefully
Note: in the routine conj_grad, three implementations of the sparse matrix-vector multiply have been supplied. The default matrix-vector multiply is not loop unrolled. The alternate implementations are unrolled to a depth of 2 and unrolled to a depth of 8. Please experiment with these to find the fastest for your particular architecture. If reporting timing results, any of these three may be used without penalty.

Performance examples: The non-unrolled version of the multiply is actually slightly (maybe 5%) faster on the sp2-66MHz-WN on 16 nodes than the unrolled-by-2 version. On the Cray t3d, the reverse is true, i.e., the unrolled-by-2 version is some 10% faster. The unrolled-by-8 version is significantly faster on the Cray t3d: overall speed of the code is 1.5 times faster.
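
For readers unfamiliar with the technique, the following is a minimal sketch in C of what a non-unrolled and an unrolled-by-2 sparse matrix-vector multiply can look like. It is not the code from conj_grad. It assumes a compressed sparse row (CSR) layout with 0-based indices, and the names spmv_plain, spmv_unroll2, rowstr, colidx, a, p and w are chosen here for illustration only.

```c
#include <stddef.h>

/* Baseline: w = A * p for a CSR matrix, inner loop not unrolled.
 * Assumed layout: row j owns entries rowstr[j] .. rowstr[j+1]-1 of a[]
 * and colidx[]; all indices are 0-based. */
void spmv_plain(size_t n, const size_t *rowstr, const size_t *colidx,
                const double *a, const double *p, double *w)
{
    for (size_t j = 0; j < n; j++) {
        double sum = 0.0;
        for (size_t k = rowstr[j]; k < rowstr[j + 1]; k++)
            sum += a[k] * p[colidx[k]];
        w[j] = sum;
    }
}

/* Same product with the inner loop unrolled to a depth of 2.
 * Two independent partial sums are kept so that the two multiply-adds
 * in each iteration do not depend on each other. */
void spmv_unroll2(size_t n, const size_t *rowstr, const size_t *colidx,
                  const double *a, const double *p, double *w)
{
    for (size_t j = 0; j < n; j++) {
        size_t k = rowstr[j];
        size_t end = rowstr[j + 1];
        double sum1 = 0.0, sum2 = 0.0;

        /* Peel one entry when the row has an odd number of nonzeros,
         * so the unrolled loop always sees an even count. */
        if ((end - k) % 2 == 1) {
            sum1 = a[k] * p[colidx[k]];
            k++;
        }
        for (; k < end; k += 2) {
            sum1 += a[k] * p[colidx[k]];
            sum2 += a[k + 1] * p[colidx[k + 1]];
        }
        w[j] = sum1 + sum2;
    }
}
```

Because the unrolled variant adds the terms in a different order, its result can differ from the baseline in the last few bits for general floating-point data. Whether it is faster depends on the compiler and the architecture, which is why the note above asks you to time the variants yourself.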