readme.carefully

Note: please observe that in the routine conj_grad three implementations
of the sparse matrix-vector multiply have been supplied. The default
matrix-vector multiply is not loop unrolled. The alternate implementations
are unrolled to a depth of 2 and unrolled to a depth of 8. Please
experiment with these to find the fastest for your particular
architecture. If reporting timing results, any of these three may
be used without penalty.

Performance examples:

The non-unrolled version of the multiply is actually slightly (perhaps 5%)
faster on the sp2-66MHz-WN on 16 nodes than is the unrolled-by-2 version.
On the Cray t3d, the reverse is true, i.e., the unrolled-by-2 version is
some 10% faster. The unrolled-by-8 version is significantly faster on the
Cray t3d: overall speed of the code is 1.5 times faster.
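
To illustrate the difference between the default and the unrolled-by-2
multiply described above, here is a minimal sketch in C. It is not the
benchmark's own code: it assumes the matrix is stored in compressed sparse
row (CSR) form, and the function names spmv_plain and spmv_unroll2 as well
as the array names rowstr, colidx and a are chosen here for illustration.

```c
#include <assert.h>

/* Assumed CSR layout: the nonzeros of row i are a[rowstr[i]] through
   a[rowstr[i+1]-1], and colidx[k] is the column of a[k]. */

/* Default form: one multiply-add per inner iteration, no unrolling. */
void spmv_plain(int n, const int *rowstr, const int *colidx,
                const double *a, const double *x, double *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = rowstr[i]; k < rowstr[i + 1]; k++)
            sum += a[k] * x[colidx[k]];
        y[i] = sum;
    }
}

/* Unrolled to a depth of 2: two independent partial sums per row. */
void spmv_unroll2(int n, const int *rowstr, const int *colidx,
                  const double *a, const double *x, double *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int k = rowstr[i];
        int hi = rowstr[i + 1];
        double sum1 = 0.0, sum2 = 0.0;

        /* Peel one element when the row has an odd number of nonzeros. */
        if ((hi - k) % 2 != 0) {
            sum1 += a[k] * x[colidx[k]];
            k++;
        }
        /* The remaining count is even, so k+1 stays inside the row. */
        for (; k < hi; k += 2) {
            sum1 += a[k] * x[colidx[k]];
            sum2 += a[k + 1] * x[colidx[k + 1]];
        }
        y[i] = sum1 + sum2;
    }
}
```

Both functions compute the same product, but the unrolled form adds the
terms in a different order, so results may differ in the last bits for
general floating-point data. Whether either form is faster depends on the
compiler and the machine, which is why timing each variant is suggested.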
