readme.carefully
Note: please observe that in the routine conj_grad three implementations of the sparse matrix-vector multiply have been supplied. The default matrix-vector multiply is not loop unrolled. The alternate implementations are unrolled to a depth of 2 and unrolled to a depth of 8. Please experiment with these to find the fastest for your particular architecture. If reporting timing results, any of these three may be used without penalty.

Performance examples:

The non-unrolled version of the multiply is actually slightly (perhaps 5%) faster on the sp2-66MHz-WN on 16 nodes than the unrolled-by-2 version. On the Cray T3D the reverse is true, i.e., the unrolled-by-2 version is some 10% faster. The unrolled-by-8 version is significantly faster on the Cray T3D: the code overall runs 1.5 times faster.
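
For readers unfamiliar with the technique, the difference between the non-unrolled and the unrolled-by-2 multiply can be sketched as follows. This is only an illustrative sketch in C, not the code from conj_grad: it assumes the matrix is stored in compressed sparse row form, and the names spmv_plain, spmv_unroll2, rowstr, colidx, a, x and y are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* Illustrative sketch only (assumed CSR layout, hypothetical names).
   Computes y = A*x. The nonzeros of row j are a[rowstr[j]] through
   a[rowstr[j+1]-1], and colidx[k] is the column of a[k]. */
void spmv_plain(size_t n, const size_t *rowstr, const size_t *colidx,
                const double *a, const double *x, double *y)
{
    for (size_t j = 0; j < n; j++) {
        double sum = 0.0;
        for (size_t k = rowstr[j]; k < rowstr[j + 1]; k++)
            sum += a[k] * x[colidx[k]];
        y[j] = sum;
    }
}

/* The same product with the inner loop unrolled to a depth of 2.
   Two independent partial sums are accumulated per iteration. */
void spmv_unroll2(size_t n, const size_t *rowstr, const size_t *colidx,
                  const double *a, const double *x, double *y)
{
    for (size_t j = 0; j < n; j++) {
        size_t k = rowstr[j];
        size_t end = rowstr[j + 1];
        double sum1 = 0.0, sum2 = 0.0;

        /* If the row has an odd number of nonzeros, handle one first
           so the remaining count is even. */
        if ((end - k) % 2 == 1) {
            sum1 = a[k] * x[colidx[k]];
            k++;
        }
        for (; k < end; k += 2) {
            sum1 += a[k] * x[colidx[k]];
            sum2 += a[k + 1] * x[colidx[k + 1]];
        }
        y[j] = sum1 + sum2;
    }
}
```

Because the unrolled loop adds the terms in a different order, the two routines may differ in the last bits of a floating-point result. Which one is faster depends on the compiler and the architecture, which is why it is worth timing each variant.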