<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> <style type="text/css"> body { font-family: Verdana, Arial, Helvetica, sans-serif;} a.at-term { font-style: italic; } </style> <title>Laplace Solver Performance Characteristics</title> <meta name="Generator" content="ATutor"> <meta name="Keywords" content=""></head><body> <p>
The following table shows the performance, in MFLOPS (millions of floating-point operations per second), of the two codes shown in the previous section on various platforms.
</p>
<table border=0 cellpadding="5">
<tr bgcolor="#000000">
<th><font color="#FFFFFF">System</font></th>
<th><font color="#FFFFFF">Approach</font></th>
<th><font color="#FFFFFF">MFLOPS</font></th>
<th><font color="#FFFFFF">Measurement Method</font></th>
</tr>
<tr>
<td bgcolor="#a9c6e2">Cray T94</td>
<td bgcolor="#e9e9e9">Vectorized</td>
<td bgcolor="#e9e9e9">647.57</td>
<td bgcolor="#e9e9e9"><code>hpm</code></td>
</tr>
<tr>
<td bgcolor="#a9c6e2">Cray SV1e (MSP)</td>
<td bgcolor="#e9e9e9">Vectorized</td>
<td bgcolor="#e9e9e9">613.87</td>
<td bgcolor="#e9e9e9"><code>hpm</code></td>
</tr>
<tr>
<td bgcolor="#a9c6e2">Intel Itanium 733MHz (OSC IA64 Cluster)</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">232.06</td>
<td bgcolor="#e9e9e9">Scaled from SV1e SSP</td>
</tr>
<tr>
<td bgcolor="#a9c6e2">Intel Pentium 4 Xeon 1.7GHz</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">225.69</td>
<td bgcolor="#e9e9e9">Scaled from SV1e SSP</td>
</tr>
<tr>
<td bgcolor="#a9c6e2">Cray SV1e (SSP)</td>
<td bgcolor="#e9e9e9">Vectorized</td>
<td bgcolor="#e9e9e9">220.98</td>
<td bgcolor="#e9e9e9"><code>hpm</code></td>
</tr>
<tr>
<td bgcolor="#a9c6e2">AMD Athlon 1.2GHz</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">143.62</td>
<td bgcolor="#e9e9e9">Scaled from SV1e SSP</td>
</tr>
<tr>
<td bgcolor="#a9c6e2">MIPS R12000 300MHz (SGI Origin 2000)</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">141.89</td>
<td bgcolor="#e9e9e9"><code>perfex</code></td>
</tr>
<tr>
<td bgcolor="#a9c6e2">DEC Alpha 300MHz (Cray T3E-600)</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">73.82</td>
<td bgcolor="#e9e9e9"><code>pat</code></td>
</tr>
<tr>
<td bgcolor="#a9c6e2">Intel Pentium III Xeon 550MHz (OSC P3 Cluster)</td>
<td bgcolor="#e9e9e9">Cache-friendly</td>
<td bgcolor="#e9e9e9">60.38</td>
<td bgcolor="#e9e9e9"><code>lperfex</code></td>
</tr>
</table>
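<p>
The relative speeds implied by the table above can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python (not the language of the solver codes themselves); the names <code>mflops</code> and <code>speedup</code> are illustrative and do not come from the original material, and the figures are simply copied from the table.
</p>

```python
# MFLOPS figures copied from the table above; the shortened keys are
# illustrative labels, not official system names.
mflops = {
    "Cray T94": 647.57,
    "Cray SV1e (MSP)": 613.87,
    "Intel Itanium 733MHz": 232.06,
    "Intel Pentium 4 Xeon 1.7GHz": 225.69,
    "Cray SV1e (SSP)": 220.98,
    "AMD Athlon 1.2GHz": 143.62,
    "MIPS R12000 300MHz": 141.89,
    "DEC Alpha 300MHz": 73.82,
    "Intel Pentium III Xeon 550MHz": 60.38,
}


def speedup(fast, slow):
    """Return the ratio of two MFLOPS rates taken from the table."""
    return mflops[fast] / mflops[slow]


if __name__ == "__main__":
    r12k = "MIPS R12000 300MHz"
    # Compare the R12000 against the two slowest cache-based systems.
    for other in ("DEC Alpha 300MHz", "Intel Pentium III Xeon 550MHz"):
        print(f"{r12k} vs {other}: {speedup(r12k, other):.2f}x")
```

<p>
With the table's figures, the MIPS R12000 comes out at about 1.92 times the DEC Alpha and about 2.35 times the Pentium III Xeon.
</p>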
<p>
The performance of both versions of the Laplace solver code is bound primarily by available memory bandwidth, which is why the Cray T94 and SV1e MSP results are so high: both of these systems have multi-GB/s memory systems. For the cache-friendly version of the code, L2 cache size matters as well. This is why the MIPS R12000, with its 8MB L2 cache, is approximately twice as fast as the Alpha and the Pentium III Xeon, even though all three have similar peak floating-point performance and single-processor memory bandwidth.
</p></body></html>