<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<style type="text/css">
body { font-family: Verdana, Arial, Helvetica, sans-serif; }
a.at-term { font-style: italic; }
</style>
<title>Case Study: The Laplace Solver</title>
<meta name="Generator" content="ATutor" />
<meta name="Keywords" content="" />
</head>
<body>
<p>The scaling behavior of the Laplace solver code depends in a complicated way on the hardware platform, the specific implementation of MPI and OpenMP, and the detailed balance of MPI processes and OpenMP threads per MPI process. </p>
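<p>As a purely illustrative sketch of what that balance involves, the fragment below shows one common hybrid layout for a 2-D Laplace (Jacobi) sweep: each MPI process owns a horizontal slab of the grid and exchanges halo rows with its neighbours, while the OpenMP threads inside that process divide the slab's rows among themselves. The array sizes, function name, and neighbour ranks (<code>up</code>, <code>down</code>, possibly <code>MPI_PROC_NULL</code> at the domain boundary) are assumptions, not taken from the course code.</p>
<pre>
/* Hypothetical hybrid MPI+OpenMP Jacobi sweep (names and sizes assumed). */
#include &lt;mpi.h&gt;

#define NX 1024   /* interior rows owned by this MPI process (plus 2 halo rows) */
#define NY 1024   /* columns */

void jacobi_sweep(double u[NX + 2][NY], double unew[NX + 2][NY],
                  int up, int down, MPI_Comm comm)
{
    /* Exchange halo rows with the neighbouring MPI processes. */
    MPI_Sendrecv(u[1],      NY, MPI_DOUBLE, up,   0,
                 u[NX + 1], NY, MPI_DOUBLE, down, 0, comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(u[NX],     NY, MPI_DOUBLE, down, 1,
                 u[0],      NY, MPI_DOUBLE, up,   1, comm, MPI_STATUS_IGNORE);

    /* The OpenMP threads of this process split the local rows among themselves. */
    #pragma omp parallel for
    for (int i = 1; i &lt;= NX; i++)
        for (int j = 1; j &lt; NY - 1; j++)
            unew[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                               + u[i][j - 1] + u[i][j + 1]);
}
</pre>
<p>Shifting work between the two levels (more MPI processes with fewer threads each, or the reverse) changes both the volume of halo traffic over the inter-node network and the demand the threads place on shared intra-node memory bandwidth, which is why the best split differs from machine to machine.</p>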
<p>We will consider two different machines: </p>
<ol>
<li>An IBM SP, with SMP nodes consisting of eight Power3 CPUs running at 375 MHz. Inter-node communication is provided by a high-bandwidth network capable of 300 MB/second (bi-directional) per node. Intra-node memory bandwidth is high: more than 14 GB/second (under particular assumptions about memory bank population).</li>
<li>An Intel-based cluster at OSC consisting of 550 MHz Pentium III Xeon CPUs arranged in four-way SMP nodes. Here the node interconnect is provided by Myrinet and has high bandwidth and low latency. The principal hardware limitation of this system is its rather low intra-node memory bandwidth: in practice, four PEs may run little faster than two if the application requires sustained memory bandwidth. Effective cache use (e.g., cache blocking; a brief sketch follows this list) is especially helpful on this system, since it relieves the pressure on bandwidth to main memory.</li>
</ol>
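<p>The fragment below is a minimal sketch of what cache blocking can look like for the Jacobi update mentioned in the second item above; the grid size <code>N</code>, the strip width <code>BJ</code>, and the function name are assumptions, not taken from the course code. Sweeping the grid in narrow column strips keeps the few rows involved in each update resident in cache while the strip is processed, so data loaded from main memory is reused before it is evicted.</p>
<pre>
/* Hypothetical cache-blocked Jacobi sweep (names and sizes assumed). */
#define N  8192   /* grid dimension */
#define BJ  256   /* columns per strip; tune so ~3 rows of BJ doubles fit in cache */

void jacobi_blocked(const double u[N][N], double unew[N][N])
{
    for (int jb = 1; jb &lt; N - 1; jb += BJ) {              /* column strips */
        int jend = (jb + BJ &lt; N - 1) ? jb + BJ : N - 1;
        for (int i = 1; i &lt; N - 1; i++)                    /* rows within the strip */
            for (int j = jb; j &lt; jend; j++)
                unew[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                   + u[i][j - 1] + u[i][j + 1]);
    }
}
</pre>
<p>On a system like the OSC cluster, where four PEs contend for limited memory bandwidth within a node, this kind of reuse reduces the traffic each PE sends to main memory and so helps four-way runs scale better than an unblocked sweep would.</p>
</body></html>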