⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 node37.html

📁 Design and building parallel program
💻 HTML
📖 第 1 页 / 共 2 页
字号:
limitations.
<P>
<LI>
Design a communication structure for the algorithm Floyd 2 discussed
in Section <A HREF="msgs0.htm#24" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#24">3.9.1</A>.
<P>
<LI>
<A NAME=exmod>&#160;</A>
Assume that a cyclic mapping is used in the atmosphere model of
Section <A HREF="node20.html#secclim" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node20.html#secclim">2.6</A> to compensate for load imbalances.  Develop an
analytic expression for the additional communication cost associated
with various block sizes and hence for the load imbalance that must
exist for this approach to be worthwhile.
<P>
<LI>
Implement a two-dimensional finite difference algorithm using a nine-point
stencil.  Use this program to verify experimentally the analysis of
Exercise <A HREF="node37.html#exmod" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#exmod">17</A>.  Simulate load imbalance by calling a ``work''
function that performs different amounts of computation at different
grid points.
<P>
<LI>
<A NAME=exgr1>&#160;</A>
Assume that <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img711.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img711.gif">, <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img712.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img712.gif">, and <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img713.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img713.gif">sec.  Use the
performance models summarized in Table <A HREF="msgs0.htm#24" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#24">3.7</A> to determine
the values of <em> N</em>
 and <em> P</em>
 for which the various shortest-path
algorithms of Section <A HREF="msgs0.htm#24" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#24">3.9</A> are optimal.
<P>
<LI>
Assume that a graph represented by an adjacency matrix of size <IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img714.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img714.gif">
is distributed among <em> P</em>
 tasks prior to the execution of the
all-pairs shortest-path algorithm.  Repeat the analysis of
Exercise <A HREF="node37.html#exgr1" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#exgr1">19</A> but allow for the cost of data replication
in the Dijkstra algorithms.
<P>
<LI>
Extend the performance models developed for the shortest-path
algorithms to take into account bandwidth limitations on a 1-D mesh
architecture.
<P>
<LI>
Implement algorithms Floyd 1 and Floyd 2, and compare their
performance with that predicted by Equations <A HREF="msgs0.htm#24" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#24">3.12</A>
and <A HREF="msgs0.htm#24" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#24">3.13</A>.  Account for any differences.
<P>
<LI>
In so-called nondirect Fock matrix construction algorithms, the <IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img715.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img715.gif">
integrals of Equation <A HREF="node22.html#eqnfock" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node22.html#eqnfock">2.3</A> are cached on disk and reused at
each step.  Discuss the performance issues that may arise when
developing a code of this sort.
<P>
<LI>
The <em> bisection width
 </em> of a computer is the minimum number of
<A NAME=4288>&#160;</A>
wires that must be cut to divide the computer into two equal parts.
Multiplying this by the channel bandwidth gives the <em> bisection
<A NAME=4289>&#160;</A>
bandwidth</em>.  For example, the bisection bandwidth of a 1-D mesh with
bidirectional connections is 2/<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img716.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img716.gif">.  Determine the bisection
bandwidth of a bus, 2-D mesh, 3-D mesh, and hypercube.
<P>
<P><A NAME=5440>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img717.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img717.gif">
<BR><STRONG>Figure 3.29:</STRONG> <em> Parallel matrix transpose of a matrix <em> A</em>
 decomposed
by column, with <em> P=4</em>
.  The components of the matrix allocated to
a single task are shaded black, and the components required from other
tasks are stippled.</em><A NAME=figlatrans1>&#160;</A><BR>
<P>
<P>
<LI>
<A NAME=extrans2>&#160;</A>
An array transpose operation reorganizes an array partitioned in one
<A NAME=4296>&#160;</A>
dimension so that it is partitioned in the second dimension
<A NAME=4297>&#160;</A>
(Figure <A HREF="node37.html#figlatrans1" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#figlatrans1">3.29</A>).  This can be achieved in
<em> P-1</em>
 steps, with each processor exchanging <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img718.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img718.gif"> of its data with another
processor in each step.  Develop a performance model for this
operation.
<P>
<LI>
Equation <A HREF="node29.html#eq88" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node29.html#eq88">3.1</A> can be extended to account for the distance
<em> D</em>
 between originating and destination processors:
<P><A NAME=eq987>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img719.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img719.gif"><P>
The time per hop <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img720.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img720.gif"> typically has magnitude comparable to <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img721.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img721.gif">.
<A NAME=4305>&#160;</A>
Under what circumstances might the <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img722.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img722.gif"> term be significant?
<P>
<LI>
Develop a performance model for the matrix transpose algorithm on a
1-D mesh that takes into account per-hop costs, as specified by
Equation <A HREF="node37.html#eq987" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#eq987">3.14</A>.  Assume that <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img723.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img723.gif"> and <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img724.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img724.gif">, and
identify <em> P</em>
 and <em> N</em>
 values for which per-hop costs make a
significant (<em> &gt;5</em>
 percent) difference to execution time.
<P>
<LI>
<A NAME=extrans>&#160;</A>
Demonstrate that the transpose algorithm's messages travel a total of
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img725.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img725.gif"> hops on a 1-D mesh.  Use this information to refine the
performance model of Exercise <A HREF="node37.html#extrans2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#extrans2">25</A> to account for
competition for bandwidth.
<P>
<LI>
In the array transpose algorithm of Exercise <A HREF="node37.html#extrans2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node37.html#extrans2">25</A>, roughly
half of the array must be moved from one half of the computer to the
other.  Hence, we can obtain a lower time bound by dividing the data
volume by the bisection bandwidth.  Compare this bound with times
predicted by simple and bandwidth-limited performance models, on a
bus, one-dimensional mesh, and two-dimensional mesh.
<P>
<LI>
Implement the array transpose algorithm and study its performance.
Compare your results to the performance models developed in preceding
exercises.
<P>
</OL>
<P>

<BR> <HR><a href="msgs0.htm#2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#2"><img ALIGN=MIDDLE src="asm_color_tiny.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/asm_color_tiny.gif" alt="[DBPP]"></a>    <A NAME=tex2html2297 HREF="node36.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node36.html"><IMG ALIGN=MIDDLE ALT="previous" SRC="previous_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/previous_motif.gif"></A> <A NAME=tex2html2305 HREF="node38.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node38.html"><IMG ALIGN=MIDDLE ALT="next" SRC="next_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/next_motif.gif"></A> <A NAME=tex2html2303 HREF="node26.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node26.html"><IMG ALIGN=MIDDLE ALT="up" SRC="up_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/up_motif.gif"></A> <A NAME=tex2html2307 HREF="node1.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node1.html"><IMG ALIGN=MIDDLE ALT="contents" SRC="contents_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/contents_motif.gif"></A> <A NAME=tex2html2308 HREF="node133.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node133.html"><IMG ALIGN=MIDDLE ALT="index" SRC="index_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/index_motif.gif"></A> <a href="msgs0.htm#3" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#3"><img ALIGN=MIDDLE src="search_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/search_motif.gif" alt="[Search]"></a>   <BR>
<B> Next:</B> <A NAME=tex2html2306 HREF="node38.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node38.html"> Chapter Notes</A>
<B>Up:</B> <A NAME=tex2html2304 HREF="node26.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node26.html">3 A Quantitative Basis for Design</A>
<B> Previous:</B> <A NAME=tex2html2298 HREF="node36.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node36.html">3.10 Summary</A>
<BR><HR><P>
<P><ADDRESS>
<I>&#169 Copyright 1995 by <A href="msgs0.htm#6" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#6">Ian Foster</a></I>
</ADDRESS>
</BODY>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -