<P>
<P><A NAME=algdac>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img202.gif"><P>
<P>
<P><A NAME=2525>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img205.gif">
<BR><STRONG>Figure 2.8:</STRONG> <em> Tree structure for divide-and-conquer summation algorithm
with <em>N=8</em>.  The <em>N</em> numbers located in the tasks at the
bottom of the diagram are communicated to the tasks in the row
immediately above; these each perform an addition and then forward the
result to the next level.  The complete sum is available at the root
of the tree after <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img204.gif"> steps.</em><A NAME=figsum1>&#160;</A><BR>
<P>
<P>
In summary, we observe that in developing an efficient parallel
summation algorithm, we have distributed the
<em> N-1</em>
 communication and computation operations required to perform the
summation and have modified the order in which these operations are
performed so that they can proceed concurrently.  The result is a
regular communication structure in which each task communicates with a
small set of neighbors.
<P>
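<P>
To make this summary concrete, the following minimal sketch (in Python; it is
not from the book, and the eight data values are illustrative) simulates the
tree of Figure 2.8: at each step, adjacent pairs of partial sums are combined,
so all additions at a given level can proceed concurrently and the root holds
the complete sum after a logarithmic number of steps.
<P>
<PRE>
# A minimal sketch (not from the book) of the divide-and-conquer summation:
# at each step, adjacent pairs of partial sums are combined, halving the
# number of active "tasks" until only the root's result remains.
values = [float(i + 1) for i in range(8)]   # N = 8 numbers, one per leaf task

level = values
steps = 0
while len(level) > 1:
    # Each task in the row above adds the two results forwarded to it;
    # in a parallel program these additions proceed concurrently.
    level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    steps += 1

print("complete sum:", level[0], "after", steps, "steps")   # 36.0 after 3 steps
</PRE>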

<H2><A NAME=SECTION02333000000000000000>2.3.3 Unstructured and Dynamic Communication</A></H2>
<P>
<A NAME=seccommunstruc>&#160;</A>
<P>
<A NAME=1413>&#160;</A>
<P>
<P><A NAME=2543>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img206.gif">
<BR><STRONG>Figure 2.9:</STRONG> <em> Example of a problem requiring unstructured communication.
In this finite element mesh generated for an assembly part, each
vertex is a grid point.  An edge connecting two vertices represents a
data dependency that will require communication if the vertices are
located in different tasks.  Notice that different vertices have
varying numbers of neighbors.  (Image courtesy of M.
S. Shephard.)</em><A NAME=figfem>&#160;</A><BR>
<P>
<P>
The examples considered previously are all of static,
structured communication, in which a task's communication partners
form a regular pattern such as a tree or a grid and do not change over
time.  In other cases, communication patterns may be considerably more
<A NAME=1418>&#160;</A>
complex.  For example, in finite element methods used in engineering
<A NAME=1419>&#160;</A>
calculations, the computational grid may be shaped to follow an
irregular object or to provide high resolution in critical regions
(Figure <A HREF="node17.html#figfem">2.9</A>).  Here, the channel structure representing the
communication partners of each grid point is quite irregular and
data-dependent and, furthermore, may change over time if the grid is
refined as a simulation evolves.
<P>
<A NAME=1421>&#160;</A>
Unstructured communication patterns do not generally cause conceptual
difficulties in the early stages of a design.  For example, it is
straightforward to define a single task for each vertex in a finite
element graph and to require communication for each edge.  However,
unstructured communication complicates the tasks of agglomeration and
mapping.  In particular, sophisticated algorithms can be required to
determine an agglomeration strategy that both creates tasks of
approximately equal size and minimizes communication requirements by
creating the least number of intertask edges.  Algorithms of this sort
are discussed in Section <A HREF="node19.html#seclbalgs">2.5.1</A>.  If communication
<A NAME=1423>&#160;</A>
requirements are dynamic, these algorithms must be applied frequently
during program execution, and the cost of these algorithms must be
weighed against their benefits.
<P>
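<P>
The quantities that such an agglomeration algorithm must trade off are easy to
state precisely.  The following minimal sketch (in Python; it is not from the
book, and the tiny mesh and vertex-to-task assignment are made-up data)
evaluates a candidate agglomeration by reporting the size of each task and the
number of intertask edges, each of which will require communication.
<P>
<PRE>
# A minimal sketch (not from the book): given a mesh as an edge list and a
# candidate assignment of vertices to tasks, compute the two quantities an
# agglomeration algorithm tries to balance: task sizes and the number of
# intertask (cut) edges.  The mesh and assignment are made-up data.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
part = [0, 0, 0, 1, 1, 1]          # vertex i is placed in task part[i]

sizes = {}
for task in part:
    sizes[task] = sizes.get(task, 0) + 1

# An edge whose endpoints lie in different tasks will require communication.
cut = sum(1 for u, v in edges if part[u] != part[v])

print("task sizes:", sizes)                       # {0: 3, 1: 3}
print("intertask edges (communication):", cut)    # 3
</PRE>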

<H2><A NAME=SECTION02334000000000000000>2.3.4 Asynchronous Communication</A></H2>
<P>
<A NAME=seccommas>&#160;</A>
<P>
The examples considered in the preceding section have all featured
synchronous communication, in which both producers and consumers are
aware when communication operations are required, and producers
explicitly send
<A NAME=1426>&#160;</A>
data to consumers.  In <em> asynchronous
 </em> communication, tasks that
<A NAME=1428>&#160;</A>
possess data (producers) are not able to determine when other tasks
(consumers) may require data; hence, consumers must explicitly request
data from producers.
<P>
<P><A NAME=2558>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img207.gif">
<BR><STRONG>Figure 2.10:</STRONG> <em> Using separate "data tasks" to service read and write
requests on a distributed data structure.  In this figure, four
computation tasks (C) generate read and write requests to eight data
items distributed among four data tasks (D).  Solid lines represent
requests; dashed lines represent replies.  One compute task and one
data task could be placed on each of four processors so as to
distribute computation and data equitably.</em><A NAME=figasyncdata>&#160;</A><BR>
<P>
<P>
This situation commonly occurs when a computation is structured as a
set of tasks that must periodically read and/or write elements of a
shared data structure.  Let us assume that this data structure is too
large or too frequently accessed to be encapsulated in a single task.
Hence, a mechanism is needed that allows this data structure to be
distributed while supporting asynchronous read and write operations on
its components.  Possible mechanisms include the following.
<P>
<OL><LI>
<A NAME=1434>&#160;</A>
The data structure is distributed among the computational tasks.  Each
<A NAME=1435>&#160;</A>
task both performs computation and generates requests for data located
in other tasks.  It also periodically interrupts its own computation
and <em> polls
 </em> for pending requests.
<P>
<LI>
The distributed data structure is encapsulated in a second set of
tasks responsible only for responding to read and write requests
(Figure <A HREF="node17.html#figasyncdata">2.10</A>); a sketch of this
approach appears after this list.
<P>
<LI>
On a computer that supports a shared-memory programming model,
computational tasks can access shared data without any special
arrangements.  However, care must be taken to ensure that read and
write operations on this shared data occur in the proper order.
<P>
</OL>
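<P>
The following minimal sketch (in Python, using multiprocessing queues; it is
not from the book, and the task counts, data values, and names are
illustrative assumptions) shows the second mechanism: a dedicated data task
loops servicing read requests on its portion of the distributed structure,
while compute tasks explicitly request the items they need and wait for the
replies.  Write requests and error handling are omitted for brevity.
<P>
<PRE>
# A minimal sketch (not from the book) of mechanism 2: a dedicated "data task"
# services read requests on its part of a distributed data structure, while
# separate compute tasks send requests and wait for replies.  Queues stand in
# for channels; the data values and item indices are illustrative.
from multiprocessing import Process, Queue

def data_task(store, requests, replies):
    # Loop servicing read requests until a None sentinel arrives.
    while True:
        msg = requests.get()
        if msg is None:
            break
        sender, index = msg
        replies[sender].put(store[index])   # reply with the requested item

def compute_task(me, requests, reply):
    # Read two remote items, then combine them in a local "computation".
    total = 0.0
    for index in (me, me + 2):
        requests.put((me, index))           # request: (who is asking, which item)
        total += reply.get()                # wait for the data task's reply
    print("compute task", me, "read total", total)

if __name__ == "__main__":
    store = [10.0 * i for i in range(4)]    # the items held by the data task
    requests = Queue()                      # solid lines in Figure 2.10
    replies = {0: Queue(), 1: Queue()}      # dashed lines in Figure 2.10

    server = Process(target=data_task, args=(store, requests, replies))
    workers = [Process(target=compute_task, args=(i, requests, replies[i]))
               for i in (0, 1)]
    server.start()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    requests.put(None)                      # tell the data task to shut down
    server.join()
</PRE>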
<P>
<A NAME=1439>&#160;</A>
Each strategy has advantages and disadvantages; in addition, the
performance characteristics of each approach vary from machine to
machine.  The first strategy can result in convoluted, nonmodular
programs because of the need to intersperse polling operations 
<A NAME=1440>&#160;</A>
throughout application code.  In addition, polling can be an expensive
operation on some computers, in which case we must trade off the cost
of frequent polling against the benefit of rapid response to remote
requests.  The second strategy is more modular: responsibility for the
shared data structure is encapsulated in a separate set of tasks.
However, this strategy makes it hard to exploit locality because,
strictly speaking, there are no local data: all read and write
operations require communication.  Also, switching between the
computation and data tasks can be expensive on some machines.
<P>
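<P>
To make the first strategy's structure concrete, the following minimal sketch
(in Python, using multiprocessing queues; it is not from the book, and the two
tasks, the data values, and the fixed request counts are illustrative
assumptions) shows each task interleaving its own computation with nonblocking
polls for pending requests; this interleaving is exactly the source of the
nonmodularity noted above.
<P>
<PRE>
# A minimal sketch (not from the book) of mechanism 1: each task both owns
# part of the data and computes, periodically interrupting its computation
# to poll for requests from the other task.  In this symmetric example each
# task requests two remote items and must answer two requests in return.
from multiprocessing import Process, Queue
from queue import Empty

def task(me, other, store, requests_in, requests_out, replies_in, replies_out):
    needed = [2 * other, 2 * other + 1]     # the items held by the other task
    received, served = 0, 0
    for index in needed:
        requests_out.put(index)             # ask the other task for this item

    while received != len(needed) or served != len(needed):
        # ... a chunk of local computation would go here ...

        # Poll for pending requests and service them without blocking.
        try:
            while True:
                index = requests_in.get_nowait()
                replies_out.put(store[index - 2 * me])
                served += 1
        except Empty:
            pass

        # Poll for replies to our own requests.
        try:
            while True:
                value = replies_in.get_nowait()   # value would feed the computation
                received += 1
        except Empty:
            pass

    print("task", me, "answered", served, "requests, received", received, "items")

if __name__ == "__main__":
    q_req = [Queue(), Queue()]              # q_req[i]: requests addressed to task i
    q_rep = [Queue(), Queue()]              # q_rep[i]: replies addressed to task i
    stores = [[0.0, 1.0], [10.0, 11.0]]     # task i owns items 2*i and 2*i+1

    procs = [Process(target=task,
                     args=(i, 1 - i, stores[i],
                           q_req[i], q_req[1 - i], q_rep[i], q_rep[1 - i]))
             for i in (0, 1)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
</PRE>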

<H2><A NAME=SECTION02335000000000000000>2.3.5 Communication Design Checklist</A></H2>
<P>
<A NAME=secrules2>&#160;</A>
<P>
<A NAME=1443>&#160;</A>
Having devised a partition and a communication structure for our
parallel algorithm, we now evaluate our design using the following
design checklist.  As in Section <A HREF="node16.html#secrules1">2.2.3</A>, these are
guidelines intended to identify nonscalable features, rather than hard
and fast rules.  However, we should be aware of when a design violates
them and why.
<A NAME=1445>&#160;</A>
<P>
<OL><LI>
Do all tasks perform about the same number of communication
operations?  Unbalanced communication requirements suggest a
nonscalable construct.  Revisit your design to see whether
communication operations can be distributed more equitably.  For
example, if a frequently accessed data structure is encapsulated in a
single task, consider distributing or replicating this data structure.
<P>
<LI>
Does each task communicate only with a small number of neighbors?  If
each task must communicate with many other tasks, evaluate the
possibility of formulating this global communication in terms of a
local communication structure, as was done in the pairwise
interactions algorithm of Section <A HREF="node10.html#exinteractions">1.4.2</A> and the
summation algorithm of Section <A HREF="node17.html#seccommglobal">2.3.2</A>.
<P>
<LI>
Are communication operations able to proceed concurrently?  If not,
your algorithm is likely to be inefficient and nonscalable.  Try to
use divide-and-conquer techniques to uncover concurrency, as in the
summation algorithm of Section <A HREF="node17.html#seccommglobal">2.3.2</A>.
<P>
<LI>
Is the computation associated with different tasks able to proceed
concurrently?  If not, your algorithm is likely to be inefficient and
nonscalable.  Consider whether you can reorder communication and
computation operations.  You may also wish to revisit your problem
specification, as was done in moving from a simple Gauss-Seidel to a
red-black algorithm in Section <A HREF="node17.html#seclocal">2.3.1</A>.
</OL>
<P>

<B> Next:</B> <A NAME=tex2html2056 HREF="node18.html">2.4 Agglomeration</A>
<B>Up:</B> <A NAME=tex2html2054 HREF="node14.html">2 Designing Parallel Algorithms</A>
<B> Previous:</B> <A NAME=tex2html2048 HREF="node16.html">2.2 Partitioning</A>
<BR><HR><P>
<P><ADDRESS>
<I>&#169; Copyright 1995 by Ian Foster</I>
</ADDRESS>
</BODY>
