from entire arrays (Table <A HREF="node84.html#tabf90intrinsics" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node84.html#tabf90intrinsics">7.1</A>) and hence involve
considerable communication if the arrays to which they are applied are
distributed. For example, operations such as <tt> MAXVAL</tt> and <tt>
SUM</tt> perform array reductions which, as noted in Chapter <A HREF="node14.html#chap2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node14.html#chap2">2</A>,
can be performed in <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img986.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img986.gif"> steps on <em> P</em>
processors, for a total
communication cost of <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img987.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img987.gif">. This cost is independent
of the size of the arrays to be reduced. In contrast, the cost of a
<tt> TRANSPOSE</tt> or <tt> MATMUL</tt> operation depends on both the size and
distribution of the operation's arguments. Other operations such as
<tt> DOT_PRODUCT</tt> involve communication only if their arguments are
not aligned.
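<P>
The claim that reduction cost does not depend on array size can be illustrated with a small model. The sketch below is written in Python rather than HPF and assumes a binary-tree combining scheme, one common way to perform a reduction; the names <tt>local_sums</tt> and <tt>tree_reduce</tt> are hypothetical helpers, not part of any HPF library. Each simulated processor first reduces its own block locally, and only the per-processor partial results are then combined.

```python
def local_sums(data, nprocs):
    """Split data into contiguous chunks, one per simulated processor,
    and reduce each chunk locally (no communication needed)."""
    chunk = -(-len(data) // nprocs)  # ceiling division
    return [sum(data[k * chunk:(k + 1) * chunk]) for k in range(nprocs)]


def tree_reduce(partials, op):
    """Combine one partial value per processor pairwise.
    Returns (result, number of communication steps)."""
    values = list(partials)
    steps = 0
    stride = 1
    while stride < len(values):
        # In one step, processor k receives from processor k + stride.
        for k in range(0, len(values) - stride, 2 * stride):
            values[k] = op(values[k], values[k + stride])
        stride *= 2
        steps += 1
    return values[0], steps


small_total, small_steps = tree_reduce(local_sums(list(range(1, 17)), 8), lambda a, b: a + b)
large_total, large_steps = tree_reduce(local_sums(list(range(1, 1601)), 8), lambda a, b: a + b)
print(small_total, small_steps)
print(large_total, large_steps)
```

Under these assumptions, both the 16-element and the 1600-element array need the same three combining steps on eight processors; only the local, communication-free work grows with the array.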
<P>
<em> Array operations.</em> Array assignments and <tt> FORALL</tt> statements
<A NAME=11440> </A>
can result in communication if, in order to compute some array element <tt>
<A NAME=11441> </A>
A(i)</tt>, they require data values (e.g., <tt> B(j)</tt>) that are not on the
same processor. Program <A HREF="node89.html#fighpf2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node89.html#fighpf2">7.6</A> showed one example:
the references <tt> X(i,j-1)</tt> and <tt> X(i,j+1)</tt> resulted in
<A NAME=11446> </A>
communication. The <tt> CSHIFT</tt> operation is another common source of
communication.
<P>
<A NAME=11448> </A>
Cyclic distributions will often result in more communication than
<A NAME=11449> </A>
will block distributions. However, by scattering the computational grid
<A NAME=11450> </A>
over available processors, they can produce better load balance in
some applications. (Recall that this strategy was discussed in
Section <A HREF="node19.html#seclbalgs" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node19.html#seclbalgs">2.5.1</A>.)
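<P>
The tradeoff between communication and load balance can be made concrete with a toy model. The following Python sketch assumes a one-dimensional array of 12 elements on three processors, a hypothetical cost model in which element <tt>i</tt> costs <tt>i</tt> units of work, and a computation in which each element references its right-hand neighbor. None of these choices comes from the text; they are picked only to make the effect visible.

```python
import math

N, P = 12, 3  # illustrative sizes, not taken from the text


def block_owner(i):
    """0-based owner of element i (1-based) under a BLOCK distribution."""
    return (i - 1) // math.ceil(N / P)


def cyclic_owner(i):
    """0-based owner of element i (1-based) under a CYCLIC distribution."""
    return (i - 1) % P


def loads(owner):
    """Work per processor, assuming element i costs i units (hypothetical)."""
    totals = [0] * P
    for i in range(1, N + 1):
        totals[owner(i)] += i
    return totals


def remote_neighbor_refs(owner):
    """Number of references to element i+1 that cross a processor boundary."""
    return sum(1 for i in range(1, N) if owner(i) != owner(i + 1))


print("block :", loads(block_owner), remote_neighbor_refs(block_owner))
print("cyclic:", loads(cyclic_owner), remote_neighbor_refs(cyclic_owner))
```

In this model the block distribution yields loads of 10, 26, and 42 with only 2 remote neighbor references, whereas the cyclic distribution yields the more even loads 22, 26, and 30 at the price of 11 remote references.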
<P>
<P><A NAME=12071> </A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img988.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img988.gif">
<BR><STRONG>Figure 7.9:</STRONG> <em> Communication requirements of various <tt> FORALL</tt>
statements. The arrays <tt> A</tt> and <tt> B</tt> are aligned and distributed
in a blocked fashion on three processors, while the array <tt> C</tt> is
distributed in a cyclic fashion.</em><A NAME=fighpfx> </A><BR>
<P>
<P>
To help you develop intuition regarding communication costs,
we present in
Figure <A HREF="node89.html#fighpfx" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node89.html#fighpfx">7.9</A> the communication requirements
associated with a number of different <tt> FORALL</tt> statements
for three arrays <tt> A</tt>, <tt> B</tt>, and <tt> C</tt> distributed as
follows.
<PRE><TT>
!HPF$  PROCESSORS pr(3)
       integer A(8), B(8), C(8)
!HPF$  ALIGN B(:) WITH A(:)
!HPF$  DISTRIBUTE A(BLOCK) ONTO pr
!HPF$  DISTRIBUTE C(CYCLIC) ONTO pr
</TT></PRE>
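<P>
The mapping implied by these directives can be modeled in a few lines of Python. The sketch assumes that <tt>BLOCK</tt> uses blocks of size ceiling(8/3) = 3, that <tt>CYCLIC</tt> deals elements out round-robin, and that the processor owning the left-hand-side element performs each assignment. The three access patterns are illustrative choices and are not necessarily those drawn in Figure 7.9; the function names are hypothetical.

```python
import math


def block_owner(i, n, p):
    """1-based owner of element i under BLOCK with blocks of ceil(n/p)."""
    return (i - 1) // math.ceil(n / p) + 1


def cyclic_owner(i, n, p):
    """1-based owner of element i under CYCLIC (round-robin)."""
    return (i - 1) % p + 1


def remote_refs(lhs_owner, rhs_owner, pairs, n, p):
    """Count (i, j) pairs whose left-hand element i and right-hand
    element j reside on different processors."""
    return sum(1 for i, j in pairs if lhs_owner(i, n, p) != rhs_owner(j, n, p))


N, P = 8, 3
# FORALL (i=1:8) A(i) = B(i)      both BLOCK and aligned
aligned = remote_refs(block_owner, block_owner, [(i, i) for i in range(1, N + 1)], N, P)
# FORALL (i=2:8) A(i) = B(i-1)    aligned, but shifted by one element
shifted = remote_refs(block_owner, block_owner, [(i, i - 1) for i in range(2, N + 1)], N, P)
# FORALL (i=1:8) A(i) = C(i)      BLOCK on the left, CYCLIC on the right
mixed = remote_refs(block_owner, cyclic_owner, [(i, i) for i in range(1, N + 1)], N, P)
print(aligned, shifted, mixed)
```

Under these assumptions the aligned assignment needs no remote values, the shifted one needs 2 (at the two block boundaries), and the block-to-cyclic one needs 6 of its 8 values from another processor.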
<P>
<em> Different mappings.</em> Even simple operations performed on
nonaligned arrays can result in communication. For example, the
assignment <tt> A=B</tt> can require considerable communication if arrays
<A NAME=11477> </A>
<tt> A</tt> and <tt> B</tt> have different distributions. The cost of this
sort of communication must be weighed against the cost of converting
to a common distribution before performing the operation.
<P>
<em> Procedure boundaries.</em> As discussed in Sections <A HREF="node41.html#secmoddd" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node41.html#secmoddd">4.2.1</A>
<A NAME=11483> </A>
and <A HREF="node87.html#sechpfmod" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node87.html#sechpfmod">7.5</A>, switching from one decomposition of an array to
another at a procedure boundary can result in considerable
communication. Although the precise amount of communication required
depends on the decomposition, the total cost summed over
<em> P</em>
processors of moving between decompositions of an
<em> M</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img989.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img989.gif"><em> N</em>
array will often be approximately <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img990.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img990.gif"><em> S M N</em>
, where <em> S</em>
is the size of an array element in
four-byte words. This cost arises because, generally, each of the
<em> P</em>
processors must communicate with every other processor,
and each of the <em> M N</em>
array elements must be communicated.
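<P>
The all-to-all character of such a redistribution can be checked with a small count. The Python sketch below assumes one particular change of decomposition, from blocks of rows to blocks of columns, for a 6-by-6 array on three processors; the sizes and the function names are hypothetical, and the sketch counts elements and processor pairs only, without modeling message costs.

```python
import math


def block_owner(index, extent, nprocs):
    """0-based owner of a 1-based index under a BLOCK distribution."""
    return (index - 1) // math.ceil(extent / nprocs)


def redistribution_traffic(m, n, p):
    """Count elements that change owner, and the distinct (source,
    destination) processor pairs involved, when an m-by-n array moves
    from a row-block to a column-block decomposition."""
    moved = 0
    pairs = set()
    for r in range(1, m + 1):
        for c in range(1, n + 1):
            src = block_owner(r, m, p)  # owner under row blocks
            dst = block_owner(c, n, p)  # owner under column blocks
            if src != dst:
                moved += 1
                pairs.add((src, dst))
    return moved, len(pairs)


moved, messages = redistribution_traffic(6, 6, 3)
print(moved, messages)
```

In this case 24 of the 36 elements change owner and all 6 ordered pairs of distinct processors exchange data, consistent with the observation that every processor communicates with every other and that most of the array must be moved.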
<P>
<em> Compiler optimization.</em> A good HPF compiler does not compile a
program statement by
<A NAME=11494> </A>
statement. Instead, it seeks to reduce communication costs by
combining communication operations and otherwise reorganizing program
statements. In addition, it may choose to use data distributions
different from those recommended by the programmer. Hence, it is
always necessary to verify analytic results using instrumentation
data.
<P>
<BR><HR><P>
<P><ADDRESS>
<I>© Copyright 1995 by <A href="msgs0.htm#6" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#6">Ian Foster</a></I>
</ADDRESS>
</BODY>