<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<HTML>
<!-- Converted with LaTeX2HTML 95.1 (Fri Jan 20 1995) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds -->
<HEAD>
<TITLE>7.1 Data Parallelism</TITLE>
</HEAD>
<BODY>
<meta name="description" value="7.1 Data Parallelism">
<meta name="keywords" value="book">
<meta name="resource-type" value="document">
<meta name="distribution" value="global">
<P>
<BR> <HR><a href="msgs0.htm#2"><img ALIGN=MIDDLE src="asm_color_tiny.gif" alt="[DBPP]"></a> <A NAME=tex2html2952 HREF="node82.html"><IMG ALIGN=MIDDLE ALT="previous" SRC="previous_motif.gif"></A> <A NAME=tex2html2960 HREF="node84.html"><IMG ALIGN=MIDDLE ALT="next" SRC="next_motif.gif"></A> <A NAME=tex2html2958 HREF="node82.html"><IMG ALIGN=MIDDLE ALT="up" SRC="up_motif.gif"></A> <A NAME=tex2html2962 HREF="node1.html"><IMG ALIGN=MIDDLE ALT="contents" SRC="contents_motif.gif"></A> <A NAME=tex2html2963 HREF="node133.html"><IMG ALIGN=MIDDLE ALT="index" SRC="index_motif.gif"></A> <a href="msgs0.htm#3"><img ALIGN=MIDDLE src="search_motif.gif" alt="[Search]"></a> <BR>
<B> Next:</B> <A NAME=tex2html2961 HREF="node84.html">7.2 Fortran 90</A>
<B>Up:</B> <A NAME=tex2html2959 HREF="node82.html">7 High Performance Fortran</A>
<B> Previous:</B> <A NAME=tex2html2953 HREF="node82.html">7 High Performance Fortran</A>
<BR><HR><P>
<H1><A NAME=SECTION03410000000000000000>7.1 Data Parallelism</A></H1>
<P>
We first provide a general introduction to data parallelism and
data-parallel languages, focusing on concurrency, locality, and
algorithm design.
<P>
<H2><A NAME=SECTION03411000000000000000>7.1.1 Concurrency</A></H2>
<P>
<A NAME=10592> </A>
Depending on the programming language used, the data ensembles
operated on in a data-parallel program may be regular (e.g., an array)
or irregular (e.g., a tree or sparse matrix). In F90 and HPF, the
data structures operated on are arrays. In contrast, the
<A NAME=10593> </A>
data-parallel language pC++
allows programs to operate not only on
arrays but also on trees, sets, and other more complex data
structures.
<P>
<A NAME=10594> </A>
Concurrency may be implicit or may be expressed by using explicit
<A NAME=10595> </A>
parallel constructs. For example, the F90 array assignment statement
is an <em> explicitly
</em> parallel construct; we write
<PRE>
   A = B*C    ! A, B, C are arrays
</PRE>
to specify that each element of array <tt> A</tt> is to be assigned the
product of the corresponding elements of arrays <tt> B</tt> and <tt> C</tt>.
<A NAME=10606> </A>
This statement also implies <em> conformality
</em>; that is, the three
<A NAME=10608> </A>
arrays have the same size and shape.
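<P>
As a small illustration (ours, not the book's), conformality requires
only that the operands have the same shape; they may be sections of
larger arrays:
<P>
<PRE>
   real A(4,8), B(4,8), C(8,8)
   A = B*C(1:4,:)    ! conformable: every operand has shape (4,8)
</PRE>
<P>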
In contrast, the following do-loop
<A NAME=10609> </A>
is <em> implicitly
</em> parallel: a compiler may be able to detect
that the various do-loop iterations are independent (meaning that one
iteration does not write a variable read or written by another) and
hence can be performed in parallel, but this detection requires some
analysis.
<P>
<PRE>
   do i = 1,m
      do j = 1,n
         A(i,j) = B(i,j)*C(i,j)
      enddo
   enddo
</PRE>
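<P>
Not every loop admits such treatment. As a contrasting (hypothetical)
example, the following loop carries a dependence from each iteration to
the next and so cannot be parallelized by this analysis:
<P>
<PRE>
   do i = 2,n
      A(i) = A(i-1) + B(i)   ! iteration i reads A(i-1), written by iteration i-1
   enddo
</PRE>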
<P>
A data-parallel program is a sequence of explicitly and implicitly
parallel statements. On a distributed-memory parallel computer,
compilation typically translates these statements into an SPMD
<A NAME=10618> </A>
program (Section <A HREF="node9.html#sec1other">1.3.2</A>), in which each processor executes
the same code on a subset of the data structures. In many cases, the
compiler can construct this program by first partitioning data
structures into disjoint subdomains, one per processor, and then
<A NAME=10620> </A>
applying the owner computes rule to determine which operations should
be performed on each processor. This rule states that the computation
required to produce a datum is performed on the processor on which
that datum is located. The owner computes rule is only one possible
strategy, however; compilers may use other techniques to assign
computation to processors.
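<P>
To make this concrete, here is a minimal sketch (our illustration, not
taken from the text) of the SPMD code a compiler might generate for the
assignment <tt> A = B*C</tt>, assuming all three arrays of size
<tt> n</tt> are partitioned blockwise so that each of <tt> nprocs</tt>
processors owns one contiguous block:
<P>
<PRE>
   ! Hypothetical compiler-generated SPMD fragment; every processor
   ! executes the same code on its own block of the arrays.
   nlocal = n/nprocs              ! assume nprocs divides n
   lo = my_id*nlocal + 1          ! my_id in 0..nprocs-1
   hi = lo + nlocal - 1
   do i = lo, hi
      A(i) = B(i)*C(i)            ! owner computes: all operands local
   enddo
</PRE>
<P>
Because corresponding elements of <tt> A</tt>, <tt> B</tt>, and
<tt> C</tt> are owned by the same processor, this fragment requires no
communication.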
<P>
Compilation also introduces communication operations when computation
mapped to one processor requires data mapped to another processor. As
an example, consider the following program.
<P>
<PRE>
   real y, s, X(100)    ! y, s scalars; X an array
   X = X*y              ! Multiply each X(i) by y
   do i = 2,99
      X(i) = (X(i-1) + X(i+1))/2   ! Communication required
   enddo
   s = SUM(X)           ! Communication required
</PRE>
<P>
The communication requirements of this program depend on how the
variables <tt> X</tt>, <tt> y</tt>, and <tt> s</tt> are distributed over
processors. If <tt> X</tt> is distributed while <tt> y</tt> and <tt> s</tt> are
replicated, then the first assignment can proceed without
communication, with each <tt> X(i)</tt> being computed by the processor
that owns <tt> X(i)</tt>. The second assignment (in the do-loop) requires
communication: the processor computing <tt> X(i)</tt> requires the values
of <tt> X(i-1)</tt> and <tt> X(i+1)</tt>, which may be located on different
processors. The summation also requires communication.
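<P>
As an illustration, the following sketch (ours, using MPI message
passing for concreteness; the text does not prescribe a particular
mechanism) shows one way the do-loop and summation might translate:
each processor exchanges the boundary elements of its block of
<tt> X</tt> with its neighbors, updates its local elements, and
contributes to a global reduction.
<P>
<PRE>
! Hypothetical SPMD translation, assuming X(100) is distributed
! blockwise over nprocs processors (nprocs dividing 100); each
! processor keeps its block in xloc(1:nlocal) plus halo cells
! xloc(0) and xloc(nlocal+1) for neighbors' boundary values.
program halo_sketch
   include 'mpif.h'
   integer, parameter :: n = 100
   integer :: rank, nprocs, nlocal, left, right, i, ilo, ihi, ierr
   integer :: status(MPI_STATUS_SIZE)
   real, allocatable :: xloc(:), xnew(:)
   real :: slocal, s

   call MPI_INIT(ierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
   nlocal = n/nprocs
   allocate(xloc(0:nlocal+1), xnew(1:nlocal))
   xloc(1:nlocal) = 1.0            ! placeholder initial values

   ! Neighbor ranks; MPI_PROC_NULL makes boundary exchanges no-ops.
   left  = rank - 1
   right = rank + 1
   if (rank == 0)          left  = MPI_PROC_NULL
   if (rank == nprocs - 1) right = MPI_PROC_NULL

   ! Communication: exchange boundary elements with both neighbors.
   call MPI_SENDRECV(xloc(1), 1, MPI_REAL, left, 0, xloc(nlocal+1), 1, MPI_REAL, right, 0, MPI_COMM_WORLD, status, ierr)
   call MPI_SENDRECV(xloc(nlocal), 1, MPI_REAL, right, 1, xloc(0), 1, MPI_REAL, left, 1, MPI_COMM_WORLD, status, ierr)

   ! Local computation; endpoints X(1) and X(100) are not updated,
   ! matching do i = 2,99. (New values are computed from old values,
   ! a data-parallel reading of the loop.)
   ilo = 1
   ihi = nlocal
   if (rank == 0)          ilo = 2
   if (rank == nprocs - 1) ihi = nlocal - 1
   do i = ilo, ihi
      xnew(i) = (xloc(i-1) + xloc(i+1))/2
   enddo
   xloc(ilo:ihi) = xnew(ilo:ihi)

   ! Summation: local sum, then a global reduction.
   slocal = sum(xloc(1:nlocal))
   call MPI_ALLREDUCE(slocal, s, 1, MPI_REAL, MPI_SUM, MPI_COMM_WORLD, ierr)

   call MPI_FINALIZE(ierr)
end program halo_sketch
</PRE>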
<P>
<H2><A NAME=SECTION03412000000000000000>7.1.2 Locality</A></H2>
<P>
<A NAME=10646> </A>
Data placement is an essential part of a data-parallel algorithm, since
the mapping of data to processors determines the <em> locality
</em>
of data references and hence, to a large extent, the performance of a
parallel program. For example, the simple array assignment <tt>
A = B*C</tt> either can proceed without any communication or can require
communication for every assignment, depending on whether corresponding
elements of the arrays <tt> A</tt>, <tt> B</tt>, and <tt> C</tt> are located on
the same or different processors.
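<P>
For instance, using the HPF data distribution directives introduced
later in this chapter (a sketch of ours, not the book's example), a
programmer can co-locate corresponding elements so that the assignment
proceeds without communication:
<P>
<PRE>
   real A(100), B(100), C(100)
!HPF$ PROCESSORS pr(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO pr
!HPF$ ALIGN B(i) WITH A(i)
!HPF$ ALIGN C(i) WITH A(i)

   A = B*C    ! corresponding elements co-located: no communication
</PRE>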
<P>