<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- Converted with LaTeX2HTML 95.1 (Fri Jan 20 1995) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds -->
<HTML>
<HEAD>
<TITLE>3.3 Developing Models</TITLE>
<META NAME="description" CONTENT="3.3 Developing Models">
<META NAME="keywords" CONTENT="book">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
</HEAD>
<BODY>
<H1><A NAME=SECTION02430000000000000000>3.3 Developing Models</A></H1>
<P>
<A NAME=3157>&#160;</A>
A good performance model, like a good scientific theory, is able to
explain available observations and predict future circumstances, while
abstracting unimportant details.  Amdahl's law, empirical
observations, and asymptotic analysis do not satisfy the first of
these requirements.  On the other hand, conventional computer system
modeling techniques, which typically involve detailed simulations of
individual hardware components, introduce too many details to be of
practical use to parallel programmers.  In the rest of this chapter,
we introduce performance modeling techniques that provide an
intermediate level of detail.  These techniques are certainly not
appropriate for all purposes: they are specialized for the
multicomputer architecture and do not take into account, for example,
cache behavior.  However, they have been proven useful in a wide range
of parallel algorithm design problems.  The chapter notes provide
references to other approaches.
<P>
The performance models considered here specify a metric such as
<A NAME=3158>&#160;</A>
execution time <em>T</em> as a function of problem size <em>N</em>, number
<A NAME=3161>&#160;</A>
of processors <em>P</em>, number of tasks <em>U</em>, and other algorithm
and hardware characteristics:
<P><em>T</em> = <em>f</em>(<em>N</em>, <em>P</em>, <em>U</em>, ...)<P>
<P>
We define the <em>execution time</em> of a parallel program as the time
that elapses from when the first processor starts executing on the
<A NAME=3167>&#160;</A>
problem to when the last processor completes execution.  This
definition is not entirely adequate for a timeshared parallel computer
but suffices for our purposes.  During execution, each processor is
computing, communicating, or idling, as illustrated in
Figure <A HREF="node29.html#figeff">3.2</A>.
<em>T</em><sub>i</sub><sup>comp</sup>, <em>T</em><sub>i</sub><sup>comm</sup>, and
<em>T</em><sub>i</sub><sup>idle</sup> are the time spent computing,
communicating, and idling, respectively, on the <em>i</em>th processor.
Hence, total execution time <em>T</em> can be defined in two ways: as
the sum of computation, communication, and idle times on an arbitrary
processor <em>j</em>,
<P>
<P><em>T</em> = <em>T</em><sub>j</sub><sup>comp</sup> + <em>T</em><sub>j</sub><sup>comm</sup> + <em>T</em><sub>j</sub><sup>idle</sup><P>
or as the sum of these times over all processors divided by the
number of processors <em>P</em>,
<P>
<P><em>T</em> = (1/<em>P</em>) &#8721;<sub>i=0</sub><sup>P-1</sup> (<em>T</em><sub>i</sub><sup>comp</sup> + <em>T</em><sub>i</sub><sup>comm</sup> + <em>T</em><sub>i</sub><sup>idle</sup>)<P>
<P>
<P><A NAME=4559>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="Figure 3.2" SRC="img384.gif">
<BR><STRONG>Figure 3.2:</STRONG> <em>Activity plot during execution of a parallel program on eight
processors.  Each processor spends its time computing, communicating,
or idling.  T is the total execution time.</em><A NAME=figeff>&#160;</A><BR>
<P>
<P>
The latter definition is often more useful, since it is typically easier
to determine the total computation and communication performed by a
parallel algorithm than the time spent computing and communicating on
individual processors.
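<P>
The two definitions above are easy to check numerically.  The
following Python sketch (an illustration with made-up per-processor
times, not code from the book) computes <em>T</em> both ways:
<PRE>
# Sketch: total execution time T from per-processor activity times.
# The numbers are hypothetical; on each processor i, the interval
# comp[i] + comm[i] + idle[i] spans the same wall-clock time T,
# so the two definitions agree.

P = 4                                # number of processors
comp = [6.0, 5.0, 7.0, 4.0]          # time spent computing on each processor
comm = [2.0, 2.5, 1.0, 3.0]          # time spent communicating
idle = [2.0, 2.5, 2.0, 3.0]          # time spent idling

# Definition 1: computation + communication + idle time on an
# arbitrary processor j.
j = 0
T1 = comp[j] + comm[j] + idle[j]

# Definition 2: the same sum taken over all processors, divided by P.
T2 = sum(comp[i] + comm[i] + idle[i] for i in range(P)) / P

print(T1, T2)                        # both print 10.0
</PRE>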
<P>
Thus, the goal is to develop mathematical expressions that specify
execution time as functions of <em> N</em>
, <em> P</em>
, etc.  These models
should be as simple as possible, while providing acceptable accuracy.
We use the following techniques to reduce model complexity.
<P>
<UL><LI>
We base our model on the idealized <i>multicomputer</i> parallel
architecture model introduced in Chapter <A HREF="node6.html#chap1">1</A>.  Low-level
hardware details such as memory hierarchies and the topology of the
interconnection network are introduced only if there is evidence to
suggest that they are important.  (The latter issue is discussed in
Section <A HREF="node33.html#secperfrefine">3.7</A>.)
<P>
<LI> 
<A NAME=3209>&#160;</A>
We use <em>scale analysis</em> to identify insignificant effects
that can be ignored in the analysis.  For example, if an algorithm
consists of an initialization step followed by several thousand
iterations of a computation step, then unless initialization is very
expensive we consider only the computation step in our analysis.
<P>
<LI> 
We use <em>empirical studies</em> to calibrate simple models rather
than developing more complex models from first principles, as
illustrated by the sketch below.
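<P>
As a concrete illustration of such calibration (using a hypothetical
two-parameter model and made-up timing data), the following Python
sketch fits <em>T</em> = <em>t</em><sub>0</sub> + <em>t</em><sub>1</sub><em>N</em>/<em>P</em>
by ordinary least squares:
<PRE>
# Sketch: calibrating a simple performance model from measurements.
# Assumed model: T = t0 + t1*(N/P), a fixed overhead t0 plus a
# per-point cost t1.  A linear least-squares fit to measured run
# times recovers the two coefficients.

measurements = [           # (N, P, measured T in seconds) -- made-up data
    (1000, 1, 1.05),
    (1000, 2, 0.56),
    (2000, 2, 1.04),
    (4000, 4, 1.06),
]

xs = [N / P for N, P, T in measurements]
ys = [T for N, P, T in measurements]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for y = t0 + t1*x.
t1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
      / sum((x - mean_x) ** 2 for x in xs))
t0 = mean_y - t1 * mean_x

print(f"t0 = {t0:.4f} s, t1 = {t1:.6f} s per point")
</PRE>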
</UL>
<H2><A NAME=SECTION02431000000000000000>3.3.1 Execution Time</A></H2>
<P>
We first examine the three components of total execution time:
computation time, communication time, and idle time.
<P>
<H4><A NAME=SECTION02431010000000000000> Computation Time.</A></H4>
<P>
The <i>computation time</i> of an algorithm (<em>T</em><sub>comp</sub>) is the time
<A NAME=3217>&#160;</A>
spent performing computation rather than communicating or idling.  If
we have a sequential program that performs the same computation as the
parallel algorithm, we can determine <em>T</em><sub>comp</sub> by timing that
program.  Otherwise, we may have to implement key kernels.
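<P>
For instance, <em>T</em><sub>comp</sub> can be estimated by timing a
sequential implementation of the computation.  The sketch below uses
a dense matrix-vector product as a stand-in kernel (the kernel and
sizes are illustrative, not taken from the book):
<PRE>
# Sketch: estimating T_comp by timing a stand-in sequential kernel.
# Repeating the measurement and keeping the minimum reduces the
# influence of transient system activity.

import time

def matvec(A, x):
    """Dense matrix-vector product: the computation being modeled."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

N = 500
A = [[1.0] * N for _ in range(N)]
x = [1.0] * N

best = float("inf")
for _ in range(5):                   # repeat and keep the best time
    start = time.perf_counter()
    matvec(A, x)
    best = min(best, time.perf_counter() - start)

print(f"estimated T_comp for one kernel call: {best:.4f} s")
</PRE>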
<P>
Computation time will normally depend on some measure of problem size,
whether that size is represented by a single parameter <em>N</em> or by a set of
parameters, <em>N</em><sub>1</sub>, <em>N</em><sub>2</sub>, ..., <em>N</em><sub>m</sub>.  If the parallel algorithm
replicates computation, then computation time will also depend on the
number of tasks or processors.  In a heterogeneous parallel computer
(such as a workstation network), computation time can vary according
to the processor on which computation is performed.
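<P>
For example (a made-up model with hypothetical parameters, not one
from the book), replicated computation gives <em>T</em><sub>comp</sub> an
explicit dependence on <em>P</em>:
<PRE>
# Sketch: a hypothetical computation-time model with replication.
# Each processor handles N/P points at t_c seconds per point, plus
# N_b boundary points that every processor computes in full; the
# replicated term is independent of P, so total computation summed
# over all processors grows as P increases.

def t_comp(N, P, t_c=1.0e-6, N_b=100):
    """Per-processor computation time: partitioned plus replicated work."""
    partitioned = t_c * N / P        # work divided among processors
    replicated = t_c * N_b           # work repeated on every processor
    return partitioned + replicated

for P in (1, 4, 16):
    print(P, P * t_comp(10**6, P))   # total computation time rises with P
</PRE>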
<P>
Computation time will also depend on characteristics of processors and
their memory systems.  For example, scaling problem size or number of
processors can change cache performance or the effectiveness of
processor pipelining.  As a consequence, one cannot automatically
assume that total computation time will stay constant as the number of
processors changes.
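<P>
One way to check this empirically (a sketch of the methodology only;
in practice one would time a compiled kernel, since interpreter
overhead masks cache effects in pure Python) is to measure the cost
per element as the working set grows:
<PRE>
# Sketch: looking for size-dependent computation cost.  If the time
# per element changes as the working set grows past the cache sizes,
# total computation time will not stay constant as N/P changes.

import time

def per_element_cost(n):
    """Seconds per element for a simple sum over n floats."""
    data = [float(i) for i in range(n)]
    start = time.perf_counter()
    s = 0.0
    for v in data:
        s += v
    return (time.perf_counter() - start) / n

for n in (10**4, 10**5, 10**6):
    print(f"n = {n:>8}: {per_element_cost(n):.3e} s/element")
</PRE>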
<P>
