<html><!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!Converted with LaTeX2HTML 95.1 (Fri Jan 20 1995) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds >
<HEAD>
<TITLE>9.4 Tools</TITLE>
</HEAD>
<BODY>
<meta name="description" value="9.4 Tools">
<meta name="keywords" value="book">
<meta name="resource-type" value="document">
<meta name="distribution" value="global">
<P>
<BR><HR><P>
<H1><A NAME=SECTION03640000000000000000>9.4 Tools</A></H1>
<P>
<A NAME=secpttools>&#160;</A>
<P>
Next, we describe several public-domain and commercial
performance tools, explaining how each is used to collect and display
performance data.  While the tools exhibit important differences,
there are also many similarities, and frequently our choice of tool
will be driven more by availability than by the features provided.
<P>
<H2><A NAME=SECTION03641000000000000000>9.4.1 Paragraph</A></H2>
<P>
<A NAME=14430>&#160;</A>
Paragraph is a portable trace analysis and visualization package
developed at Oak Ridge National Laboratory for message-passing
programs.  It was originally developed to analyze traces generated by
a message-passing library called the Portable Instrumented
Communication Library (PICL) but can in principle be used to examine
any trace that conforms to its format.  Like many message-passing
systems, PICL can be instructed to generate execution traces
automatically, without programmer intervention.
<P>
Paragraph is an interactive tool.  Having specified a trace file, the
user instructs Paragraph to construct various displays concerning processor
utilization, communication, and the like.  The trace files consumed by
Paragraph include, by default, time-stamped events for every
communication operation performed by a parallel program.  Paragraph
performs on-the-fly data reduction to generate the required images.
Users can also record events that log the start and end of
user-defined "tasks."
<P>
Paragraph's <em> processor utilization
 </em> displays allow the user
<A NAME=14432>&#160;</A>
to distinguish time spent computing, communicating, and idling.
Communication time represents time spent in system communication
routines, while idle time represents time spent waiting for messages.
These displays can be used to identify load imbalances and code
components that suffer from excessive communication and idle time
costs.  Some of these displays are shown in
<A HREF="#paragraph1">Plate 8</A>,
which shows a Gantt chart (top part) and a space-time diagram (bottom
part) for a parallel climate model executing on 64 Intel DELTA
<A NAME=14437>&#160;</A>
processors.  In the space-time diagram, the color of the lines
representing communications indicates the size of the message being
transferred.  The climate model is a complex program with multiple
phases.  Initially, only processor 0 is active.  Subsequently, the
model alternates between computation and communication phases. Some of
the communication phases involve substantial idle time, which should
be the subject of further investigation.
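<P>
The computing, communicating, and idling breakdown described above can be sketched in a few lines. This is a minimal illustration, not Paragraph's implementation: the record layout <code>(processor, state, start, end)</code>, the sample data, and the function name <code>utilization</code> are all assumptions made for the example, and PICL's actual trace format is not reproduced here.

```python
from collections import defaultdict

# Hypothetical trace records: (processor, state, start_time, end_time).
# This layout is an assumption for illustration; it is not PICL's trace format.
TRACE = [
    (0, "compute", 0.0, 4.0),
    (0, "communicate", 4.0, 5.0),
    (0, "idle", 5.0, 6.0),
    (1, "compute", 0.0, 2.0),
    (1, "communicate", 2.0, 3.0),
    (1, "idle", 3.0, 6.0),
]


def utilization(trace):
    """Return {processor: {state: fraction of that processor's traced time}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for proc, state, start, end in trace:
        totals[proc][state] += end - start
    result = {}
    for proc, by_state in totals.items():
        elapsed = sum(by_state.values())
        result[proc] = {s: t / elapsed for s, t in by_state.items()}
    return result


if __name__ == "__main__":
    for proc, fractions in sorted(utilization(TRACE).items()):
        summary = ", ".join(f"{s}={f:.0%}" for s, f in sorted(fractions.items()))
        print(f"processor {proc}: {summary}")
```

In this made-up trace, processor 1 idles for half of the run while processor 0 is mostly computing, which is the kind of load imbalance a utilization display is meant to expose.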
<P>
<em> Communication
 </em> displays can be used both to obtain more
detailed information on communication volumes and communication
patterns and to study causal relationships, for example between
communication patterns and idle time.
<A HREF="#paragraph2">Plate 10</A>
shows some of these displays, applied here to the trace data set of
<A HREF="#paragraph1">Plate 8</A>.
The communication matrix on the left and the circle on the right both
show instantaneous communication patterns.  The colors in the
communication matrix indicate communication volume, as defined by the
scale above the matrix.  Most matrix entries are on the diagonal,
which indicates mostly nearest-neighbor communication.  Another display in
the top right presents cumulative data on processor utilization.
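<P>
As a rough sketch of the data behind a communication matrix, the following accumulates message volume per (source, destination) pair and measures how much of it flows between adjacent ranks. The event tuples, the sample values, and the definition of "neighbor" as ranks differing by one are assumptions for illustration only; Paragraph's own data structures are not described in the text.

```python
# Hypothetical send events: (source, destination, message_bytes).
SENDS = [
    (0, 1, 1024),
    (1, 2, 1024),
    (2, 3, 2048),
    (3, 2, 512),
    (1, 0, 256),
    (0, 3, 128),  # one non-neighbor message
]


def communication_matrix(sends, nprocs):
    """Accumulate bytes sent from each source (row) to each destination (column)."""
    matrix = [[0] * nprocs for _ in range(nprocs)]
    for src, dst, nbytes in sends:
        matrix[src][dst] += nbytes
    return matrix


def neighbor_fraction(matrix):
    """Fraction of total volume exchanged between ranks that differ by one."""
    total = sum(sum(row) for row in matrix)
    near = sum(
        value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if abs(i - j) == 1
    )
    return near / total if total else 0.0


if __name__ == "__main__":
    m = communication_matrix(SENDS, 4)
    for row in m:
        print(row)
    print(f"neighbor fraction: {neighbor_fraction(m):.3f}")
```

A fraction close to 1 corresponds to a matrix whose nonzero entries cluster around the diagonal.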
<P>

<P>
<P><HR><P>
<EM>Plate 10 is not available in the online version.</EM>
<P><HR>

<P>
A disadvantage of Paragraph is that the relationship between
performance data and program source is not always clear.  This problem
can be overcome in part by explicitly logging events that record the
start and end of "tasks" corresponding to different phases of a
program's execution.  Paragraph provides task Gantt and task histogram
<A NAME=14449>&#160;</A>
displays to examine this information.
<P>
Of the portable tools described here, Paragraph is probably the
simplest to install and use.  Because it operates on automatically
generated traces, it can be used with little programmer intervention.
Paragraph displays are particularly intuitive, although the inability
to scroll within display windows can be frustrating.
<P>
<H2><A NAME=SECTION03642000000000000000>9.4.2 Upshot</A></H2>
<P>
<A NAME=secupshot>&#160;</A>
<P>
<A NAME=14452>&#160;</A>
Upshot is a trace analysis and visualization package developed at
<A NAME=14453>&#160;</A>
Argonne National Laboratory for message-passing programs.  It can be
used to analyze traces from a variety of message-passing systems: in
particular, trace events can be generated automatically by using an
instrumented version of MPI.  Alternatively, the programmer can insert
event logging calls manually.
<P>
Upshot's display tools are designed for the visualization and analysis
<A NAME=14454>&#160;</A>
of state data derived from logged events.  A state is defined by a
starting and ending event.  (For example, an instrumented collective
<A NAME=14455>&#160;</A>
communication routine can generate two separate events on each
processor to indicate when the processor entered and exited the
routine.)  The Upshot Gantt chart display shows the state of each
<A NAME=14456>&#160;</A>
processor as a function of time.  States can be nested, thereby allowing
multiple levels of detail to be captured in a single display.  States
can be defined either in an input file or interactively during
visualization.  A histogramming facility allows the use of histograms
to summarize information about state duration
(<A HREF="#upshot">Plate 11</A>).
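<P>
The idea of a state bounded by a starting and an ending event, with nesting, can be sketched as follows. This is not Upshot's code or log format: the event names, the <code>STATE_DEFS</code> table, and both functions are hypothetical and chosen only to show how paired events yield nested intervals and a duration histogram.

```python
from collections import Counter

# Hypothetical event log for one processor: (timestamp, event_name).
EVENTS = [
    (0.0, "solve_begin"),
    (0.5, "bcast_enter"),
    (1.5, "bcast_exit"),
    (1.5, "bcast_enter"),
    (4.0, "bcast_exit"),
    (4.2, "solve_end"),
]

# Each state is defined by a (start event, end event) pair.
STATE_DEFS = {
    "solve": ("solve_begin", "solve_end"),
    "bcast": ("bcast_enter", "bcast_exit"),
}


def extract_states(events, state_defs):
    """Pair start/end events into (state, start, end, depth) intervals."""
    starts = {begin: name for name, (begin, _) in state_defs.items()}
    ends = {end: name for name, (_, end) in state_defs.items()}
    stack, intervals = [], []
    # sorted() is stable, so events with equal timestamps keep their log order.
    for time, event in sorted(events, key=lambda e: e[0]):
        if event in starts:
            stack.append((starts[event], time))
        elif event in ends:
            name, began = stack.pop()
            if name != ends[event]:
                raise ValueError(f"unbalanced event {event!r} at t={time}")
            # Depth is the number of enclosing states still open.
            intervals.append((name, began, time, len(stack)))
    return intervals


def duration_histogram(intervals, state, bin_width):
    """Count intervals of one state per duration bin; keys are bin lower edges."""
    counts = Counter()
    for name, start, end, _ in intervals:
        if name == state:
            counts[int((end - start) // bin_width) * bin_width] += 1
    return dict(counts)


if __name__ == "__main__":
    states = extract_states(EVENTS, STATE_DEFS)
    print(states)
    print(duration_histogram(states, "bcast", 1.0))
```

The depth field is what a Gantt display would need in order to draw an inner state on top of the state that encloses it.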

<P>
<P><HR>
<A NAME=upshot HREF="sam.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/sam.gif"> <img
ALIGN=MIDDLE src="sam_small.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/sam_small.gif"></A>
<P>
(GIF <A HREF="sam.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/sam.gif">24137</A> bytes; RAS <A
HREF="sam.rgb" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/sam.rgb">729587</A> bytes.)
Plate 11: Gantt chart, state duration histogram, and
instantaneous state diagram for a search problem running on 16
processors, generated using Upshot.  Image courtesy of E. Lusk.
<P><HR>

<P>
<A HREF="#tilson">Plate 12</A>
illustrates the use of nested states within Upshot.  This is a trace
<A NAME=14467>&#160;</A>
generated from a computational chemistry code that alternates between
Fock matrix construction (Section <A HREF="node22.html#secchem" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node22.html#secchem">2.8</A>) and matrix
<A NAME=14469>&#160;</A>
diagonalization, with the former taking most of the time.  Each Fock
matrix construction operation (blue) involves multiple integral
computations (green).  A substantial load imbalance is apparent: some
processors complete their final set of integrals much later than do
others.  The display also shows why this load imbalance occurs.
Integrals are being allocated in a demand-driven fashion by a central
scheduler to ensure equitable distribution of work; however, smaller
integrals are being allocated before larger ones.  Reversing the
allocation order improves performance.
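<P>
The effect of allocation order under demand-driven scheduling can be illustrated with a small simulation. This is a sketch under stated assumptions, not the chemistry code's scheduler: the task costs, the worker count, and the function <code>makespan</code> are invented for the example, and each task is simply handed to whichever worker becomes free first.

```python
import heapq


def makespan(task_costs, nworkers):
    """Simulate a central scheduler handing tasks, in list order, to the
    worker that becomes free earliest; return the time the last worker finishes."""
    free_at = [0.0] * nworkers
    heapq.heapify(free_at)
    for cost in task_costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical integral costs: many small tasks and one large one.
COSTS = [2, 2, 2, 2, 2, 2, 9]

if __name__ == "__main__":
    small_first = makespan(sorted(COSTS), 3)
    large_first = makespan(sorted(COSTS, reverse=True), 3)
    print(f"smallest first: {small_first}")
    print(f"largest first:  {large_first}")
```

With the smallest tasks allocated first, the large task starts only after every worker has already done four units of work, so one worker finishes at time 13 while the others sit idle from time 4. Allocating the large task first lets the small tasks fill in around it, and the run ends at time 9.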
<P>

<P>
<P><HR><P>
<EM>Plate 12 is not available in the online version.</EM>
<P><HR>

<P>
Upshot provides fewer displays than Paragraph does, but its ability
to scroll and zoom within displays is particularly useful.
<P>
<H2><A NAME=SECTION03643000000000000000>9.4.3 Pablo</A></H2>
<P>
The Pablo system developed at the University of Illinois is the most
ambitious (and complex) of the performance tools described here.  It
<A NAME=14473>&#160;</A>
provides a variety of mechanisms for collecting, transforming, and
visualizing data and is designed to be extensible, so that the
programmer can incorporate new data formats, data collection
mechanisms, data reduction modules, and displays.  Predefined and
user-defined data reduction modules and displays can be combined in a
mix-and-match fashion by using a graphical editor.  Pablo is as much a
performance tool toolkit as it is a performance tool proper and has
been used to develop performance tools for both message-passing and
data-parallel programs.
<P>
A source code instrumentation interface facilitates the insertion of
