<html><!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!Converted with LaTeX2HTML 95.1 (Fri Jan 20 1995) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds >
<HEAD>
<TITLE>9.2 Data Collection</TITLE>
</HEAD>
<BODY>
<meta name="description" value="9.2 Data Collection">
<meta name="keywords" value="book">
<meta name="resource-type" value="document">
<meta name="distribution" value="global">
<P>
 <BR> <HR><a href="msgs0.htm#2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#2"><img ALIGN=MIDDLE src="asm_color_tiny.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/asm_color_tiny.gif" alt="[DBPP]"></a>    <A NAME=tex2html3264 HREF="node107.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node107.html"><IMG ALIGN=MIDDLE ALT="previous" SRC="previous_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/previous_motif.gif"></A> <A NAME=tex2html3272 HREF="node109.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node109.html"><IMG ALIGN=MIDDLE ALT="next" SRC="next_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/next_motif.gif"></A> <A NAME=tex2html3270 HREF="node106.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node106.html"><IMG ALIGN=MIDDLE ALT="up" SRC="up_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/up_motif.gif"></A> <A NAME=tex2html3274 HREF="node1.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node1.html"><IMG ALIGN=MIDDLE ALT="contents" SRC="contents_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/contents_motif.gif"></A> <A NAME=tex2html3275 HREF="node133.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node133.html"><IMG ALIGN=MIDDLE ALT="index" SRC="index_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/index_motif.gif"></A> <a href="msgs0.htm#3" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#3"><img ALIGN=MIDDLE src="search_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/search_motif.gif" alt="[Search]"></a>   <BR>
<B> Next:</B> <A NAME=tex2html3273 HREF="node109.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node109.html">9.3 Data Transformation and Visualization</A>
<B>Up:</B> <A NAME=tex2html3271 HREF="node106.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node106.html">9 Performance Tools</A>
<B> Previous:</B> <A NAME=tex2html3265 HREF="node107.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node107.html">9.1 Performance Analysis</A>
<BR><HR><P>
<H1><A NAME=SECTION03620000000000000000>9.2 Data Collection</A></H1>
<P>
<A NAME=14308>&#160;</A>
<P>
Next, we examine in  more detail the
techniques used
to collect performance data.  We consider in turn profiling, counters,
and event traces, focusing in each case on the principles involved.
Individual tools are described in Section <A HREF="node110.html#secpttools" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node110.html#secpttools">9.4</A>.
<P>
<H2><A NAME=SECTION03621000000000000000>9.2.1 Profiles</A></H2>
<P>
The concept of a profile should be familiar from sequential computing.
<A NAME=14311>&#160;</A>
Typically, a profile shows the amount of time spent in
different program components.  This information is often obtained by
sampling techniques, which are simple but not necessarily highly
accurate.  The value of the program counter is determined at fixed
intervals and used to construct a histogram of execution frequencies.
These frequencies are then combined with compiler symbol table
information to estimate the amount of time spent in different parts of
a program.  Profile data may be collected on a per-processor
basis and may identify idle time and communication time as
well as execution time.
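<P>
The sampling approach can be sketched in a few lines of Python.  This is
a minimal illustration rather than the implementation of any particular
profiler: the function <code>build_profile</code>, the symbol table
layout, and the sample addresses are all assumptions made for the
example.  Each program counter sample is mapped to the routine whose
address range contains it, and each sample is taken to represent one
sampling interval of execution time.

```python
from bisect import bisect_right
from collections import Counter


def build_profile(pc_samples, symbol_table, interval):
    """Estimate time per routine from periodic program-counter samples.

    symbol_table: list of (start_address, routine_name), sorted by address.
    interval: sampling period in seconds.
    """
    starts = [start for start, _ in symbol_table]
    histogram = Counter()
    for pc in pc_samples:
        # Locate the last routine that starts at or below the sampled address.
        index = bisect_right(starts, pc) - 1
        name = "unknown" if index == -1 else symbol_table[index][1]
        histogram[name] += 1
    # Each sample stands for one sampling interval of execution time.
    return {name: count * interval for name, count in histogram.items()}


# Hypothetical symbol table and samples, for illustration only.
symbols = [(0x1000, "compute"), (0x2000, "send"), (0x3000, "idle")]
samples = [0x1004, 0x1FF0, 0x2010, 0x1200, 0x3008, 0x1ABC]
profile = build_profile(samples, symbols, interval=0.01)
```

Because the estimate is built from samples, a routine that runs only
briefly between sampling points may not appear at all, which is one
reason the technique is not necessarily highly accurate.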
<P>
<A NAME=14312>&#160;</A>
Profiles have two important advantages.  They can be obtained
automatically, at relatively low cost, and they can provide a
high-level view of program behavior that allows the programmer to
identify problematic program components without generating huge
amounts of data.  (In general, the amount of data associated with a
profile is both small and independent of execution time.)  Therefore,
a profile should be the first technique considered when seeking to
understand the performance of a parallel program.
<P>
A profile can be used in numerous ways.  For example, a single profile
on a moderate number of processors can help identify the program
components that are taking the most time and that hence may require
further investigation.  Similarly, profiles performed for a range of
processor counts and problem sizes can identify components that do not
scale.
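<P>
As a sketch of the second use, the following Python fragment compares
profiles taken at two processor counts.  The data layout (a mapping from
processor count to per-component seconds per processor), the component
names, the timings, and the 0.5 threshold are all hypothetical.
Efficiency is computed relative to the smallest run, so a component that
scales perfectly scores 1.0.

```python
def scaling_efficiency(profiles):
    """profiles maps processor count -> {component: seconds per processor}."""
    base_p = min(profiles)
    base = profiles[base_p]
    result = {}
    for p, profile in sorted(profiles.items()):
        # Ideal scaling keeps (seconds * processors) constant.
        result[p] = {
            name: (base[name] * base_p) / (seconds * p)
            for name, seconds in profile.items()
        }
    return result


# Made-up timings for two runs of the same problem size.
profiles = {
    4: {"compute": 80.0, "exchange": 8.0},
    16: {"compute": 20.0, "exchange": 8.0},
}
eff = scaling_efficiency(profiles)
poor = [name for name, e in eff[16].items() if e < 0.5]
```

Here <code>poor</code> names the components whose time did not shrink
as processors were added, which are the ones that merit further
investigation.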
<P>
Profiles also have limitations.  In
<A NAME=14313>&#160;</A>
particular, they do not incorporate temporal aspects of program
execution.  For example, consider a program in which every processor
sends to every other processor in turn.  If all processors send first to
processor 0, then to processor 1, and so on, each destination in turn
becomes a bottleneck and overall performance may be poor.  This behavior
would not be revealed in a profile, as every processor would be shown to
communicate the same amount of data.
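<P>
The all-to-all example can be made concrete with a small Python model.
It is an illustrative sketch only: the step-based model of
communication and the function names are assumptions, and the staggered
ordering is included purely for comparison.  Both schedules send the
same number of messages per processor, which is all a profile would
report, yet they differ sharply in how many messages converge on one
processor at the same step.

```python
from collections import Counter


def in_order_schedule(p):
    """Every processor sends to 0, then 1, and so on (skipping itself)."""
    return [[d for d in range(p) if d != src] for src in range(p)]


def staggered_schedule(p):
    """Processor i sends to i+1, then i+2, and so on (modulo p)."""
    return [[(src + step) % p for step in range(1, p)] for src in range(p)]


def max_receiver_load(schedule):
    """Largest number of messages aimed at one processor in any step."""
    worst = 0
    for step in range(len(schedule[0])):
        arrivals = Counter(targets[step] for targets in schedule)
        worst = max(worst, max(arrivals.values()))
    return worst


def messages_sent(schedule):
    """Per-processor totals: the only view a profile would give."""
    return [len(targets) for targets in schedule]


p = 8
naive, staggered = in_order_schedule(p), staggered_schedule(p)
same_totals = messages_sent(naive) == messages_sent(staggered)
loads = (max_receiver_load(naive), max_receiver_load(staggered))
```

With eight processors, the in-order schedule directs seven messages at
processor 0 in the first step, whereas the staggered schedule never
directs more than one message at any processor in a step.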
<P>
Profilers are available on most parallel computers but vary widely in
their functionality and sophistication.  The most basic do little more
than collect sequential profile data on each processor; the most
sophisticated provide various mechanisms for reducing this data,
displaying it, and relating it to source code.  Because efficient
profiling requires the assistance of a compiler and runtime system,
most profiling tools are vendor supplied and machine specific.
<P>
<H2><A NAME=SECTION03622000000000000000>9.2.2 Counters</A></H2>
<P>
As its name suggests, a counter is a storage location that can be
<A NAME=14315>&#160;</A>
incremented each time a specified event occurs.  Counters
<A NAME=14316>&#160;</A>
can be used to record the number of procedure calls, total number of
messages, total message volume, or the number of messages sent between
each pair of processors.  Counts may be generated by
compiler-generated code, by code incorporated in communication
libraries, or by user-inserted calls to counter routines.
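<P>
A minimal sketch of counters incorporated in a communication routine
follows, written in Python.  The class <code>MessageCounters</code> and
the wrapper <code>send</code> are hypothetical names introduced for
illustration and do not correspond to any particular library; the
actual message transfer is omitted.

```python
from collections import Counter


class MessageCounters:
    """Counts of messages, bytes, and messages per (source, dest) pair."""

    def __init__(self):
        self.messages = 0
        self.bytes = 0
        self.per_pair = Counter()

    def record_send(self, source, dest, nbytes):
        self.messages += 1
        self.bytes += nbytes
        self.per_pair[(source, dest)] += 1


def send(counters, source, dest, payload):
    """Hypothetical wrapper around a communication library's send call."""
    counters.record_send(source, dest, len(payload))
    # The real transfer would be performed here.


counters = MessageCounters()
send(counters, 0, 1, b"abcd")
send(counters, 0, 1, b"ef")
send(counters, 2, 0, b"ghijkl")
```

Each counter costs only an increment per event, so the storage required
is fixed regardless of how long the program runs.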
<P>
Counters complement profilers by providing information that is not
easily obtainable using sampling techniques.  For example, they
can provide the total number and volume of messages, information that
can be combined with communication time data from a profile to
determine the efficiency of communication operations.
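<P>
One way to combine the two sources of data is sketched below in Python.
The function name and the numbers are made up for the example; the
message count and volume would come from counters, and the communication
time from a profile.

```python
def communication_summary(num_messages, total_bytes, comm_seconds):
    """Derive simple per-message and per-second figures."""
    return {
        "seconds_per_message": comm_seconds / num_messages,
        "bytes_per_second": total_bytes / comm_seconds,
        "mean_message_bytes": total_bytes / num_messages,
    }


# Made-up figures: 2000 messages, 8 million bytes, 4 seconds communicating.
summary = communication_summary(
    num_messages=2000, total_bytes=8_000_000, comm_seconds=4.0
)
```

Comparing the achieved bytes per second with the rate the machine is
known to sustain gives a rough indication of how efficiently the
communication operations are being used.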
<P>
A useful variant of a counter is an <em> interval timer</em>, a timer used
to determine the length of time spent executing a particular piece of
<A NAME=14318>&#160;</A>
code.  This information can be accumulated in a counter to provide an
accurate determination of the total time spent executing that program
component.  A disadvantage of interval timers is that the logic
required to obtain a timer value can be expensive.
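<P>
An interval timer that accumulates into a counter might look as follows
in Python.  This is a sketch under stated assumptions: the names
<code>interval_timer</code>, <code>elapsed</code>, and
<code>calls</code> are invented for the example, and the timed loop is
a stand-in for a real program component.  The standard library clock
<code>time.perf_counter</code> supplies the timer values.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

elapsed = defaultdict(float)  # accumulated seconds per component
calls = defaultdict(int)      # number of timed intervals per component


@contextmanager
def interval_timer(name):
    """Add the time spent inside the with-block to elapsed[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed[name] += time.perf_counter() - start
        calls[name] += 1


for _ in range(3):
    with interval_timer("inner_loop"):
        sum(i * i for i in range(1000))
```

Every timed interval requires two clock reads, so timing very short
pieces of code in this way can noticeably perturb the measurement.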
<P>
The use of counters and interval timers in a computational chemistry
code was illustrated in Section <A HREF="node32.html#secperfpt" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node32.html#secperfpt">3.6</A>: see
in particular 
Tables <A HREF="node32.html#tabperfx" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node32.html#tabperfx">3.4</A> and <A HREF="node32.html#tabperfy" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node32.html#tabperfy">3.5</A>.
<P>
<A NAME=14322>&#160;</A>
<H2><A NAME=SECTION03623000000000000000>9.2.3 Traces</A></H2>
<P>
<A NAME=14324>&#160;</A>
<P>
<P><A NAME=14578>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img1065.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img1065.gif">
<BR><STRONG>Figure 9.1:</STRONG>  Trace records generated by the Portable Instrumented
Communication Library.  The various records contain information
regarding the type of event, the processor number involved, a time
stamp, and other information.  Clearly, these records are not meant to
be interpreted by humans.
<A NAME=figpttrace>&#160;</A><BR>
<P>
<P>
An execution trace is the most detailed and low-level approach to
<A NAME=14342>&#160;</A>
performance data collection.  Trace-based systems typically generate
