<html xmlns:cf="http://docbook.sourceforge.net/xmlns/chunkfast/1.0"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>5.&nbsp;Callgrind: a heavyweight profiler</title><link rel="stylesheet" href="vg_basic.css" type="text/css"><meta name="generator" content="DocBook XSL Stylesheets V1.69.0"><link rel="start" href="index.html" title="Valgrind Documentation"><link rel="up" href="manual.html" title="Valgrind User Manual"><link rel="prev" href="cg-manual.html" title="4.&nbsp;Cachegrind: a cache profiler"><link rel="next" href="ms-manual.html" title="6.&nbsp;Massif: a heap profiler"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr><td width="22px" align="center" valign="middle"><a accesskey="p" href="cg-manual.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td><td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td><td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Up"></a></td><th align="center" valign="middle">Valgrind User Manual</th><td width="22px" align="center" valign="middle"><a accesskey="n" href="ms-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td></tr></table></div><div class="chapter" lang="en"><div class="titlepage"><div><div><h2 class="title"><a name="cl-manual"></a>5.&nbsp;Callgrind: a heavyweight profiler</h2></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a href="cl-manual.html#cl-manual.use">5.1. Overview</a></span></dt><dt><span class="sect1"><a href="cl-manual.html#cl-manual.purpose">5.2. 
Purpose</a></span></dt><dd><dl><dt><span class="sect2"><a href="cl-manual.html#cl-manual.devel">5.2.1. Profiling as part of Application Development</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.tools">5.2.2. Profiling Tools</a></span></dt></dl></dd><dt><span class="sect1"><a href="cl-manual.html#cl-manual.usage">5.3. Usage</a></span></dt><dd><dl><dt><span class="sect2"><a href="cl-manual.html#cl-manual.basics">5.3.1. Basics</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.dumps">5.3.2. Multiple profiling dumps from one program run</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.limits">5.3.3. Limiting the range of collected events</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.cycles">5.3.4. Avoiding cycles</a></span></dt></dl></dd><dt><span class="sect1"><a href="cl-manual.html#cl-manual.options">5.4. Command line option reference</a></span></dt><dd><dl><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.misc">5.4.1. Miscellaneous options</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.creation">5.4.2. Dump creation options</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.activity">5.4.3. Activity options</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.collection">5.4.4. Data collection options</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.separation">5.4.5. Cost entity separation options</a></span></dt><dt><span class="sect2"><a href="cl-manual.html#cl-manual.options.simulation">5.4.6. 
Cache simulation options</a></span></dt></dl></dd></dl></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cl-manual.use"></a>5.1.&nbsp;Overview</h2></div></div></div><p>Callgrind is a Valgrind tool for profiling programs. The collected data consists of the number of instructions executed on a run, their relationship to source lines, and the call relationship among functions together with call counts. Optionally, a cache simulator (similar to Cachegrind) can produce further information about the memory access behavior of the application.</p><p>The profile data is written out to a file at program termination. For presentation of the data, and interactive control of the profiling, two command line tools are provided:</p><div class="variablelist"><dl><dt><span class="term"><span><strong class="command">callgrind_annotate</strong></span></span></dt><dd><p>This command reads in the profile data and prints a sorted list of functions, optionally with annotation.</p><p>For graphical visualization of the data, check out <a href="http://kcachegrind.sourceforge.net/cgi-bin/show.cgi/KcacheGrindIndex" target="_top">KCachegrind</a>.</p></dd><dt><span class="term"><span><strong class="command">callgrind_control</strong></span></span></dt><dd><p>This command enables you to interactively observe and control the status of currently running applications, without stopping the application. You can get statistics, the current stack trace, and request zeroing of counters and dumping of profile data.</p></dd></dl></div><p>To use Callgrind, you must specify <code class="computeroutput">--tool=callgrind</code> on the Valgrind command line or use the supplied script <code class="computeroutput">callgrind</code>.</p><p>Callgrind's cache simulation is based on the <a href="http://www.valgrind.org/info/tools.html#cachegrind" target="_top">Cachegrind tool</a> of the <a href="http://www.valgrind.org/" target="_top">Valgrind</a> package. 
Read <a href="http://www.valgrind.org/docs/manual/cg-manual.html" target="_top">Cachegrind's documentation</a> first; this page describes the features supported in addition to Cachegrind's features.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cl-manual.purpose"></a>5.2.&nbsp;Purpose</h2></div></div></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="cl-manual.devel"></a>5.2.1.&nbsp;Profiling as part of Application Development</h3></div></div></div><p>In application development, a common step is to improve runtime performance. To avoid wasting time on optimizing functions which are rarely used, one needs to know in which parts of the program most of the time is spent.</p><p>This is done with a technique called profiling. The program is run under control of a profiling tool, which gives the time distribution of executed functions in the run. After examination of the program's profile, it should be clear if and where optimization is useful. Afterwards, one should verify any runtime changes by another profile run.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="cl-manual.tools"></a>5.2.2.&nbsp;Profiling Tools</h3></div></div></div><p>Most widely known is the GCC profiling tool <span><strong class="command">GProf</strong></span>: one needs to compile an application with the compiler option <code class="computeroutput">-pg</code>. Running the program generates a file <code class="computeroutput">gmon.out</code>, which can be transformed into human readable form with the command line tool <code class="computeroutput">gprof</code>. A disadvantage here is the need to recompile everything, and also the need to statically link the executable.</p><p>Another profiling tool is <span><strong class="command">Cachegrind</strong></span>, part of <a href="http://www.valgrind.org/" target="_top">Valgrind</a>. 
It uses the processor emulation of Valgrind to run the executable, and catches all memory accesses, which are used to drive a cache simulator. The program does not need to be recompiled; it can use shared libraries and plugins, and the profile measurement doesn't influence the memory access behaviour. The trace includes the number of instruction/data memory accesses and 1st/2nd level cache misses, and relates them to source lines and functions of the run program. A disadvantage is the slowdown involved in the processor emulation: the program runs around 50 times slower.</p><p>Cachegrind can only deliver a flat profile. No call relationship among the functions of an application is stored. Thus, inclusive costs, i.e. costs of a function including the cost of all functions called from there, cannot be calculated. Callgrind extends Cachegrind by including call relationships and exact event counts spent while doing a call.</p><p>Because Callgrind (and Cachegrind) is based on simulation, the slowdown due to processing the synthetic runtime events does not influence the results. See <a href="cl-manual.html#cl-manual.usage">Usage</a> for more details on the possibilities.</p></div></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cl-manual.usage"></a>5.3.&nbsp;Usage</h2></div></div></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="cl-manual.basics"></a>5.3.1.&nbsp;Basics</h3></div></div></div><p>To start a profile run for a program, execute: </p><pre class="screen">callgrind [callgrind options] your-program [program options]</pre><p> </p><p>While the simulation is running, you can observe execution with </p><pre class="screen">callgrind_control -b</pre><p> This will print out the current backtrace. 
To annotate the backtrace with event counts, run </p><pre class="screen">callgrind_control -e -b</pre><p> </p><p>After program termination, a profile data file named <code class="computeroutput">callgrind.out.pid</code> is generated, where <span class="emphasis"><em>pid</em></span> is the process ID of this profile run.</p><p>The data file contains information about the calls made in the program among the functions executed, together with events of type <span><strong class="command">Instruction Read Accesses</strong></span> (Ir).</p><p>If you are additionally interested in measuring the cache behaviour of your program, use Callgrind with the option <code class="option"><a href="cl-manual.html#opt.simulate-cache">--simulate-cache</a>=yes</code>. This will further slow down the run by approximately a factor of 2.</p><p>If the program section you want to profile is somewhere in the middle of the run, it is beneficial to <span class="emphasis"><em>fast forward</em></span> to this section without any profiling at all, and switch it on later. This is achieved by using <code class="option"><a href="cl-manual.html#opt.instr-atstart">--instr-atstart</a>=no</code> and interactively running <code class="computeroutput">callgrind_control -i on</code> before the interesting code section is about to be executed.</p><p>If you want to be able to see assembler annotation, specify <code class="option"><a href="cl-manual.html#opt.dump-instr">--dump-instr</a>=yes</code>. This will produce profile data at instruction granularity. Note that the resulting profile data can only be viewed with KCachegrind. For assembler annotation, it is also interesting to see more details of the control flow inside of functions, i.e. (conditional) jumps. 
These will be collected by additionally specifying <code class="option"><a href="cl-manual.html#opt.collect-jumps">--collect-jumps</a>=yes</code>.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="cl-manual.dumps"></a>5.3.2.&nbsp;Multiple profiling dumps from one program run</h3></div></div></div><p>Often, you aren't interested in the time characteristics of a full program run, but only of a small part of it (e.g. execution of one algorithm). If there are multiple algorithms, or one algorithm running with different input data, it can even be useful to get separate profile information for multiple parts of one program run.</p><p>Profile data files have names of the form</p><pre class="screen">callgrind.out.<span class="emphasis"><em>pid</em></span>.<span class="emphasis"><em>part</em></span>-<span class="emphasis"><em>threadID</em></span></pre><p> </p><p>where <span class="emphasis"><em>pid</em></span> is the PID of the running program, <span class="emphasis"><em>part</em></span> is a number incremented on each dump (".part" is skipped for the dump at program termination), and <span class="emphasis"><em>threadID</em></span> is a thread identification ("-threadID" is only used if you request dumps of individual threads with <code class="option"><a href="cl-manual.html#opt.separate-threads">--separate-threads</a>=yes</code>).</p><p>There are different ways to generate multiple profile dumps while a program is running under Callgrind's supervision. Nevertheless, all methods trigger the same action, which is "dump all profile information since the last dump or program start, and zero cost counters afterwards". To allow for zeroing cost counters without dumping, there is a second action "zero all cost counters now". 
The different methods are:</p><div class="itemizedlist"><ul type="disc"><li><p><span><strong class="command">Dump on program termination.</strong></span> This method is the standard way and doesn't need any special action on your part.</p></li><li><p><span><strong class="command">Spontaneous, interactive dumping.</strong></span> Use </p><pre class="screen">callgrind_control -d [hint [PID/Name]]</pre><p> to request the dumping of profile information of the supervised application with PID or Name. <span class="emphasis"><em>hint</em></span> is an arbitrary string you can optionally specify to later be able to distinguish profile dumps. The control program will not terminate before the dump is completely written. Note that the application must be actively running for detection of the dump command. So, for a GUI application, resize the window; for a server, send a request.</p><p>If you are using <a href="http://kcachegrind.sourceforge.net/cgi-bin/show.cgi/KcacheGrindIndex" target="_top">KCachegrind</a> for browsing of profile information, you can use the toolbar button <span><strong class="command">Force dump</strong></span>. This will request a dump and trigger a reload after the dump is written.</p></li><li><p><span><strong class="command">Periodic dumping after execution of a specified