⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 profile.sgml

📁 开放源码实时操作系统源码.
💻 SGML
📖 第 1 页 / 共 2 页
字号:
<!-- DOCTYPE part  PUBLIC "-//OASIS//DTD DocBook V3.1//EN" -->

<!-- {{{ Banner                         -->

<!-- =============================================================== -->
<!--                                                                 -->
<!--     profile.sgml                                                -->
<!--                                                                 -->
<!--     gprof profiling documentation.                              -->
<!--                                                                 -->
<!-- =============================================================== -->
<!-- ####COPYRIGHTBEGIN####                                          -->
<!--                                                                 -->
<!-- =============================================================== -->
<!-- Copyright (C) 2003, 2005 eCosCentric Ltd.                       -->
<!-- This material may be distributed only subject to the terms      -->
<!-- and conditions set forth in the Open Publication License, v1.0  -->
<!-- or later (the latest version is presently available at          -->
<!-- http://www.opencontent.org/openpub/)                            -->
<!-- Distribution of the work or derivative of the work in any       -->
<!-- standard (paper) book form is prohibited unless prior           -->
<!-- permission obtained from the copyright holder                   -->
<!-- =============================================================== -->
<!--                                                                 -->      
<!-- ####COPYRIGHTEND####                                            -->
<!-- =============================================================== -->
<!-- =============================================================== -->
<!-- #####DESCRIPTIONBEGIN####                                       -->
<!--                                                                 -->
<!-- Author(s):   bartv                                              -->
<!-- Date:        2003/09/01                                         -->
<!-- Version:     0.01                                               -->
<!--                                                                 -->
<!-- ####DESCRIPTIONEND####                                          -->
<!-- =============================================================== -->

<!-- }}} -->

<part id="services-profile-gprof"><title>gprof Profiling Support</title> 

<refentry id="gprof">
  <refmeta>
    <refentrytitle>Profiling</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname><varname>CYGPKG_PROFILE_GPROF</varname></refname>
    <refpurpose>eCos Support for the gprof profiling tool</refpurpose>
  </refnamediv>

  <refsect1 id="gprof-description"><title>Description</title>
    <para>
The GNU gprof tool provides profiling support. After a test run it can
be used to find where the application spent most of its time, and that
information can then be used to guide optimization effort. Typical
gprof output will look something like this:
    </para>
    <screen>
Each sample counts as 0.003003 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 14.15      1.45     1.45   120000    12.05    12.05  Proc_7
 11.55      2.63     1.18   120000     9.84     9.84  Func_1
  8.04      3.45     0.82                             main
  7.60      4.22     0.78    40000    19.41    86.75  Proc_1
  6.89      4.93     0.70    40000    17.60    28.99  Proc_6
  6.77      5.62     0.69    40000    17.31    27.14  Func_2
  6.62      6.30     0.68    40000    16.92    16.92  Proc_8
  5.94      6.90     0.61                             strcmp
  5.58      7.47     0.57    40000    14.26    26.31  Proc_3
  5.01      7.99     0.51    40000    12.79    12.79  Proc_4
  4.46      8.44     0.46    40000    11.39    11.39  Func_3
  3.68      8.82     0.38    40000     9.40     9.40  Proc_5
  3.32      9.16     0.34    40000     8.48     8.48  Proc_2
&hellip;
    </screen>
    <para>
This output is known as the flat profile. The data is obtained by
having a hardware timer generate regular interrupts. The interrupt
handler stores the program counter of the interrupted code. gprof
performs a statistical analysis of the resulting data and works out
where the time was spent.
    </para>
    <para>
gprof can also provide information about the call graph, for example:
    </para>
    <screen>
index % time    self  children    called     name
&hellip;
                0.78    2.69   40000/40000       main [1]
[2]     34.0    0.78    2.69   40000         Proc_1 [2]
                0.70    0.46   40000/40000       Proc_6 [5]
                0.57    0.48   40000/40000       Proc_3 [7]
                0.48    0.00   40000/120000      Proc_7 [3]
    </screen>
    <para>
This shows that function <function>Proc_1</function> was called only
from <function>main</function>, and <function>Proc_1</function> in
turn called three other functions. Callgraph information is obtained
only if the application code is compiled with the <option>-pg</option>
option. This causes the compiler to insert extra code into each
compiled function, specifically a call to <function>mcount</function>,
and the implementation of <function>mcount</function> stores away the
data for subsequent processing by gprof.
    </para>
    <caution><para>
There are a number of reasons why the output will not be 100%
accurate. Collecting the flat profile typically involves timer
interrupts so any code that runs with interrupts disabled will not
appear. The current host-side gprof implementation maps program
counter values onto symbols using a bin mechanism. When a bin spans
the end of one function and the start of the next gprof may report the
wrong function. This is especially likely on architectures with
single-byte instructions such as an x86. When examining gprof output
it may prove useful to look at a linker map or program disassembly.
    </para></caution>
    <para>
The eCos profiling package requires some additional support from the
HAL packages, and this may not be available on all platforms:
    </para>
    <orderedlist>
      <listitem><para>
There must be an implementation of the profiling timer. Typically this
is provided by the variant or platform HAL using one of the hardware
timers. If there is no implementation then the configuration tools
will report an unresolved conflict related to
<varname>CYGINT_PROFILE_HAL_TIMER</varname> and profiling is not
possible. Some implementations overload the system clock, which means
that profiling is only possible in configurations containing the eCos
kernel and <varname>CYGVAR_KERNEL_COUNTERS_CLOCK</varname>.
      </para></listitem>
      <listitem><para>
There should be a hardware-specific implementation of
<function>mcount</function>, which in turn will call the generic
functionality provided by this package. It is still possible to do
some profiling without <function>mcount</function> but the resulting
data will be less useful. To check whether or not
<function>mcount</function> is available, look at the current value of
the CDL interface <varname>CYGINT_PROFILE_HAL_MCOUNT</varname> in the
graphical configuration tool or in an <filename>ecos.ecc</filename>
save file.
      </para></listitem>
    </orderedlist>
    <para>
This document only describes the eCos profiling support. Full details
of gprof functionality and output formats can be found in the gprof
documentation. However it should be noted that that documentation
describes some functionality which cannot be implemented using current
versions of the gcc compiler: the section on annotated source listings
is not relevant, and neither are associated command line options like
<option>-A</option> and <option>-y</option>.
    </para>
  </refsect1>

  <refsect1 id="gprof-process"><title>Building Applications for Profiling</title>
    <para>
To perform application profiling the gprof package
<varname>CYGPKG_PROFILE_GPROF</varname> must first be added to the
eCos configuration. On the command line this can be achieved using:
    </para>
    <screen>
$ ecosconfig add profile_gprof
$ ecosconfig tree
$ make
    </screen>
    <para>
Alternatively the same steps can be performed using the graphical
configuration tool.
    </para>
    <para>
If the HAL packages implement <function>mcount</function> for the
target platform then usually application code should be compiled with
<option>-pg</option>. Optionally eCos itself can also be compiled with
this option by modifying the configuration option
<varname>CYGBLD_GLOBAL_CFLAGS</varname>. Compiling with
<option>-pg</option> is optional but gives more complete profiling
data.
    </para>
    <note><para>
The profiling package itself must not be compiled with
<option>-pg</option> because that could lead to infinite recursion
when doing <function>mcount</function> processing. This is handled
automatically by the package's CDL.
    </para></note>
    <para>
Profiling does not happen automatically. Instead it must be started
explicitly by the application, using a call to
<function>profile_on</function>. A typical example would be:
    </para>
    <programlisting>
#include &lt;pkgconf/system.h&gt;
#ifdef CYGPKG_PROFILE_GPROF
# include &lt;cyg/profile/profile.h&gt;
#endif
&hellip;
int
main(int argc, char** argv)
{
    &hellip;
#ifdef CYGPKG_PROFILE_GPROF
    {
        extern char _stext[], _etext[];
        profile_on(_stext, _etext, 16, 3500);
    }
#endif
    &hellip;
}
    </programlisting>
    <para>
The <function>profile_on</function> takes four arguments:
    </para>
    <variablelist>
      <varlistentry>
        <term><literal>start address</literal></term>
        <term><literal>end address</literal></term>
        <listitem><para>
These specify the range of addresses that will be profiled. Usually
profiling should cover the entire application. On most targets the
linker script will export symbols <varname>_stext</varname> and
<varname>_etext</varname> corresponding to the beginning and end of
code, so these can be used as the addresses. It is possible to
perform profiling on a subset of the code if that code is
located contiguously in memory.
        </para></listitem>
      </varlistentry>
      <varlistentry>
        <term><literal>bucket size</literal></term>
        <listitem><para>
<function>profile_on</function> divides the range of addresses into a
number of buckets of this size. It then allocates a single array of
16-bit counters with one entry for each bucket. When the profiling
timer interrupts the interrupt handler will examine the program
counter of the interrupted code and, assuming it is within the range
of valid addresses, find the containing bucket and increment the
appropriate counter.
        </para>
        <para>
The size of the array counters is determined by the range of addresses
being profiled and by the bucket size. For a bucket size of 16, one
counter is needed for every 16 bytes of code. For an application with
say 512K of code that means dynamically allocating a 64K array. If the
target hardware is low on memory then this may be unacceptable, and
the requirements can be reduced by increasing the bucket size. However
this will affect the accuracy of the results and gprof is more likely
to report the wrong function. It also increases the risk of a counter
overflow.
        </para>
        <para>
For the sake of run-time efficiency the bucket size must be a power of
2, and it will be adjusted if necessary.
        </para></listitem>
      </varlistentry>
      <varlistentry>
        <term><literal>time interval</literal></term>
        <listitem><para>
The final argument specifies the interval between profile timer
interrupts, in units of microseconds. Increasing the interrupt
frequency gives more accurate profiling results, but at the cost of
higher run-time overheads and a greater risk of a counter overflow.
The HAL package may modify this interval because of hardware
restrictions, and the generated profile data will contain the actual
interval that was used. Usually it is a good idea to use an interval
that is not a simple fraction of the system clock, typically 10000
microseconds. Otherwise there is a risk that the profiling timer will
disproportionally sample code that runs only in response to the system
clock.
        </para></listitem>
      </varlistentry>
    </variablelist>
    <para>
<function>profile_on</function> can be invoked multiple times, and
on subsequent invocations, it will delete profiling data
and allocate a fresh profiling range.
    </para>
    <para>
Profiling can be turned off using the function
<function>profile_off</function>:
<programlisting>
void profile_off(void);
</programlisting>
This will also reset any existing profile data.
    </para>
    <para>
If the eCos configuration includes a TCP/IP stack and if a tftp daemon
will be used to <link linkend="gprof-extract">extract</link> the data
from the target then the call to <function>profile_on</function>
should happen after the network is up. <filename>profile_on</filename>
will attempt to start a tftp daemon thread, and this will fail if
networking has not yet been enabled.

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -