⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 gathering.me

📁 早期freebsd实现
💻 ME
字号:
.\" Copyright (c) 1982, 1993.\"	The Regents of the University of California.  All rights reserved..\".\" Redistribution and use in source and binary forms, with or without.\" modification, are permitted provided that the following conditions.\" are met:.\" 1. Redistributions of source code must retain the above copyright.\"    notice, this list of conditions and the following disclaimer..\" 2. Redistributions in binary form must reproduce the above copyright.\"    notice, this list of conditions and the following disclaimer in the.\"    documentation and/or other materials provided with the distribution..\" 3. All advertising materials mentioning features or use of this software.\"    must display the following acknowledgement:.\"	This product includes software developed by the University of.\"	California, Berkeley and its contributors..\" 4. Neither the name of the University nor the names of its contributors.\"    may be used to endorse or promote products derived from this software.\"    without specific prior written permission..\".\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION).\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF.\" SUCH DAMAGE..\".\"	@(#)gathering.me	8.1 (Berkeley) 6/8/93.\".sh 1 "Gathering Profile Data".ppRoutine calls or statement executions can be measured by having acompiler augment the code at strategic points.The additions can be inline increments to counters [Knuth71][Satterthwaite72] [Joy79] or calls tomonitoring routines [Unix].The counter increment overhead is low, and is suitable forprofiling statements.A call of the monitoring routine has an overhead comparable with acall of a regular routine, and is therefore only suitedto profiling on a routine by routine basis.However, the monitoring routine solution has certain advantages.Whatever counters are needed by the monitoring routine can bemanaged by the monitoring routine itself, rather than beingdistributed around the code.In particular, a monitoring routine can easily be called from separatelycompiled programs.In addition, different monitoring routines can be linked into theprogrambeing measuredto assemble different profiling data without having tochange the compiler or recompile the program.We have exploited this approach;our compilers for C, Fortran77, and Pascal can insert calls to amonitoring routine in the prologue for each routine.Use of the monitoring routine requires no planning on part of aprogrammer other than to request that augmented routineprologues be produced during compilation..ppWe are interested in gathering three pieces of information duringprogram execution: call counts and execution times for each profiled routine, and the arcs of the dynamic call graphtraversed by this execution of the program.By post-processing of this data we can build the dynamic callgraph for this execution of the program and propagate times alongthe edges of this graph to attribute times for routines to theroutines that invoke them..ppGathering of the profiling information should not greatlyinterfere with the running of the program.Thus, the monitoring routine must not produce trace output eachtime it is invoked.The volume of data thus produced would be unmanageably large,and the time required to record it would overwhelm the runningtime of most programs.Similarly, the monitoring routine can not do the analysis ofthe profiling data (e.g. assembling the call graph, propagatingtimes around it, discovering cycles, etc.) during programexecution.Our solution is to gather profiling data in memory during programexecution and to condense it to a file as the profiledprogram exits.This file is then processed by a separate program to produce thelisting of the profile data.An advantage of this approach is that the profile data forseveral executions of a program can be combined by thepost-processing to provide a profile of manyexecutions..ppThe execution time monitoring consists of three parts.The first part allocates and initializes the runtime monitoring datastructures before the program begins execution.The second part is the monitoring routine invoked from theprologue of each profiled routine.The third part condenses the data structures and writes themto a file as the program terminates.The monitoring routine is discussed in detail in the following sections..sh 2 "Execution Counts".ppThe \fBgprof\fP monitoring routine counts the number of timeseach profiled routine is called.The monitoring routine also records the arc in the call graphthat activated the profiled routine.The count is associated with the arc in the call graphrather than with the routine.Call counts for routines can then be determined by summing the countson arcs directed into that routine.In a machine-dependent fashion, the monitoring routine notes itsown return address.This address is in the prologue of some profiled routine that isthe destination of an arc in the dynamic call graph.The monitoring routine also discovers the return address for thatroutine, thus identifying the call site, or source of the arc.The source of the arc is in the \fIcaller\fP, and the destination is inthe \fIcallee\fP.For example, if a routine A calls a routine B, A is the caller, and B is the callee.The prologue of B will include a call to the monitoring routinethat will note the arc from A to B and either initialize orincrement a counter for that arc..ppOne can not afford to have the monitoring routine output tracinginformation as each arc is identified.Therefore, the monitoring routine maintains a table of all thearcs discovered, with counts of the numbers of times each istraversed during execution.This table is accessed once per routine call.Access to itmust be as fast as possible so as not to overwhelm the timerequired to execute the program..ppOur solution is to access the table through a hash table.We use the call site as the primary key with the calleeaddress being the secondary key.Since each call site typically calls only one callee, we canreduce (usually to one) the number of minor lookups based on the callee.Another alternative would use the callee as the primary key and thecall site as the secondary key.Such an organization has the advantage of associating callers withcallees, at the expense of longer lookups in the monitoringroutine.We are fortunate to be running in a virtual memory environment,and (for the sake of speed) were able to allocate enough spacefor the primary hash table to allow a one-to-one mapping fromcall site addresses to the primary hash table.Thus our hash function is trivial to calculate and collisionsoccur only for call sites that call multipledestinations (e.g. functional parameters and functional variables).A one level hash function using both call site and callee wouldresult in an unreasonably large hash table.Further, the number of dynamic call sites and callees is not known duringexecution of the profiled program..ppNot all callers and callees can be identified by the monitoringroutine.Routines that were compiled without the profiling augmentationswill not call the monitoring routine as part of their prologue,and thus no arcs will be recorded whose destinations are in theseroutines.One need not profile all the routines in a program.Routines that are not profiled run at full speed.Certain routines, notably exception handlers, are invoked bynon-standard calling sequences.Thus the monitoring routine may know the destination of an arc(the callee),but find it difficult orimpossible to determine the source of the arc (the caller).Often in these cases the apparent source of the arc is not a callsite at all.Such anomalous invocations are declared ``spontaneous''..sh 2 "Execution Times".ppThe execution times for routines can be gathered in at least twoways.One method measures the execution time of a routine by measuringthe elapsed time from routine entry to routine exit.Unfortunately, time measurement is complicated on time-sharingsystems by the time-slicing of the program.A second method samples the value of the program counter at someinterval, and infers execution time from the distribution of thesamples within the program.This technique is particularly suited to time-sharing systems,where the time-slicing can serve as the basis for samplingthe program counter.Notice that, whereas the first method could provide exact timings,the second is inherently a statistical approximation..ppThe sampling method need not require support from the operatingsystem:  all that is needed is the ability to set and respond to``alarm clock'' interrupts that run relative to program time.It is imperative that the intervals be uniform since thesampling of the program counter rather than the duration of theinterval is the basis of the distribution.If sampling is done too often, the interruptions to sample theprogram counter will overwhelm the running of the profiled program.On the other hand, the program must run for enough sampledintervals that the distribution of the samples accuratelyrepresents the distribution of time for the execution of theprogram.As with routine call tracing, the monitoring routine can notafford to output information for each program countersample.In our computing environment, the operating system can provide ahistogram of the location of the program counter at the end ofeach clock tick (1/60th of a second) in which a program runs.The histogram is assembled in memory as the program runs.This facility is enabled by our monitoring routine.We have adjusted the granularity of the histogram so thatprogram counter values map one-to-one onto the histogram.We make the simplifying assumption that all calls to a specificroutine require the same amount of time to execute.This assumption may disguise that some calls(or worse, some call sites) always invoke a routinesuch that its execution is faster (or slower)than the average time for that routine..ppWhen the profiled program terminates, the arc table and the histogram ofprogram counter samples is written to a file.The arc table is condensed to consist of the source and destinationaddresses of the arc and the count of the number of times the arcwas traversed by this execution of the program.The recorded histogram consists of counters of the number oftimes the program counter was found to be in each of the ranges coveredby the histogram.The ranges themselves are summarized as alower and upper bound and a step size.

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -