.\" Copyright (c) 1984 M. K. McKusick
.\" Copyright (c) 1984 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"    This product includes software developed by the University of
.\"    California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)2.t 1.3 (Berkeley) 11/8/90
.\"
.ds RH The \fIgprof\fP Profiler
.NH 1
The \fIgprof\fP Profiler
.PP
The purpose of the \fIgprof\fP profiling tool is to
help the user evaluate alternative implementations
of abstractions.
The \fIgprof\fP design takes advantage of the fact that the kernel,
though large, is structured and hierarchical.
We provide a profile in which the execution time
for a set of routines that implement an
abstraction is collected and charged
to that abstraction.
The profile can be used to compare and assess the costs of
various implementations [Graham82] [Graham83].
.NH 2
Data presentation
.PP
The data is presented to the user in two different formats.
The first presentation simply lists the routines
without regard to the amount of time their descendants use.
The second presentation incorporates the call graph of the
kernel.
.NH 3
The Flat Profile
.PP
The flat profile consists of a list of all the routines
that are called during execution of the kernel,
with the count of the number of times they are called
and the number of seconds of execution time for which they
are themselves accountable.
The routines are listed in decreasing order of execution time.
A list of the routines that are never called during execution of
the kernel is also available
to verify that nothing important is omitted by
this profiling run.
The flat profile gives a quick overview of the routines that are used,
and shows the routines that are themselves responsible
for large fractions of the
execution time.
In practice,
this profile usually shows that no single function
is overwhelmingly responsible for the total time of the kernel.
Notice that for this profile,
the individual times sum to the total execution time.
.NH 3
The Call Graph Profile
.PP
Ideally, we would like to print the call graph of the kernel,
but we are limited by the two-dimensional nature of our output
devices.
We cannot assume that a call graph is planar,
and even if it is, that we can print a planar version of it.
Instead, we choose to list each routine,
together with information about
the routines that are its direct parents and children.
This listing presents a window into the call graph.
Based on our experience,
both parent information and child information
are important,
and should be available without searching
through the output.
Figure 1 shows a sample \fIgprof\fP entry.
.KF
.DS L
.TS
box center;
c c c c c l l
c c c c c l l
c c c c c l l
l n n n c l l.
				called/total	\ \ parents
index	%time	self	descendants	called+self	name	index
				called/total	\ \ children
_
		0.20	1.20	4/10	\ \ \s-1CALLER1\s+1	[7]
		0.30	1.80	6/10	\ \ \s-1CALLER2\s+1	[1]
[2]	41.5	0.50	3.00	10+4	\s-1EXAMPLE\s+1	[2]
		1.50	1.00	20/40	\ \ \s-1SUB1\s+1 <cycle1>	[4]
		0.00	0.50	1/5	\ \ \s-1SUB2\s+1	[9]
		0.00	0.00	0/5	\ \ \s-1SUB3\s+1	[11]
.TE
.ce
Figure 1.  Profile entry for \s-1EXAMPLE\s+1.
.DE
.KE
.PP
The major entries of the call graph profile are the entries from the
flat profile, augmented by the time propagated to each
routine from its descendants.
This profile is sorted by the sum of the time for the routine
itself plus the time inherited from its descendants.
The profile shows which of the higher level routines
spend large portions of the total execution time
in the routines that they call.
For each routine, we show the amount of time passed by each child
to the routine, which includes time for the child itself
and for the descendants of the child
(and thus the descendants of the routine).
We also show the percentage these times represent of the total time
accounted to the child.
Similarly, the parents of each routine are listed, along with time,
and percentage of total routine time,
propagated to each one.
.PP
Cycles are handled as single entities.
The cycle as a whole is shown as though it were a single routine,
except that members of the cycle are listed in place of the children.
Although the number of calls to each member
from within the cycle is shown,
these calls do not affect time propagation.
When a child is a member of a cycle,
the time shown is the appropriate fraction of the time
for the whole cycle.
Self-recursive routines have their calls broken
down into calls from the outside and self-recursive calls.
Only the outside calls affect the propagation of time.
.PP
The example shown in Figure 2 is the fragment of a call graph
corresponding to the entry in the call graph profile listing
shown in Figure 1.
.KF
.DS L
.so fig2.pic
.ce
Figure 2.  Example call graph fragment.
.DE
.KE
.PP
The entry is for routine \s-1EXAMPLE\s+1, which has
the Caller routines as its parents,
and the Sub routines as its children.
The reader should keep in mind that all information
is given \fIwith respect to \s-1EXAMPLE\s+1\fP.
The index in the first column shows that \s-1EXAMPLE\s+1
is the second entry in the profile listing.
The \s-1EXAMPLE\s+1 routine is called ten times, four times by \s-1CALLER1\s+1,
and six times by \s-1CALLER2\s+1.
Consequently, 40% of \s-1EXAMPLE\s+1's time is propagated to \s-1CALLER1\s+1,
and 60% of \s-1EXAMPLE\s+1's time is propagated to \s-1CALLER2\s+1.
The self and descendant fields of the parents
show the amount of self and descendant time \s-1EXAMPLE\s+1
propagates to them (but not the time used by
the parents directly).
Note that \s-1EXAMPLE\s+1 calls itself recursively four times.
The routine \s-1EXAMPLE\s+1 calls routine \s-1SUB1\s+1 twenty times, \s-1SUB2\s+1 once,
and never calls \s-1SUB3\s+1.
Since \s-1SUB2\s+1 is called a total of five times,
20% of its self and descendant time is propagated to \s-1EXAMPLE\s+1's
descendant time field.
Because \s-1SUB1\s+1 is a member of \fIcycle 1\fR,
the self and descendant times
and call count fraction
are those for the cycle as a whole.
Since cycle 1 is called a total of forty times
(not counting calls among members of the cycle),
it propagates 50% of the cycle's self and descendant
time to \s-1EXAMPLE\s+1's descendant time field.
Finally, each name is followed by an index that shows
where on the listing to find the entry for that routine.
.NH 2
Profiling the Kernel
.PP
A 4.2BSD kernel that automatically
collects profiling information as it operates
can be built simply by specifying the
.B \-p
option to \fIconfig\fP\|(8) when configuring a kernel.
The program counter sampling can be driven by the system clock,
or by an alternate real time clock.
The latter is highly recommended, as use of the system clock results
in statistical anomalies in accounting for
the time spent in the kernel
clock routine.
.PP
Once a profiling system has been booted, statistics gathering is
handled by \fIkgmon\fP\|(8).
\fIKgmon\fP allows profiling to be started and stopped
and the internal state of the profiling buffers to be dumped.
\fIKgmon\fP can also be used to reset the state of the internal
buffers to allow multiple experiments to be run without
rebooting the machine.
The profiling data can then be processed with \fIgprof\fP\|(1)
to obtain information regarding the system's operation.
.PP
A profiled system is about 5-10% larger in its text space because of
the calls to count the subroutine invocations.
When the system executes,
the profiling data is stored in a buffer that is 1.2
times the size of the text space.
All the information is summarized in memory,
so it is not necessary to have a trace file
being continuously dumped to disk.
The overhead for running a profiled system varies;
under normal load we see anywhere from 5 to 25%
of the system time spent in the profiling code.
Thus the system is noticeably slower than an unprofiled system,
yet is not so bad that it cannot be used in a production environment.
This is important since it allows us to gather data
in a real environment rather than trying to
devise synthetic work loads.
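.PP
The time propagation described for the \s-1EXAMPLE\s+1 entry of Figure 1
can be checked with a short sketch.
The Python fragment below is illustrative only and is not part of \fIgprof\fP;
the function name \fIshare\fP and the variable names are assumptions
made for this example,
and the numbers are those printed in Figure 1.
It splits the self and descendant time of \s-1EXAMPLE\s+1 among its parents
in proportion to their call counts,
leaving out the four self-recursive calls,
and checks that the time passed up by the children
sums to the descendant time of \s-1EXAMPLE\s+1.
.DS L
```python
from fractions import Fraction


def share(time, calls, total_calls):
    """Portion of `time` charged to a caller that made `calls` of `total_calls`.

    Hypothetical helper for this sketch; it is not a gprof function.
    """
    if total_calls == 0:
        return Fraction(0)
    return Fraction(time) * Fraction(calls, total_calls)


# Values read from the EXAMPLE line of Figure 1 (seconds).
EXAMPLE_SELF = Fraction("0.50")
EXAMPLE_DESC = Fraction("3.00")
OUTSIDE_CALLS = 10  # the "+4" self-recursive calls do not propagate time

# Calls made to EXAMPLE by each parent.
parents = {"CALLER1": 4, "CALLER2": 6}

propagated = {}
for name, calls in parents.items():
    self_part = share(EXAMPLE_SELF, calls, OUTSIDE_CALLS)
    desc_part = share(EXAMPLE_DESC, calls, OUTSIDE_CALLS)
    propagated[name] = (self_part, desc_part)
    print(name, float(self_part), float(desc_part))

# Self and descendant time each child passes to EXAMPLE, as listed
# on the child lines of Figure 1.
children = [
    ("SUB1 <cycle1>", "1.50", "1.00"),
    ("SUB2", "0.00", "0.50"),
    ("SUB3", "0.00", "0.00"),
]
descendant_total = sum(
    (Fraction(s) + Fraction(d) for _, s, d in children), Fraction(0)
)
print("descendants", float(descendant_total))
```
.DE
.PP
Run as written, the sketch reports 0.2 and 1.2 seconds for \s-1CALLER1\s+1
and 0.3 and 1.8 seconds for \s-1CALLER2\s+1,
which match the parent lines of Figure 1,
and a descendant total of 3.0 seconds,
which matches the descendants field of \s-1EXAMPLE\s+1.
Exact fractions are used rather than floating point
so that the comparisons are not disturbed by rounding.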