📄 mal_profiler.mx
字号:
@' The contents of this file are subject to the MonetDB Public License@' Version 1.1 (the "License"); you may not use this file except in@' compliance with the License. You may obtain a copy of the License at@' http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html@'@' Software distributed under the License is distributed on an "AS IS"@' basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the@' License for the specific language governing rights and limitations@' under the License.@'@' The Original Code is the MonetDB Database System.@'@' The Initial Developer of the Original Code is CWI.@' Portions created by CWI are Copyright (C) 1997-2007 CWI.@' All Rights Reserved.@a M. Kersten@v 0.0@* The MAL ProfilerA key issue in the road towards a high performance implementationis to understand where resources are being spent. This informationcan be obtained using different tools and at different levels ofabstraction.A coarse grain insight for a particular application can be obtainedusing injection of the necessary performancecapturing statements in the instruction sequence.Fine-grain, platform specific information can be obtained usingexisting profilers, like valgrind (http://www.valgrind.org),or hardware performance counters.The MAL profiler collects detailed performance information, suchas cpu, memory and statement information. It is optionally extendedwith IO activity, which is needed for coarse grain profiling only.The execution profiler is supported by hooks in the MAL interpreter.The default strategy is to ship an event record immediately over a streamto a separate performance monitor, formatted as a tuple.An alternative strategy is preparation for off-line performance analysis.For this case the event record is turned into a XML structure andsent over the stream (most likely linked with a file)Reflective performance analysis is supported by an event cache,emptied explicitly upon need.@menu* Event Filtering ::* Event Caching::@end menu@node Event Filtering, Event Caching, The MAL Profiler, The MAL Profiler@+ Event Filtering The profiler supports selective retrieval of such information bytagging the instructions of interest. This means that a profilercall has a global effect,all concurrent users are affected by the performance overhead.Therefore, it is of primary interest to single user sessions.The example below illustrates how the different performancecounter groups are activated, instructions are filtered fortracking, and where the profile information is retained fora posteriori analysis. The tuple format is chosen.@exampleprofiler.activate("time");profiler.activate("ticks");#profiler.activate("cpu");#profiler.activate("memory");#profiler.activate("io");#profiler.activate("pc");#profiler.activate("event");profiler.activate("statement");profiler.setFilter("*","insert");profiler.setFilter("*","print");profiler.openStream("/tmp/MonetDBevents");profiler.start();b:= bbp.new(:int,:int);bat.insert(b,1,15);bat.insert(b,2,4);bat.insert(b,3,9);io.print(b);profiler.stop();profiler.closeStream();@end exampleThe information on CPU, IO, memory use, program counterand the event identifier are not retained. Furthermore, we are interested in allfunctions name @sc{insert} and @sc{print}. A wildcardcan be used to signify any name, e.g.no constraints are put on the module in which the operationsare defined.Execution of the sample leads to the creation of a file withthe following content. The ticks are measured in micro-seconds.@verbatim# time, ticks, stmt, # name[ "15:17:56", 12, "_27 := bat.insert(<tmp_15>,1,15);", ][ "15:17:56", 2, "_30 := bat.insert(<tmp_15>,2,4);", ][ "15:17:56", 2, "_33 := bat.insert(<tmp_15>,3,9);", ][ "15:17:56", 245, "_36 := io.print(<tmp_15>);", ]@end verbatimThe XML-formatted output is produced by encapsulationof a code block with the pair @sc{profiler.setOfflineProfiling()}.The default is tuple format, which can be requestedexplicitly using @sc{profiler.setOnlineProfiling()}.@node Event Caching, The MAL Modules, Event Filtering, The MAL Profiler@+ Event CachingAside from shipping events to a separate process, the profilercan keep the events in a local @sc{BAT} group, triggered with the command @sc{profiler.setCachedProfiling()}.Ofcourse, every measurement scheme does not come for free and mayeven obscure performance measurements obtained throug e.g. valgrind.The separate event caches can be accessed using the operator @sc{profiler.getTrace}(@emph{name}).The current implementation only supports access to @sc{time},@sc{ticks},@sc{pc},@sc{statement}.The event cache can be cleared with @sc{profiler.clearTrace()}.Consider the following MAL program snippet:@exampleprofiler.setAll();profiler.setCachedProfiling();profiler.startTrace();b:= bbp.new(:int,:int);bat.insert(b,1,15);io.print(b);profiler.stopTrace();s:= profiler.getTrace("statement");t:= profiler.getTrace("ticks");io.print(s,t);@end exampleThe result of the program execution becomes:@verbatim#-----------------## h t # name# int int # type#-----------------#[ 1, 15 ]#---------------------------------------------------------## h t t # name# int str int # type#---------------------------------------------------------#[ 1, "b := bbp.new(0,0);", 51 ][ 2, "$6 := bat.insert(<tmp_22>,1,15);", 16 ][ 3, "$9 := io.print(<tmp_22>);", 189 ]@end verbatim@{The profiler is activated by defining MALprofiler in the source tree.This leads to securing space for the counters and for codein the interpreter to collect the information.@h#ifndef _MAL_PROFILER_H#define _MAL_PROFILER_H#include "mal_client.h"#ifdef HAVE_SYS_TIMES_H# include <sys/times.h>#endif#if defined(SOLARIS) && OSVER < 560#include "/usr/ucbinclude/sys/rusage.h"#include "/usr/ucbinclude/sys/resource.h"#endif#ifdef HAVE_SYS_RESOURCE_H#include <sys/resource.h>typedef struct rusage Rusage;#endiftypedef struct tms Tms;typedef struct mallinfo Mallinfo;mal_export str activateCounter(str name);mal_export str deactivateCounter(str name);mal_export str openProfilerStream(void);mal_export str closeProfilerStream(void);mal_export void initProfiler(MalBlkPtr mb);mal_export void profilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc);mal_export str setLogFile(Module cntxt, str fname);mal_export str setStartPoint(Module cntxt, str mod, str fcn);mal_export str setEndPoint(Module cntxt, str mod, str fcn);mal_export void setCachedProfiling(void);mal_export void setOnlineProfiling(void);mal_export void setOfflineProfiling(void);mal_export int profilerAvailable(void);mal_export str startProfiling(void);mal_export str stopProfiling(void);mal_export str cleanupProfiler(void);mal_export int instrFilter(InstrPtr pci, str mod, str fcn);mal_export void setFilter(Module cntxt, str mod, str fcn);mal_export void resetFilter(Module cntxt, str mod, str fcn);mal_export void profileReport(Module cntxt, int outer, stream *out);mal_export void MPresetProfiler(stream *fdout);mal_export int malProfileMode;mal_export void startTrace(void);mal_export void stopTrace(void);mal_export void clearTrace(void);mal_export BAT *getTrace(str ev);#define DEBUG_MAL_PROFILER#endif@+ Performance tracingThe interpreter comes with several variables to hold performancerelated data.Every MAL instruction record is extended with two fields: counter and timer.The counter is incremented each time the instruction is taken intoexecution. Upon return, the timer is incremented with the microsecondsspent.In addition to the default performance data collection,the user can request performance events to be collected on a statementbasis. Care should be taken, because it leads to a large trace file,unless the results are directly passed to a performance monitorfront-end for filtering and summarization.The performance monitor has exclusive access to the event file, whichavoid concurrency conflicts amongst clients. It avoid clutheredevent records on the event stream. Since this event stream is ownedby a client, we should ensure that the profiler is automatically bereset once the owner leaves. The routine profilerReset() handles the case.@c#include "mal_config.h"#include "mal_function.h"#include "mal_profiler.h"#include "mal_debugger.h"#ifdef MALprofilerstatic MT_Lock profileLock ;static int profileLockInitialized = 0;static stream *eventstream = 0;static int onlineProfiling = TRUE;static int offlineProfiling = FALSE;static int cachedProfiling = FALSE;intprofilerAvailable(void){ return 1;}#elseintprofilerAvailable(void){ return 0;}#endifstatic void onlineProfilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc);static void offlineProfilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc);static void cachedProfilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc);int malProfileMode = 0; /* global flag to indicate profiling mode */static int profileAll = 0; /* all instructions should be profiled */static int delayswitch = 0; /* to wait before sending the profile info */@c#define PROFtime 0#define PROFlabel 1#define PROFcpu 2#define PROFmemory 3#define PROFio 4#define PROFpc 5#define PROFevent 6#define PROFstmt 7static struct { str name; /* which logical counter is needed */ int status; /* trace it or not */} profileCounter[] = { /* 0 */ { "time", 0}, /* 1 */ { "ticks", 0}, /* 2 */ { "cpu", 0}, /* 3 */ { "memory", 0}, /* 4 */ { "io", 0}, /* 5 */ { "pc", 0}, /* 6 */ { "event", 0}, /* 7 */ { "statement", 0}, /* 8 */ { 0, 0}};@-The counters can be set individually. The XML output uses the order of theirdefinition.@= setCounter int i; for(i=0; profileCounter[i].name;i++) if( strcmp(profileCounter[i].name,name)==0){ profileCounter[i].status = @1 ; return 0; } throw(MAL, "@2", "profile counter '%s' undefined",name);@cstractivateCounter(str name){ @:setCounter(1, activateCounter)@}strdeactivateCounter(str name){ @:setCounter(0, deactivateCounter)@}@-The parameter of getEventStream is the return value of the enclosing function: profilerEvent: none/void setStartPoint: a str setEndPoint: a str@= getEventStream if( ! profileLockInitialized) { MT_lock_init(&profileLock); profileLockInitialized= 1; } mal_set_lock(profileLock,"profileLock"); if( eventstream == NULL) { mal_unset_lock(profileLock,"profileLock"); return @1; }@- Event dispatchingThe profiler strategy is encapsulated hereOnline- and offline-profiling are disjoint options, because theycurrently share the same file descriptor in the client record.@cvoidprofilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc){#ifdef MALprofiler if( mb->profiler == NULL) return; if (onlineProfiling) onlineProfilerEvent(cntxt, mb, stk, pc); if (offlineProfiling) offlineProfilerEvent(cntxt, mb, stk, pc); if (cachedProfiling) cachedProfilerEvent(cntxt, mb, stk, pc);#endif}@+ Online procecingThe online processing structure is the easiest. We merely have toproduce a correct tuple format for the front-end.@c#define log(X,Y) stream_printf(eventstream,X,Y);#define log0(X) stream_printf(eventstream,X);#define log2(X,Y,Z) stream_printf(eventstream,X,Y,Z);#define flushLog() stream_flush(eventstream);#ifdef MALprofilerstatic voidonlineProfilerHeader(void){ @:getEventStream()@ /* profilerEvent -> void */ log0("# "); if (profileCounter[PROFpc].status) { log0("caller,\t"); log0("callee,\t"); } if (profileCounter[PROFtime].status) { log0("time,\t"); } if (profileCounter[PROFlabel].status) { log0("ticks,\t"); } if (profileCounter[PROFstmt].status) { log0("stmt,\t"); } if (profileCounter[PROFcpu].status) { log0("utime,\t"); log0("cutime,\t"); log0("stime,\t"); log0("cstime,\t"); } if (profileCounter[PROFmemory].status) { log0("maxrss,\t"); log0("arena,\t"); log0("ordblks,\t"); log0("smblks,\t"); log0("hblkhd,\t"); log0("hblks,\t"); log0("fsmblks,\t"); log0("uordblks,\t"); } if (profileCounter[PROFio].status) { log0("page reclaim,\t"); log0("page faults,\t"); log0("swaps,\t"); log0("block reads,\t"); log0("block writes,\t"); log0("context switch,\t"); log0("involunarty switch,\t"); } if (profileCounter[PROFevent].status) { log0("event\t# name\n"); } else log0("# name\n"); flushLog(); mal_unset_lock(profileLock, "profileLock");}voidonlineProfilerEvent(Module cntxt, MalBlkPtr mb, MalStkPtr stk, int pc){ static struct mallinfo prevMalloc; InstrPtr pci = getInstrPtr(mb,pc);#ifdef HAVE_SYS_RESOURCE_H static struct rusage prevUsage; struct rusage infoUsage;#endif static int eventcounter;#ifdef HAVE_TIMES struct tms newTms;#endif struct mallinfo infoMalloc; str stmt, c; (void) cntxt; if (delayswitch > 0) { /* first call to profiled */ onlineProfilerHeader(); delayswitch--; return; } @:getEventStream()@ /* profilerEvent -> void */ if (delayswitch == 0) { delayswitch = -1; } if (!profileAll && mb->profiler[pc].trace == FALSE) return;#ifdef HAVE_TIMES times(&newTms);#endif infoMalloc = MT_mallinfo();#ifdef HAVE_SYS_RESOURCE_H getrusage(RUSAGE_SELF, &infoUsage);#endif /* make basic profile event tuple */ log0("[ "); if (profileCounter[PROFpc].status) { str nme = ""; if (stk->blk) nme = getFunctionId(getInstrPtr(stk->blk, 0)); /* get identity of caller */ log2("%s.%d,\t", nme, getPC(mb, pci)); if (getModuleId(pci) && getFunctionId(pci)) { log2("\"%s.%s\",\t", getModuleId(pci), getFunctionId(pci)); } else { log("\"%s\",\t", operatorName(pci->token)); } } if (profileCounter[PROFtime].status) { char *tbuf, *c; tbuf= ctime(&mb->profiler[pc].clock); c = strchr(tbuf, '\n'); if (c) { c[-5] = '"'; c[-4] = 0; } tbuf[10] = '"'; log("%s,\t", tbuf + 10);
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -