📄 opt_garbagecollector.mx
@' The contents of this file are subject to the MonetDB Public License
@' Version 1.1 (the "License"); you may not use this file except in
@' compliance with the License. You may obtain a copy of the License at
@' http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html
@'
@' Software distributed under the License is distributed on an "AS IS"
@' basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the
@' License for the specific language governing rights and limitations
@' under the License.
@'
@' The Original Code is the MonetDB Database System.
@'
@' The Initial Developer of the Original Code is CWI.
@' Portions created by CWI are Copyright (C) 1997-2007 CWI.
@' All Rights Reserved.
@f opt_garbageCollector
@a M. Kersten
@- Garbage Collection
Garbage collection of temporary variables, such as strings and BATs,
takes place upon returning from a function call. Especially for BATs
this may keep sizable resources locked longer than strictly necessary.
Although the programmer can influence their lifespan by assignment
of the @sc{nil}, thereby triggering the garbage collector,
it is more appropriate to rely on an optimizer to inject these statements.
Doing so keeps the program smaller and makes it a better
target for code optimizations.
The operation @sc{optimizer.garbageCollector()} removes all BAT references
that are at their end of life to make room for new ones.
It is typically called as one of the last optimizer steps.
A snippet showing the effect of the garbage collector:
@verbatim
    t1 := bat.new(:oid,:int);
    t2 := array.grid(132000,8,1,0);
    t3 := array.grid(1,100,10560,0);
    t4 := array.grid(1,100,10560,0,8);
    t5 := batcalc.+(t2,t4);
    t6 := batcalc.oid(t5);
    t7 := algebra.join(t6,t1);
    optimizer.garbageCollector();
@end verbatim
is translated into the following code block:
@verbatim
    t1 := bat.new(:oid,:int);
    t2 := array.grid(132000,8,1,0);
    t3 := array.grid(1,100,10560,0);
    t4 := array.grid(1,100,10560,0,8);
    t5 := batcalc.+(t2,t4);
    bat.setGarbage(t2);
    bat.setGarbage(t4);
    t6 := batcalc.oid(t5);
    bat.setGarbage(t5);
    t7 := algebra.join(t6,t1);
    bat.setGarbage(t6);
    bat.setGarbage(t1);
@end verbatim
The current algorithm is straightforward. After each instruction
we check whether its BAT arguments are needed in the future.
If not, we inject a garbage collection statement to release them,
provided there are no other reasons to retain them.
This should be done carefully, because the instruction may be part of a loop.
If the variable is defined inside the loop, we can safely remove it.
@{
A variable can only be released in the scope in which
it is introduced.
This means we need an overview of
the scope nesting and must maintain a list of variables
still to be garbage collected.
We don't have to worry about a premature return from the
function, because this will trigger garbage collection
anyway.
This optimizer should not be called when the scheduler
intends to keep intermediates around for re-use.
@mal
pattern optimizer.garbageCollector():str
address OPTgarbageCollector;
pattern optimizer.garbageCollector(mod:str, fcn:str):str
address OPTgarbageCollector
comment "Garbage collector optimizer";
@h
#ifndef _MAL_GARBAGE_
#define _MAL_GARBAGE_
#include "opt_support.h"

/* #define DEBUG_OPT_GARBAGE show partial result */
#endif
@c
#include "mal_config.h"
#include "opt_garbageCollector.h"
#include "mal_interpreter.h"	/* for showErrors() */
#include "mal_builder.h"
#include "opt_prelude.h"
@-
There are two basic ways to release a BAT. The cheapest one is
to just assign a nil value, which triggers the decrement
of the reference count. The second option is to call a function,
which could take care of more things, such as saving
potentially interesting results or issuing memory map advice.
Furthermore, it makes sense to only release larger temporary BATs
during the execution, because they may unnecessarily push base
tables out of memory.
@= releaseBAT
{
	q= newInstruction(NULL,ASSIGNsymbol);
	getArg(q,0) = getArg(p,j);
	pushNil(mb,q, TYPE_bat);
	pushInstruction(mb,q);
	typeChecker(s,mb,q,TRUE);
}
@= releaseBATbyFunction
{
	q= newInstruction(NULL,ASSIGNsymbol);
	setModuleId(q,batRef);
	setFunctionId(q,putName("flush",5));
	pushArgument(mb,q,getArg(p,j));
	getArg(q,0) = newTmpVariable(mb,TYPE_any);
	pushInstruction(mb,q);
	typeChecker(s,mb,q,TRUE);
}
@-
One source of resource consumption is the auxiliary
data structures introduced to speed up an algorithm,
e.g.
building a hash table.
Since such structures are 'dirty' memory pages, they may
become the target for a forced write to disk.
To avoid this situation, the garbage collector injects
an early release of resources.
This step is only taken against private (=temporary)
tables.
@= releaseHash
{
	q= newInstruction(NULL,ASSIGNsymbol);
	setModuleId(q,batRef);
	setFunctionId(q,putName("reduce",6));
	pushArgument(mb,q,getArg(p,j));
	getArg(q,0) = getArg(p,j);
	pushInstruction(mb,q);
	typeChecker(s,mb,q,TRUE);
	actions++;
}
@c
static int
OPTgarbageCollectorImplementation(MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
	int i, j, k, limit, done;
	InstrPtr p, q, *stmt;
	VarPtr v;
	Client cntxt = MCgetClient();
	Module s = cntxt->nspace;
	int top = 0, blk = 1, actions = 0;
	str joinPathRef= putName("joinPath",8);

	(void) pci;
	(void) stk;		/* to fool compilers */
	setLifespan(mb);
	stmt = (InstrPtr *) GDKmalloc(mb->ssize * sizeof(InstrPtr));
	memcpy(stmt, mb->stmt, mb->ssize * sizeof(InstrPtr));
	memset((char*) mb->stmt,0, mb->ssize * sizeof(InstrPtr));
	limit = mb->stop;
	/* move to stable start */
	mb->stop = 0;
	for (i = 0; i < limit; i++) {
		p = stmt[i];
		pushInstruction(mb, p);
		for (j = p->retc; j < p->argc; j++) {
			v = getVar(mb, getArg(p, j));
			if (v->endLifespan == i && isaBatType(getArgType(mb, p, j))) {
				/* avoid duplicate releases */
				done = 0;
				for (k = j - 1; k >= p->retc; k--)
					if (getArg(p, j) == getArg(p, k))
						done++;
				if (done == 0 ){
#ifdef DEBUG_OPT_GARBAGE
					printf("remove the variable %s at %d\n", getArgName(mb,p,j),i);
#endif
					if (getVarScope(mb, getArg(p, j)) == blk) {
						/* All persistent BATs are advised for unmapping.
						   They are recognized at compile time using the bid
						   property set by e.g. the sqloptimizer */
/* Activation of this code block drastically reduced the working
set on TPCH, with a severe performance drop.
						int *bid= (int*) getPropertyValue(v->props,"bid");
						if(bid)
							@:releaseBATbyFunction()@
						else
*/
						@:releaseBAT()@
						actions++;
					}
				}
			} else
			/* reduce the memory footprint for non-target arguments */
			if(getArg(p,0)!= j && isaBatType(getArgType(mb, p, j))) {
				/* don't touch persistent (SQL) BATs */
				int *bid= (int*) getPropertyValue(v->props,"bid");
				if( !bid ){
					if( getModuleId(p) == algebraRef &&
						( getFunctionId(p) == joinRef ||
						  getFunctionId(p) == joinPathRef ||
						  getFunctionId(p) == sortRef ||
						  getFunctionId(p) == selectRef ||
						  getFunctionId(p) == kdifferenceRef ||
						  getFunctionId(p) == kunionRef ||
						  getFunctionId(p) == semijoinRef ) )
						@:releaseHash()@
					if( getModuleId(p)== aggrRef)
						@:releaseHash()@
					if( getModuleId(p)== groupRef)
						@:releaseHash()@
				}
			}
			if (blockStart(p)) {
				blk++;
				if (top < MAXDEPTH - 2) {
				} else {
					mb->errors++;
					showException(MAL,"optimizer.garbageCollector",
						"Too deeply nested MAL program");
				}
			}
			if (blockExit(p))
				if (top > 0) {
					top--;
				}
		}
	}
#ifdef DEBUG_OPT_GARBAGE
	{
		stream_printf(GDKout, "Garbage collected BAT variables \n");
		printFunction(GDKout, mb, LIST_MAL_ALL);
		stream_printf(GDKout, "End of GCoptimizer\n");
	}
#endif
	GDKfree(stmt);
	return actions;
}
@include optimizerWrapper.mx
@h
@:exportOptimizer(garbageCollector)@
@c
@:wrapOptimizer(garbageCollector,OPT_CHECK_ALL)@
@}
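@-
The end-of-life rule described above can be illustrated in isolation with a small stand-alone sketch. It is not the MonetDB implementation: the types ToyInstr and ToyRelease, the functions compute_end_of_life and inject_releases, and the fixed bounds are hypothetical names introduced only for this illustration, and scope nesting (loops, blocks) is deliberately ignored. The sketch records the last instruction that uses each variable as an argument, then reports a release right after that instruction for every BAT-typed argument, skipping duplicates within one instruction.

```c
#include <assert.h>

#define MAX_ARGS 4	/* hypothetical bound for this toy model */
#define MAX_VARS 32	/* hypothetical bound for this toy model */

/* Toy instruction: one result variable and a few argument variables. */
typedef struct {
	int result;		/* variable index defined by the instruction */
	int argc;		/* number of arguments in use */
	int args[MAX_ARGS];	/* variable indices used as arguments */
} ToyInstr;

/* One injected release: variable 'var' is released after instruction 'after'. */
typedef struct {
	int after;
	int var;
} ToyRelease;

/* For each variable, record the index of the last instruction that
 * uses it as an argument, or -1 if it is never used. */
static void
compute_end_of_life(const ToyInstr *prog, int n, int *eol, int nvars)
{
	int i, j, v;

	for (v = 0; v < nvars; v++)
		eol[v] = -1;
	for (i = 0; i < n; i++)
		for (j = 0; j < prog[i].argc; j++) {
			assert(prog[i].args[j] >= 0 && prog[i].args[j] < nvars);
			eol[prog[i].args[j]] = i;
		}
}

/* Walk the program and collect a release for every BAT argument whose
 * last use is the current instruction. Returns the number of releases. */
static int
inject_releases(const ToyInstr *prog, int n, const int *is_bat, int nvars,
		ToyRelease *out, int cap)
{
	int eol[MAX_VARS];
	int i, j, k, count = 0;

	assert(nvars <= MAX_VARS);
	compute_end_of_life(prog, n, eol, nvars);
	for (i = 0; i < n; i++) {
		for (j = 0; j < prog[i].argc; j++) {
			int v = prog[i].args[j];
			int seen = 0;

			if (eol[v] != i || !is_bat[v])
				continue;
			/* avoid duplicate releases for a repeated argument */
			for (k = 0; k < j; k++)
				if (prog[i].args[k] == v)
					seen = 1;
			if (seen)
				continue;
			assert(count < cap);
			out[count].after = i;
			out[count].var = v;
			count++;
		}
	}
	return count;
}
```

Encoding the t1..t7 snippet from the introduction as seven toy instructions, with only t5, t6 and t7 taking BAT arguments, gives releases in the same order as the translated code block: t2 and t4 after the fifth instruction, t5 after the sixth, and t6 and t1 after the seventh. As in that code block, t3 is never used as an argument and so gets no release.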