@' The contents of this file are subject to the MonetDB Public License
@' Version 1.1 (the "License"); you may not use this file except in
@' compliance with the License. You may obtain a copy of the License at
@' http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html
@'
@' Software distributed under the License is distributed on an "AS IS"
@' basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the
@' License for the specific language governing rights and limitations
@' under the License.
@'
@' The Original Code is the MonetDB Database System.
@'
@' The Initial Developer of the Original Code is CWI.
@' Portions created by CWI are Copyright (C) 1997-2007 CWI.
@' All Rights Reserved.

@f mal
@-
@node Design Considerations, Architecture Overview, Design Overview, Design Overview
@+ Design Considerations

The redesign of the MonetDB software stack was driven by the need to
reduce the effort to extend the system into novel directions
and to reduce the Total Execution Cost (TEC).
The TEC is what an end-user or application program will notice.
It is composed of several cost factors:
@itemize
@item A) API message handling
@item P) Parsing and semantic analysis
@item O) Optimization and plan generation
@item D) Data access to the persistent store
@item E) Execution of the query terms
@item R) Result delivery to the application
@end itemize

Choosing an architecture for processing database operations presupposes an
intuition on how the cost will be distributed. In an OLTP
setting you expect most of the cost to be in (P,O), while in OLAP it will
be (D,E,R). In a distributed setting the components (O,D,E) are dominant.
Web applications would focus on (A,E,R).

Such a simple characterization ignores the widespread
differences that can be experienced at each level. To illustrate,
in D) and R) it makes a big difference whether the data is already in the
cache or still on disk. With E) it makes a big difference whether you
are comparing two integers, evaluating a mathematical function, e.g.
a Gaussian, or matching a regular expression against a string.
As a result, intense optimization in one area may become completely invisible
because it is overshadowed by other cost factors.

The Version 5 infrastructure is designed to ease addressing each
of these cost factors in a well-defined way, while retaining the
flexibility to combine the components needed for a particular situation.
The result is an architecture in which components are assembled
for a particular application domain and hardware platform.

The primary interface to the database kernel is still based on
the exchange of text in the form of queries and simply formatted results.
This interface is designed for ease of interpretation and versatility, and
is flexible enough to accommodate system debugging and application tool development.
Although a textual interface potentially leads to a performance degradation,
our experience with earlier system versions
showed that the overhead can be kept within acceptable bounds.
Moreover, a textual interface reduces the programming
effort otherwise needed to develop test and application programs.
The trend towards XML as the language for tool interaction supports our decision.
@-
@node Architecture Overview, MAL Synopsis, Design Considerations, Design Overview
@+ Architecture Overview

The architecture is built around a few independent components:
the MonetDB server, the MonetDB guardian, and the client application.
The MonetDB server is the heart of the system; it manages a single
physical database on one machine for all (concurrent) applications.
The guardian program works alongside a single server, keeping
an eye on its behavior. If the server accidentally crashes, it is this program
that will attempt an automatic restart.
The server and the guardian are managed with the @sc{monetdb} script,
introduced in @ref{Start and Stop}.

The top layer consists of applications written in your favorite
language.
They provide both specific functionality
for a particular product, e.g.
@url{http://kdl.cs.umass.edu/software,Proximity},
and generic functionality, e.g. the
@url{http://www.aquafold.com,Aquabrowser} or
@url{http://www.minq.se,Dbvisualizer}.
The applications communicate with the server using de-facto standard interface packages,
e.g. JDBC, ODBC, Perl, PHP, etc.

The middle layer consists of query language processors such as
SQL and XQuery. The former supports the core functionality
of SQL'99 and extends into SQL'03. The latter is based on
the W3C standard and includes the XUpdate functionality.
The query language processors each manage their own private catalog structure.
Software bridges, e.g. import/export routines, are used to
share data between language paradigms.

@iftex
@image{base00,,,,.pdf}
@emph{Figure 2.1}
@end iftex
@-
@node MAL Synopsis, Execution Engine, Architecture Overview, Design Overview
@+ MonetDB Assembly Language (MAL)

The target language for a query compiler is the MonetDB Assembly Language (MAL).
It was designed for easy code generation and fast interpretation by the server.
The compiler produces algebraic query plans, which are turned into physical execution
plans by the MAL optimizers.

A compiler either emits an @sc{ascii} representation
of the MAL program, or it is tightly coupled with
the server to save parsing and communication overhead.

A snippet of the MAL code produced by the SQL compiler
for the query @sc{select count(*) from tables}
is shown below. It illustrates a sequence of relational
operations against a table column that produces a
partial result.

@example
	...
    _22:bat[:oid,:oid] := sql.bind_dbat("tmp","_tables",0);
    _23 := bat.reverse(_22);
    _24 := algebra.kdifference(_20,_23);
    _25 := algebra.markT(_24,0:oid);
    _26 := bat.reverse(_25);
    _27 := algebra.join(_26,_20);
    _28 := bat.setWriteMode(_19);
    bat.append(_28,_27,true);
	...
@end example

MAL supports the full breadth of computational paradigms deployed in a database setting.
It is a language framework
where the execution semantics is determined by the
code transformations and the final engine chosen.

The design and implementation of MAL takes the functionality offered
previously a significant step further. To name a few:
@itemize @bullet
@item All instructions are strongly typed before being executed.
@item Polymorphic functions are supported.
They act as templates that produce strongly typed instantiations when needed.
@item Function-style expressions are supported, where
each assignment instruction can receive multiple target results;
each such instruction forms a point in the dataflow graph.
@item Co-routines (Factories) support building streaming applications.
@item Properties are associated with the program code for
ease of optimization and scheduling.
@item It can be readily extended with user defined types and
function modules.
@end itemize
@-
@{
@+ Critical sections and semaphores
Monet V5 is implemented as a collection of threads. This calls for extreme
care in coding. At several places locks and semaphores are necessary
to achieve predictable results. This holds in particular for structures that,
after creation, are inspected or modified to take decisions.
In the current implementation the following list of locks and semaphores
is used in the Monet layer:
@mal
@+ Monet Basic Definitions
Definitions that need to be included in every file of the Monet system,
as well as in user defined module implementations.
@h
#ifndef _MAL_H
#define _MAL_H

#include <gdk.h>
#include <gdk_utils.h>
#include <stream.h>

#ifdef WIN32
#ifndef LIBMAL
#define mal_export extern __declspec(dllimport)
#else
#define mal_export extern __declspec(dllexport)
#endif
#else
#define mal_export extern
#endif

@+ Monet Calling Options
The number of invocation arguments is kept to a minimum.
See the monetdb5.conf file for additional system variable settings.
@
@h
#define MAXSCRIPT 64

mal_export char monet_cwd[PATHLENGTH];
mal_export int monet_welcome;
mal_export str *monet_script;
mal_export int monet_daemon;

/* Locks and semaphores are only touched when GDKprotected is set. */
#define mal_set_lock(X,Y)	if(GDKprotected) MT_set_lock(X,Y)
#define mal_unset_lock(X,Y)	if(GDKprotected) MT_unset_lock(X,Y)
#define mal_up_sema(X,Y)	if(GDKprotected) MT_up_sema(X,Y)
#define mal_down_sema(X,Y)	if(GDKprotected) MT_down_sema(X,Y)
@c
#include <mal_config.h>
#include <mal.h>

char monet_cwd[PATHLENGTH] = { 0 };
int monet_welcome = 1;
str *monet_script;
int monet_daemon = 0;
@}
@-
@node Execution Engine, Session Scenarios, MAL Synopsis , Design Overview
@+ Execution Engine

The execution engine comes in several flavors. The default is a
simple, sequential MAL interpreter. For each MAL function call it creates
a stack frame, which is initialized with all constants found in the
function definition.
During interpretation the garbage collector
ensures that space-consuming tables (BATs) and strings are freed.
Furthermore, all temporary structures are garbage collected before
the function returns the result.

This simple approach leads to an accumulation of temporary variables.
They can be freed earlier in the process using an explicit garbage collection
command, but the general intent is to leave such decisions to an optimizer
or scheduler.

The execution engine is only called when all MAL instructions
can be resolved against the available libraries.
Most modules are loaded when the server starts using a
bootstrap script @sc{mal_init.mx}.
Failure to find the startup file terminates the session.
It most likely points to an error in the MonetDB configuration file.

During the boot phase, the global symbol table is initialized
with MAL function and factory definitions, and
the pre-compiled commands and patterns are loaded.
The libraries are dynamically loaded by default.
Expect tens of modules and hundreds of operations to become readily available.

Modules cannot be dropped without restarting the server.
The rationale behind this design decision is that a dynamic load/drop feature
is rarely used in practice and severely complicates the code base.
In particular, upon each access to the global symbol table we would have to be
prepared for concurrent threads actively changing its structure.
Dropping modules is especially problematic, because not all
references kept around can be detected.
This danger would require all accesses to global information to be packaged
in a critical section, which is known to be a severe performance hindrance.

@{
@h
mal_export MT_Lock mal_contextLock;
mal_export int mal_init(void);
mal_export void mal_exit(void);

/* This should be here, but cannot, as "Client" isn't known, yet ... |-(
 * For now, we move the prototype declaration to src/mal/mal_client.c,
 * the only place where it is currently used. Maybe, we should consider
 * also moving the implementation there...
 */

#define MALprofiler 1		/* activate the profiler */
/* #undef MALprofiler */

/* Listing modes are globally known */
#define LIST_INPUT	1	/* echo original input */
#define LIST_MAL_INSTR	2	/* show mal instruction */
#define LIST_MAL_TYPE	4	/* show type resolution */
#define LIST_MAL_PROPS	8	/* show optimizer properties */
#define LIST_MAL_ALL	(LIST_MAL_INSTR | LIST_MAL_TYPE | LIST_MAL_PROPS)

#define STRUCT_ALIGNED

#ifndef MAXPATHLEN
#define MAXPATHLEN 1024
#endif

#endif /* _MAL_H */
@c
#include "mal_config.h"
#include "mal_linker.h"
#include "mal_session.h"
#include "mal_parser.h"
#include "mal_interpreter.h"
#include "mal_namespace.h"	/* for initNamespace() */
#include "mal_client.h"
#include "mal_sabaoth.h"

MT_Lock mal_contextLock;
@-
Initialization of the MAL context

The compiler directive STRUCT_ALIGNED states that the
fields in the ValRecord all start at the same offset.
This knowledge avoids low-level type decodings, but should
be verified at least once for each platform.
@c
static void
tstAligned(void)
{
	int allAligned = 0;
	ValRecord v;
	ptr val, base;

	/* every field of the value part is compared against ival */
	base = (ptr) &v.val.ival;
	val = (ptr) &v.val.bval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.cval[0];
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.shval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.br.id;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.ival;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.oval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.pval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.fval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.dval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.lval;
	if (val != base) { allAligned = -1; }
	val = (ptr) &v.val.sval;
	if (val != base) { allAligned = -1; }
#ifdef STRUCT_ALIGNED
	if (allAligned < 0)
		GDKfatal("Recompile with STRUCT_ALIGNED flag disabled\n");
#else
	if (allAligned == 0)
		GDKfatal("Recompile with STRUCT_ALIGNED flag enabled\n");
#endif
}

int
mal_init(void)
{
	MT_lock_init(&mal_contextLock);
	tstAligned();
	initNamespace();
	initParser();
	/* a failed bootstrap is reported and signalled with -1 */
	if (malBootstrap() == 0) {
		showErrors();
		return -1;
	}
	return 0;
}
@-
Upon exit we should attempt to remove all allocated memory explicitly.
This seemingly superfluous action is necessary to simplify analysis of
memory leakage problems later on.
@c
void
mal_exit(void)
{
	Client cntxt = mal_clients;
	int t = 0;
	str err;

#ifdef MALprofiler
	stream *f;

	f = open_wastream("/tmp/Monet.prof");
	if (f != NULL) {
		profileReport(cntxt->nspace, 1, f);
		close_stream(f);
	}
#endif
#if 0
	/* skip this to solve random crashes, needs work */
	freeBoxes();
	freeModuleList(cntxt->nspace);
	mal_scope = 0;
	unloadLibraries();
	finishNamespace();
	if (cntxt->cwd)
		GDKfree(cntxt->cwd);
	if (cntxt->prompt)
		GDKfree(cntxt->prompt);
	if (cntxt->errbuf)
		GDKfree(cntxt->errbuf);
	if (cntxt->bak)
		GDKfree(cntxt->bak);
	if (cntxt->fdin) {
		/* missing protection against closing stdin stream */
		(void) stream_close(cntxt->fdin->s);
		(void) stream_destroy(cntxt->fdin->s);
		(void) bstream_destroy(cntxt->fdin);
	}
	if (cntxt->fdout && cntxt->fdout != GDKstdout) {
		(void) stream_close(cntxt->fdout);
		(void) stream_destroy(cntxt->fdout);
	}
#endif
	/* deregister everything that was registered, ignore errors */
	if ((err = SABAOTHwildRetreat(&t)) != MAL_SUCCEED) {
		fprintf(stderr, "!%s", err);
		GDKfree(err);
	}
	/* the server will now be shut down */
	if ((err = SABAOTHregisterStop(&t)) != MAL_SUCCEED) {
		fprintf(stderr, "!%s", err);
		GDKfree(err);
	}
	/* GDKexit(0); */
	MT_global_exit(0);
}
@}
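The STRUCT_ALIGNED test performed by tstAligned() can be tried in isolation, without GDK. The fragment below is only a sketch under stated assumptions: DemoValue and demoAllAligned are hypothetical names introduced for illustration, and the union lists just a few representative fields rather than the real ValRecord layout. It uses the standard offsetof macro instead of comparing addresses, which expresses the same "all fields start at the same offset" condition.

```c
#include <stddef.h>

/* Hypothetical stand-in for the value part of a ValRecord.
 * The real definition lives in GDK and has more fields. */
typedef union {
	int ival;
	char cval[4];
	short shval;
	float fval;
	double dval;
	long long lval;
	void *pval;
	char *sval;
} DemoValue;

/* Return 1 when every listed field starts at the same offset as
 * ival, and 0 otherwise. */
int
demoAllAligned(void)
{
	size_t base = offsetof(DemoValue, ival);

	return offsetof(DemoValue, cval) == base &&
	       offsetof(DemoValue, shval) == base &&
	       offsetof(DemoValue, fval) == base &&
	       offsetof(DemoValue, dval) == base &&
	       offsetof(DemoValue, lval) == base &&
	       offsetof(DemoValue, pval) == base &&
	       offsetof(DemoValue, sval) == base;
}
```

A caller would invoke demoAllAligned() once at startup and stop with a diagnostic when the result contradicts the compile-time flag, in the same spirit as the GDKfatal() calls above.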