⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 algebra.mx

📁 一个内存数据库的源代码这是服务器端还有客户端
💻 MX
📖 第 1 页 / 共 5 页
字号:
		not match an head-value in 'inner'.";command outerjoin( outer:bat[:any_1,:oid], inner:bat[:oid,:any_3]) 		:bat[:any_1,:any_3] address ALGouterjoincomment "Returns all the result of a join, plus the BUNS formed NIL in 		the tail and the head-values of 'outer' whose tail-value does 		not match an head-value in 'inner'.";command outerjoin( outer:bat[:any_1,:oid], inner:bat[:oid,:any_3]) 		:bat[:any_1,:any_3] address ALGouterjoincomment "Returns all the result of a join, plus the BUNS formed NIL in 		the tail and the head-values of 'outer' whose tail-value does 		not match an head-value in 'inner'.";command outerjoin( outer:bat[:any_1,:any_2], inner:bat[:any_2,:any_3],		estimate:lng) :bat[:any_1,:any_3] address ALGouterjoinestimate;@- Theta Join@malcommand thetajoin( left:bat[:any_1,:any_2], right:bat[:any_2,:any_3],		opname:int) :bat[:any_1,:any_3] address ALGthetajoincomment "Theta join on for 'mode' in { LE, LT, EQ, GT, GE }.  JOIN_EQ is 		just the same as join(). All other options do merge algorithms. 		Either using the fact that they are ordered() already (left on tail, 	right on head), or by using/creating binary search trees on the 		join columns. ";command thetajoin( left:bat[:any_1,:any_2], right:bat[:any_2,:any_3],		opname:int,estimate:lng) :bat[:any_1,:any_3] address ALGthetajoinEstimate;@- Band Join (approximate match)@malcommand bandjoin( outer:bat[:any_1,:any_2], inner:bat[:any_2,:any_3],		   minus:any_2 , plus:any_2 ) :bat[:any_1,:any_3] address ALGbandjoincomment "This is a join() for which the predicate is that two BUNs match 		if the left-tail value is within the range [right-head - minus, 		right-head + plus]. Works only for the builtin numerical types, 		and their derivates.";@+ Projection operations@malcommand project(b:bat[:any_1,:any_2]) :bat[:any_1,:oid]address ALGprojectNILcomment "Extract the head of a BAT.";@= projectGrpcommand project(v:@1,b:bat[:any_2,:any_1]) :bat[:@1,:any_1] address ALGprojecthead_@1comment "Fill the head with a constant, e.g. [0~b]";command project(b:bat[:any_2,:any_1],v:@1) :bat[:any_2,:@1] address ALGprojecttail_@1comment "Fill the tail with a constant, e.g. [0~b]";@mal	@:projectGrp(bit)@	@:projectGrp(chr)@	@:projectGrp(str)@	@:projectGrp(oid)@	@:projectGrp(int)@	@:projectGrp(sht)@	@:projectGrp(lng)@	@:projectGrp(flt)@	@:projectGrp(dbl)@@+ OID Introducing CommandsFor relational processing, some operators are necessary to produce newlyinitiated OID columns, for representing n-ary (intermediary) relations.@malcommand markT( b:bat[:any_1,:any_2], base:oid ) :bat[:any_1,:oid] address ALGtmarkcomment "Produces a BAT with fresh unique dense sequense of OIDs in 		the tail that starts at base (i.e. [base,..base+b.count()-1] ).";command markT( b:bat[:any_1,:any_2] ) :bat[:any_1,:oid] address ALGtmark_defaultcomment "Produces a BAT with fresh unique OIDs in the tail starting at 0@0.";command markH( b:bat[:any_1,:any_2] ) :bat[:oid,:any_2] address ALGmarkHead_defaultcomment "Produces a BAT with fresh OIDs in the head starting at 0@0.";command markH( b:bat[:any_1,:any_2], base:oid ) :bat[:oid,:any_2] address ALGmarkHeadcomment "Produces a new BAT with fresh unique dense sequense of OIDs in 		the head that starts at base (i.e. [base,..base+b.count()-1] ).";command number( b:bat[:any_1,:any] ) :bat[:any_1,:int]  address ALGnumbercomment "Produces a new BAT with identical head column, and consecutively 		increasing integers (start at 0) in the tail column.";command merge(b:bat[:oid,:oid]):bat[:lng,:oid]address ALGmergecomment "Merge head and tail into a single value";command split(b:bat[:lng,:oid]):bat[:oid,:oid]address ALGsplitcomment "Split head into two values";@+ BAT fragmentation commandsVarious operations for splitting BATs into useful fragments.@- Variable managementIt is sometimes needed to cast a type at runtime@malcommand materialize(b:bat[:oid,:any_1]):bat[:oid,:any_1]address ALGmaterializecomment "Materialize the void column";command reuse(b:bat[:any_1,:any_2]):bat[:any_1,:any_2]address ALGreusecomment "Reuse a temporary BAT if you can. Otherwise,	allocate enough storage to accept result of an 	operation (not involving the heap)";@- Hash SplitThe commands below is temporarilly postponed, becausewe don;t handle nested bats@malcommand hashsplit( b:bat[:any_1,:any_2], buckects:int ) 		:bat[:int,:bat] address ALGhashsplitcomment "Split a BAT on tail column according (hash-value MOD buckets). 		Returns a recursive BAT, containing the fragments in the tail, 		their bucket number in the head.";command uhashsplit ( b:bat[:any_1,:any_2], buckets:int ) :bat[:int,:bat] address ALGuhashsplitcomment "Same as hashsplit, but only collect the head values in the fragments";@- Range Split@milcommand rangesplit ( b:bat[:any_1,:any_2], ranges:int ) 		:bat[:any_2,bat[:any_1,:any_2]] address ALGrangesplitcomment "Split a BAT on tail column in 'ranges' equally sized consecutive 		ranges. Returns a recursive BAT, containing the fragments in the tail, 		the higher-bound of the range in the head. The higher bound of the last 	range is 'nil'.";command urangesplit( b:bat[:any_1,:any_2], ranges:int ) 		:bat[:any_2,bat[:any_1,:oid]] address ALGurangesplitcomment "Same as rangesplit, but only collect the head values in the fragments" ;@+ Common BAT AggregatesThese operations examine a BAT, and compute some simple aggregate resultover it.@- BAT size@malmodule aggr;command count( b:bat[:any_1,:any] ) :int address ALGcount_batcomment "Return the current size (in number of elements) in a BAT.";command count ( b:bat[:any_1,:any], ignore_nils:bit ) :int address ALGcount_nilcomment "Return the number of elements currently in a BAT ignores 		BUNs with nil-tail iff ignore_nils==TRUE.";command count_no_nil ( b:bat[:any_1,:any_2]) :intaddress ALGcount_no_nilcomment "Return the number of elements currently 	in a BAT ignoring BUNs with nil-tail";@- Histogram on Tail@malcommand histogram ( b:bat[:any_1,:any_2]) :bat[:any_2,:int] address ALGhistogramcomment "Produce a BAT containing the histogram over the tail values.";@- Default Min and MaxImplementations a generic Min and Max routines get declared first. The@emph{ min()} and @emph{ max()} routines below catch any tail-type.The type-specific routines defined later are faster, and willoverride these any implementations.@malcommand cardinality( b:bat[:any_1,:any_2] ) :lng address ALGcardcomment "Return the cardinality of the BAT tail values.";@- Implementations a generic Min and Max routines get declared first. The@emph{ min()} and @emph{ max()} routines below catch any tail-type.The type-specific routines defined later are faster, and willoverride these any implementations.@- @malcommand min(b:bat[:any_1,:any_2]):any_2 address ALGminanycomment "Return the lowest tail value or nil.";command max(b:bat[:any_1,:any_2]):any_2 address ALGmaxanycomment "Return the highest tail value or nil.";@+ Type-Specific Sum, Prod, Max and MinFor X in @{ sht,int,flt,dbl,lng @},  we generate theaggregate functions using a macro.@-@= sum_definitioncommand sum (b:bat[:any_1,:@1] ) :@2 address ALGsum_@1_@2comment "Gives the sum of all tail values";command product(b:bat[:any_1,:@1] ) :@2 address ALGprod_@1_@2comment "Gives the product of all tail values";@mal@:sum_definition(sht,sht)@@:sum_definition(sht,int)@@:sum_definition(sht,lng)@@:sum_definition(int,int)@@:sum_definition(int,lng)@@:sum_definition(lng,lng)@@:sum_definition(flt,flt)@@:sum_definition(flt,dbl)@@:sum_definition(dbl,dbl)@@= avg_definitioncommand avg (b:bat[:any_1,:@1] ) :dbl address ALGavg_@1comment "Gives the avg of all tail values";@mal@:avg_definition(sht)@@:avg_definition(int)@@:avg_definition(lng)@@:avg_definition(flt)@@:avg_definition(dbl)@@-@= aggregate_definitioncommand @1 ( b:bat[:any_1,:@2] ) :@2 address ALG@1_@2 comment @3;@= aggregate@:aggregate_definition(@1,sht,@2)@@:aggregate_definition(@1,int,@2)@@:aggregate_definition(@1,flt,@2)@@:aggregate_definition(@1,dbl,@2)@@:aggregate_definition(@1,lng,@2)@@mal@:aggregate(max,"Give the highest tail value.")@@:aggregate(min,"Give the lowest tail value. ")@@+ Exented selection predicatesFor SQL convenience we provide a serie of interval selectors.@malmodule algebra;@+ Modeling With PropertiesThe Monet kernel performs run-time optimizations. To choose betweenalternative algorithms in a sensible way, it maintains knowledge abouteach BAT, sometimes as a BAT property, sometimes as twocolumn properties for each column (head and tail)of a BAT. An example of the former is size(bat):int(which gives the number of BUNs in a BAT), an exampleof the latter is ordered(column) :bit, indicatingwhether the column contains its valued stored in ascending order.The convention is to use a BAT as operand also for the columnproperties; which then is supposed to be valid for the headcolumn (ordered(BAT)). Tail columns can be described byusing the mirror BAT with the minus operator (ordered(-BAT)).@table @code@item[ordered(BAT) :bit]	TRUE if the head column is stored in ascending order, else FALSE.@item[keyed(BAT) :bit]	TRUE if no duplicates are present in the head column, else FALSE.@item[idx(BAT) :bit]	TRUE if a binary index tree search accelerator is present on	the head column of the BAT, else FALSE.@item[hashtab(BAT) :bit] 		presence of hash table on the head column of		a BAT. TRUE if a bucket-chained hash table search accelerator is		present on the head column of the BAT, else FALSE.@item[subcol(BAT, BAT) :bit]	TRUE if the bag of all values in the head column of the left BAT is	a bag-subset of the bag of all values in the head column of the	right BAT, else FALSE.@item[sync(BAT) :oid]	Sync-OID on the head column of a BAT. A sync-OID denotes some unique	sequence of values. If two columns have the same sync-OID, then they	are guaranteed to contain the same values, in the same sequence.@item[size(BAT) :int]	The (estimated) length of a column.@item[unique(BAT) :int]	The (estimated) number of distinct values in one column.@item[subset(BAT, BAT) :bit]	TRUE if the left BAT is a subset of the BUNs of the right BAT,	else FALSE.@item[setunique(BAT) :bit]	TRUE if the BAT contains no duplicate BUNs, else FALSE.@end table@- Property Propagation RulesAt database creation time, the properties of the BATs in the databasecan be derived directly from the database schema.When queries are executed, they will produce @emph{intermediate results},which in terms are operands for further execution. Hence it is necessaryto @emph{propagate properties} from the operands of an algebraic operator,to its result.This process can be captured by having a series of @emph{propagation rules}for each algebraic operand. Since each algebraic operands may applydifferent strategies, according to different status in its operand properties,each algebraic operator may have different propagation rules with thesedifferent situations as conditions.@include kprelude.mx@h#ifndef ALGEBRA_H#define ALGEBRA_H#include <gdk.h>#include "mal_exception.h"#ifdef WIN32#ifndef LIBALGEBRA#define algebra_export extern __declspec(dllimport)#else#define algebra_export extern __declspec(dllexport)#endif#else#define algebra_export extern#endifalgebra_export ptr BATmax(BAT *b, ptr aggr);algebra_export ptr BATmin(BAT *b, ptr aggr);algebra_export int doCMDfetch(ptr ret, BAT *b, lng i);algebra_export str ALGprojectNIL(int *ret, int *bid);@= avg_exportalgebra_export str ALGavg_@1(dbl *res, int *bid);@h@:avg_export(sht)@@:avg_export(int)@@:avg_export(lng)@@:avg_export(flt)@@:avg_export(dbl)@#endif@c#include "mal_config.h"#include "algebra.h"@= BATaggrptr BAT@1(BAT *b, ptr aggr) {		int t;		size_t s;		ptr v, x;		BATcheck(b, "BAT@1");		s = BATcount(b);		t = b->ttype;		if (BATtvoid(b)) {			@:aggr@1(chr,loc,simple,chr,void)@		} else {			switch(ATOMstorage(t)) {			case TYPE_chr: @:aggr@1(chr,loc,simple,chr,atom)@ break;			case TYPE_sht: @:aggr@1(sht,loc,simple,sht,atom)@ break;			case TYPE_int: @:aggr@1(int,loc,simple,int,atom)@ break;			case TYPE_flt: @:aggr@1(flt,loc,simple,flt,atom)@ break;			case TYPE_dbl: @:aggr@1(dbl,loc,simple,dbl,atom)@ break;			case TYPE_lng: @:aggr@1(lng,loc,simple,lng,atom)@ break;			default:if (b->tvarsized) {						   @:aggr@1(chr,var,atom,t,atom)@ break;					 } else {						   @:aggr@1(chr,loc,atom,t,atom)@ break;			}		}		}		return x;}@c@:BATaggr(min)@@:BATaggr(max)@@* Command Implementations in CThis module contains just a wrapper implementations; since all describedoperations are part of the GDK kernel.@+ BAT sum operationThe sum aggregate only works for int and float fields.The routines below assumes that the caller knows what typeis large enough to prevent overflow.@= sum_implementationintCMDsum_@1_@2(@2* res, BAT *b){	BUN p,q;	int xx;	@2 result=@3;	BATcheck(b,"BATsum_@1_@2");	BATloopFast(b, p, q, xx) {		@1 *value = (@1*) BUNtloc(b, p);		if (*value == @1_nil) {			result = @2_nil;			break;		} else {			result += *value;		}	}	*res = result;	return GDK_SUCCEED;}@+ BAT prod[uct] operationThe prod[uct] aggregate only works for int and float fields.The routines below assumes that the caller knows what typeis large enough to prevent overflow.@= prod_implementationintCMDprod_@1_@2(@2* res, BAT *b){	BUN p,q;	int xx;	@2 result=@3;	BATcheck(b,"BATprodInt");	BATloopFast(b, p, q, xx) {		@1 *value = (@1*) BUNtloc(b, p);		if (*value == @1_nil) {			result = @2_nil;			break;		} else {			result *= *value;		}	}	*res = result;	return GDK_SUCCEED;}@+ BAT avg operationThe avg aggregate only works for int and float fields.@= avg_implementationintCMDavg_@1(dbl* res, BAT *b){	int cnt = BATcount(b);	dbl result=0.0;	CMDsum_@1_dbl(&result, b);	*res = result/cnt;	return GDK_SUCCEED;}@+ Minimum and MaximumThe routines @`BATmin@5(b) and @`BATmax@5(b) compute the minimum andmaximum value of the tail column of a BAT.Aggregate values are calculated just before they are requested bythe user. They are not maintained continuously, because we expectthem to be used sparsely.@= aggregate_implementation

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -