⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 arch-dev.sgml

📁 关系型数据库 Postgresql 6.5.2
💻 SGML
📖 第 1 页 / 共 5 页
字号:
    <para>     Note that every <literal>Sort</literal> and <literal>SeqScan</literal> node appearing in figure     \ref{plan} has got a <literal>targetlist</literal> but because there was not enough space     only the one for the <literal>MergeJoin</literal> node could be drawn.    </para>    <para>     Another task performed by the planner/optimizer is fixing the     <firstterm>operator ids</firstterm> in the <literal>Expr</literal>     and <literal>Oper</literal> nodes. As     mentioned earlier, <productname>Postgres</productname> supports a variety of different data     types and even user defined types can be used. To be able to maintain     the huge amount of functions and operators it is necessary to store     them in a system table. Each function and operator gets a unique     operator id. According to the types of the attributes used     within the qualifications etc., the appropriate operator ids     have to be used.    </para>   </sect2>  </sect1>  <sect1>   <title>Executor</title>   <para>    The <firstterm>executor</firstterm> takes the plan handed back by the    planner/optimizer and starts processing the top node. In the case of    our example (the query given in example \ref{simple_select}) the top    node is a <literal>MergeJoin</literal> node.    </para>   <para>    Before any merge can be done two tuples have to be fetched (one from    each subplan). So the executor recursively calls itself to    process the subplans (it starts with the subplan attached to    <literal>lefttree</literal>). The new top node (the top node of the left subplan) is a    <literal>SeqScan</literal> node and again a tuple has to be fetched before the node    itself can be processed. The executor calls itself recursively    another time for the subplan attached to <literal>lefttree</literal> of the    <literal>SeqScan</literal> node.   </para>   <para>    Now the new top node is a <literal>Sort</literal> node. As a sort has to be done on    the whole relation, the executor starts fetching tuples    from the <literal>Sort</literal> node's subplan and sorts them into a temporary    relation (in memory or a file) when the <literal>Sort</literal> node is visited for    the first time. (Further examinations of the <literal>Sort</literal> node will    always return just one tuple from the sorted temporary    relation.)   </para>   <para>    Every time the processing of the <literal>Sort</literal> node needs a new tuple the    executor is recursively called for the <literal>SeqScan</literal> node    attached as subplan. The relation (internally referenced by the    value given in the <literal>scanrelid</literal> field) is scanned for the next    tuple. If the tuple satisfies the qualification given by the tree    attached to <literal>qpqual</literal> it is handed back, otherwise the next tuple    is fetched until the qualification is satisfied. If the last tuple of    the relation has been processed a <literal>NULL</literal> pointer is    returned.   </para>   <para>    After a tuple has been handed back by the <literal>lefttree</literal> of the    <literal>MergeJoin</literal> the <literal>righttree</literal> is processed in the same way. If both    tuples are present the executor processes the <literal>MergeJoin</literal>    node. Whenever a new tuple from one of the subplans is needed a    recursive call to the executor is performed to obtain it. If a    joined tuple could be created it is handed back and one complete    processing of the plan tree has finished.    </para>   <para>    Now the described steps are performed once for every tuple, until a    <literal>NULL</literal> pointer is returned for the processing of the    <literal>MergeJoin</literal> node, indicating that we are finished.   </para>  </sect1><!--**************************************************************************************************************************************************************************************************************************************************************************************************\pagebreak\clearpage\section{The Realization of the Having Clause}\label{having_impl}The {\it having clause} has been designed in SQL to be able to use theresults of {\it aggregate functions} within a query qualification. Thehandling of the {\it having clause} is very similar to the handling ofthe {\it where clause}. Both are formulas in first order logic (FOL)that have to evaluate to true for a certain object to be handed back:%\begin{itemize}<step> The formula given in the {\it where clause} is evaluated forevery tuple. If the evaluation returns {\tt true} the tuple isreturned, every tuple not satisfying the qualification is ignored.%<step> In the case of {\it groups} the {\it having clause} is evaluatedfor every group. If the evaluation returns {\tt true} the group istaken into account otherwise it is ignored.\end{itemize}%\subsection{How Aggregate Functions are Implemented}\label{aggregates}Before we can describe how the {\it having clause} is implemented wewill have a look at the implementation of {\it aggregate functions} aslong as they just appear in the {\it targetlist}. Note that {\it aggregatefunctions} are applied to groups so the query must contain a {\itgroup clause}.%\begin{example} \label{having}Here is an example of the usage of the {\it aggregatefunction} {\tt count} which counts the number of part numbers ({\ttpno}) of every group. (The table {\tt sells} is defined in example\ref{supplier}.)%\begin{verbatim}  select sno, count(pno)  from sells  group by sno;\end{verbatim}%\end{example}%A query like the one in example \ref{having} is processed by the usualstages:%\begin{itemize}<step> the parser stage<step> the rewrite system<step> the planner/optimizer <step> the executor\end{itemize}%and in the following sections we will describe what every stage doesto the query in order to obtain the appropriate result.\subsubsection{The Parser Stage}\begin{figure}[ht]\begin{center}\epsfig{figure=figures/parse_having.ps}\caption{{\it Querytree} built up for the query of example \ref{having}}\label{parse_having}\end{center}\end{figure}The parser stage builds up a {\it querytree} containing the {\itwhere} qualification and information about the {\it grouping} that hasto be done (i.e. a list of all attributes to group for is attached tothe field {\tt groupClause}). The main difference to {\it querytrees}built up for queries without {\it aggregate functions} is given in thefield {\tt hasAggs} which is set to {\tt true} and in the {\ittargetlist}. The {\tt expr} field of the second {\tt TLE} node of the{\it targetlist} shown in figure \ref{parse_having} does not pointdirectly to a {\tt VAR} node but to an {\tt Aggreg} node representingthe {\it aggregate function} used in the query.A check is made that every attribute grouped for appears only withoutan {\it aggregate function} in the {\it targetlist} and that everyattribute which appears without an {\it aggregate function} in the{\it targetlist} is grouped for.%\pagebreak \subsubsection{The Rewrite System}The rewriting system does not make any changes to the {\it querytree}as long as the query involves just {\it base tables}. If any {\itviews} are present the query is rewritten to access the tablesspecified in the {\it view definition}.%\subsubsection{Planner/Optimizer}Whenever an {\it aggregate function} is involved in a query (which isindicated by the {\tt hasAggs} flag set to {\tt true}) the plannercreates a {\it plantree} whose top node is an {\tt AGG} node. The {\ittargetlist} is searched for {\it aggregate functions} and for everyfunction that is found, a pointer to the corresponding {\tt Aggreg}node is added to a list which is finally attached to the field {\tt aggs} ofthe {\tt AGG} node. This list is needed by the {\it executor} to know which{\it aggregate functions} are present and have to behandled.The {\tt AGG} node is followed by a {\tt GRP} node. The implementationof the {\it grouping} logic needs a sorted table for its operation sothe {\tt GRP} node is followed by a {\tt SORT} node. The {\tt SORT}operation gets its tuples from a kind of {\tt Scan} node (if noindices are present this will be a simple {\tt SeqScan} node). Anyqualifications present are attached to the {\tt Scan} node. Figure\ref{plan_having} shows the {\it plan} created for the query given inexample \ref{having}.\begin{figure}[ht]\begin{center}\epsfig{figure=figures/plan_having.ps}\caption{{\it Plantree} for the query of example \ref{having}}\label{plan_having}\end{center}\end{figure}Note that every node has its own {\it targetlist} which may differ from theone of the node above or below.  The field {\tt varattno} of every{\tt VAR} node included in a {\it targetlist} contains a number representingthe position of the attribute's value in the tuple of the current node.\pagebreak\clearpage\subsubsection{Executor}The {\it executor} uses the function {\tt execAgg()} to execute {\tt AGG}nodes. As described earlier it uses one main function {\ttExecProcNode} which is called recursively to execute subtrees. Thefollowing steps are performed by {\tt execAgg()}:%\begin{itemize}%<step> The list attached to the field {\tt aggs} of the {\tt AGG} node isexamined and for every {\it aggregate function} included the {\it transitionfunctions} are fetched from a {\it function table}. Calculating thevalue of an {\it aggregate function} is done using three functions:%\begin{itemize}<step> The {\it first transition function} {\tt xfn1} is called with thecurrent value of the attribute the {\it aggregate function} is appliedto and changes its internal state using the attribute's value given asan argument.%<step> The {\it second transition function} {\tt xfn2} is called without anyarguments and changes its internal state only according to internal rules.%<step> The {\it final function} {\tt finalfn} takes the final states of {\ttxfn1} and {\tt xfn2} as arguments and finishes the {\it aggregation}.\end{itemize}%\begin{example} Recall the functions necessary to implement the{\it aggregate function} {\tt avg} building the average over allvalues of an attribute in a group (see section \ref{ext_agg} {\itExtending Aggregates}):%\begin{itemize}%<step> The first transition function {\tt xfn1} has to be a function thattakes the actual value of the attribute {\tt avg} is applied to as anargument and adds it to the internally stored sum of previouscalls.%<step> The second transition function {\tt xfn2} only increases an internalcounter every time it is called. %<step> The final function {\tt finalfn} divides the result of {\tt xfn1} bythe counter of {\tt xfn2} to calculate the average.%\end{itemize}%\end{example}%Note that {\tt xfn2} and {\tt finalfn} may be absent (e.g. for the{\it aggregate function} {\tt sum} which simply sums up all values ofthe given attribute within a group. \\\\{\tt execAgg()} creates an array containing one entry for every {\itaggregate function} found in the list attached to the field {\ttaggs}. The array will hold information needed for the execution ofevery {\it aggregate function} (including the {\it transition functions}described above).%<step> The following steps are executed in a loop as long as there arestill tuples returned by the subplan (i.e. as long as there are stilltuples left in the current group). When there are no tuples leftin the group a {\tt NULL} pointer is returned indicating the end of thegroup.%\begin{itemize}<step> A new tuple from the subplan (i.e. the {\it plan} attached to thefield {\tt lefttree}) is fetched by recursively calling {\ttExecProcNode()} with the subplan as argument.%<step> For every {\it aggregate function} (contained in the array createdbefore) apply the transition functions {\tt xfn1} and {\tt xfn2} tothe values of the appropriate attributes of the current tuple.\end{itemize}%<step> When we get here, all tuples of the current group have beenprocessed and the {\it transition functions} of all {\it aggregatefunctions} have been applied to the values of the attributes. We arenow ready to complete the {\it aggregation} by applying the {\it finalfunction} ({\tt finalfn}) for every {\it aggregate function}.%<step> Store the tuple containing the new values (the results of the{\it aggregate functions}) and hand it back.\end{itemize}%Note that the procedure described above only returns one tuple(i.e. it processes just one group and when the end of the group isdetected it processes the {\it aggregate functions} and hands back onetuple). To retrieve all tuples (i.e. to process all groups) thefunction {\tt execAgg()} has to be called (returning a new tuple everytime) until it returns a {\tt NULL} pointer indicating that there areno groups left to process.\subsection{How the Having Clause is Implemented}The basic idea of the implementation is to attach the {\it operator tree}built for the {\it having clause} to the field {\tt qpqual} ofnode {\tt AGG} (which is the top node of the query tree). Now the executorhas to evaluate the new {\it operator tree} attached to {\tt qpqual} forevery group processed. If the evaluation returns {\tt true} the groupis taken into account otherwise it is ignored and the next group willbe examined. \\\\In order to implement the {\it having clause} a variety of changeshave been made to the following stages:%\begin{itemize}<step> The {\it parser stage} has been modified slightly to build up andtransform an {\it operator tree} for the {\it having clause}.%<step> The {\it rewrite system} has been adapted to be able to use {\itviews} with the {\it having logic}.%<step> The {\it planner/optimizer}  now takes the {\it operator tree} ofthe {\it having clause} and attaches it to the {\tt AGG} node (whichis the top node of the {\it queryplan}).%<step> The {\it executor} has been modified to evaluate the {\it operator tree}(i.e. the internal representation of the {\it havingqualification}) attached to the {\tt AGG} node and the results of the{\it aggregation} are only considered if the evaluation returns {\tt true}.\end{itemize}%In the following sections we will describe the changes made to every

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -