📄 arch-dev.sgml
字号:
\end{figure}--> </sect2> <sect2> <title>Transformation Process</title> <para> The <firstterm>transformation process</firstterm> takes the tree handed back by the parser as input and steps recursively through it. If a <literal>SelectStmt</literal> node is found, it is transformed to a <literal>Query</literal> node which will be the top most node of the new data structure. Figure \ref{transformed} shows the transformed data structure (the part for the transformed <firstterm>where clause</firstterm> is given in figure \ref{transformed_where} because there was not enough space to show all parts in one figure). </para> <para> Now a check is made, if the <firstterm>relation names</firstterm> in the <firstterm>FROM clause</firstterm> are known to the system. For every relation name that is present in the <firstterm>system catalogs</firstterm> a <abbrev>RTE</abbrev> node is created containing the relation name, the <firstterm>alias name</firstterm> and the <firstterm>relation id</firstterm>. From now on the relation ids are used to refer to the <firstterm>relations</firstterm> given in the query. All <abbrev>RTE</abbrev> nodes are collected in the <firstterm>range table entry list</firstterm> which is connected to the field <literal>rtable</literal> of the <literal>Query</literal> node. If a name of a relation that is not known to the system is detected in the query an error will be returned and the query processing will be aborted. </para> <para> Next it is checked if the <firstterm>attribute names</firstterm> used are contained in the relations given in the query. For every attribute} that is found a <abbrev>TLE</abbrev> node is created holding a pointer to a <literal>Resdom</literal> node (which holds the name of the column) and a pointer to a <literal>VAR</literal> node. There are two important numbers in the <literal>VAR</literal> node. The field <literal>varno</literal> gives the position of the relation containing the current attribute} in the range table entry list created above. The field <literal>varattno</literal> gives the position of the attribute within the relation. If the name of an attribute cannot be found an error will be returned and the query processing will be aborted. </para><!--\begin{figure}[ht]\begin{center}\epsfig{figure=figures/transform.ps}\caption{Transformed {\it TargetList} and {\it FromList} for query of example \ref{simple_select}}\label{transformed}\end{center}\end{figure}\noindent Figure \ref{transformed_where} shows the transformed {\it whereclause}. Every {\tt A\_Expr} node is transformed to an {\tt Expr}node. The {\tt Attr} nodes representing the attributes are replaced by{\tt VAR} nodes as it has been done for the {\it targetlist}above. Checks if the appearing {\it attributes} are valid and known to thesystem are made. If there is an {\it attribute} used whichis not known an error will be returned and the {\it queryprocessing} will be aborted. \\\\The whole {\it transformation process} performs a transformation ofthe data structure handed back by the {\it parser} to a morecomfortable form. The character strings representing the {\itrelations} and {\it attributes} in the original tree are replaced by{\it relation ids} and {\tt VAR} nodes whose fields are referring tothe entries of the {\it range table entry list}. In addition to thetransformation, also various checks if the used {\it relation}and {\it attribute} names (appearing in the query) are valid in thecurrent context are performed.\begin{figure}[ht]\begin{center}\epsfig{figure=figures/transform_where.ps}\caption{Transformed {\it where clause} for query of example \ref{simple_select}}\label{transformed_where}\end{center}\end{figure}\pagebreak\clearpage\begin{figure}[ht]\begin{center}\epsfig{figure=figures/plan.ps}\caption{{\it Plan} for query of example \ref{simple_select}}\label{plan}\end{center}\end{figure}--> </sect2> </sect1> <sect1> <title>The <productname>Postgres</productname> Rule System</title> <para> <productname>Postgres</productname> supports a powerful <firstterm>rule system</firstterm> for the specification of <firstterm>views</firstterm> and ambiguous <firstterm>view updates</firstterm>. Originally the <productname>Postgres</productname> rule system consisted of two implementations: <itemizedlist> <listitem> <para> The first one worked using <firstterm>tuple level</firstterm> processing and was implemented deep in the <firstterm>executor</firstterm>. The rule system was called whenever an individual tuple had been accessed. This implementation was removed in 1995 when the last official release of the <productname>Postgres</productname> project was transformed into <productname>Postgres95</productname>. </para> </listitem> <listitem> <para> The second implementation of the rule system is a technique called <firstterm>query rewriting</firstterm>. The <firstterm>rewrite system</firstterm>} is a module that exists between the <firstterm>parser stage</firstterm> and the <firstterm>planner/optimizer</firstterm>. This technique is still implemented. </para> </listitem> </itemizedlist> </para> <para> For information on the syntax and creation of rules in the <productname>Postgres</productname> system refer to <citetitle>The PostgreSQL User's Guide</citetitle>. </para> <sect2> <title>The Rewrite System</title> <para> The <firstterm>query rewrite system</firstterm> is a module between the parser stage and the planner/optimizer. It processes the tree handed back by the parser stage (which represents a user query) and if there is a rule present that has to be applied to the query it rewrites the tree to an alternate form. </para> <sect3 id="view-impl"> <title>Techniques To Implement Views</title> <para> Now we will sketch the algorithm of the query rewrite system. For better illustration we show how to implement views using rules as an example. </para> <para> Let the following rule be given: <programlisting> create rule view_rule as on select to test_view do instead select s.sname, p.pname from supplier s, sells se, part p where s.sno = se.sno and p.pno = se.pno; </programlisting> </para> <para> The given rule will be <firstterm>fired</firstterm> whenever a select against the relation <literal>test_view</literal> is detected. Instead of selecting the tuples from <literal>test_view</literal> the select statement given in the <firstterm>action part</firstterm> of the rule is executed. </para> <para> Let the following user-query against <literal>test_view</literal> be given: <programlisting> select sname from test_view where sname <> 'Smith'; </programlisting> </para> <para> Here is a list of the steps performed by the query rewrite system whenever a user-query against <literal>test_view</literal> appears. (The following listing is a very informal description of the algorithm just intended for basic understanding. For a detailed description refer to <xref linkend="STON89-full" endterm="STON89">). </para> <procedure> <title><literal>test_view</literal> Rewrite</title> <step> <para> Take the query given in the action part of the rule. </para> </step> <step> <para> Adapt the targetlist to meet the number and order of attributes given in the user-query. </para> </step> <step> <para> Add the qualification given in the where clause of the user-query to the qualification of the query given in the action part of the rule. </para> </step> </procedure> <para> Given the rule definition above, the user-query will be rewritten to the following form (Note that the rewriting is done on the internal representation of the user-query handed back by the parser stage but the derived new data structure will represent the following query): <programlisting> select s.sname from supplier s, sells se, part p where s.sno = se.sno and p.pno = se.pno and s.sname <> 'Smith'; </programlisting> </para></sect3> </sect2> </sect1> <sect1> <title>Planner/Optimizer</title> <para> The task of the <firstterm>planner/optimizer</firstterm> is to create an optimal execution plan. It first combines all possible ways of <firstterm>scanning</firstterm> and <firstterm>joining</firstterm> the relations that appear in a query. All the created paths lead to the same result and it's the task of the optimizer to estimate the cost of executing each path and find out which one is the cheapest. </para> <sect2> <title>Generating Possible Plans</title> <para> The planner/optimizer decides which plans should be generated based upon the types of indices defined on the relations appearing in a query. There is always the possibility of performing a sequential scan on a relation, so a plan using only sequential scans is always created. Assume an index is defined on a relation (for example a B-tree index) and a query contains the restriction <literal>relation.attribute OPR constant</literal>. If <literal>relation.attribute</literal> happens to match the key of the B-tree index and <literal>OPR</literal> is anything but '<>' another plan is created using the B-tree index to scan the relation. If there are further indices present and the restrictions in the query happen to match a key of an index further plans will be considered. </para> <para> After all feasible plans have been found for scanning single relations, plans for joining relations are created. The planner/optimizer considers only joins between every two relations for which there exists a corresponding join clause (i.e. for which a restriction like <literal>where rel1.attr1=rel2.attr2</literal> exists) in the where qualification. All possible plans are generated for every join pair considered by the planner/optimizer. The three possible join strategies are: <itemizedlist> <listitem> <para> <firstterm>nested iteration join</firstterm>: The right relation is scanned once for every tuple found in the left relation. This strategy is easy to implement but can be very time consuming. </para> </listitem> <listitem> <para> <firstterm>merge sort join</firstterm>: Each relation is sorted on the join attributes before the join starts. Then the two relations are merged together taking into account that both relations are ordered on the join attributes. This kind of join is more attractive because every relation has to be scanned only once. </para> </listitem> <listitem> <para> <firstterm>hash join</firstterm>: the right relation is first hashed on its join attributes. Next the left relation is scanned and the appropriate values of every tuple found are used as hash keys to locate the tuples in the right relation. </para> </listitem> </itemizedlist> </para> </sect2> <sect2> <title>Data Structure of the Plan</title> <para> Here we will give a little description of the nodes appearing in the plan. Figure \ref{plan} shows the plan produced for the query in example \ref{simple_select}. </para> <para> The top node of the plan is a <literal>MergeJoin</literal> node which has two successors, one attached to the field <literal>lefttree</literal> and the second attached to the field <literal>righttree</literal>. Each of the subnodes represents one relation of the join. As mentioned above a merge sort join requires each relation to be sorted. That's why we find a <literal>Sort</literal> node in each subplan. The additional qualification given in the query (<literal>s.sno > 2</literal>) is pushed down as far as possible and is attached to the <literal>qpqual</literal> field of the leaf <literal>SeqScan</literal> node of the corresponding subplan. </para> <para> The list attached to the field <literal>mergeclauses</literal> of the <literal>MergeJoin</literal> node contains information about the join attributes. The values <literal>65000</literal> and <literal>65001</literal> for the <literal>varno</literal> fields in the <literal>VAR</literal> nodes appearing in the <literal>mergeclauses</literal> list (and also in the <literal>targetlist</literal>) mean that not the tuples of the current node should be considered but the tuples of the next "deeper" nodes (i.e. the top nodes of the subplans) should be used instead. </para>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -