📄 rules.sgml
字号:
<Para> There are only a few differences between a parsetree for a SELECT and one for any other command. Obviously they have another commandtype and this time the resultrelation points to the rangetable entry where the result should go. Anything else is absolutely the same. So having two tables t1 and t2 with attributes a and b, the parsetrees for the two statements<ProgramListing> SELECT t2.b FROM t1, t2 WHERE t1.a = t2.a; UPDATE t1 SET b = t2.b WHERE t1.a = t2.a;</ProgramListing> are nearly identical. <ItemizedList> <ListItem> <Para> The rangetables contain entries for the tables t1 and t2. </Para> </ListItem> <ListItem> <Para> The targetlists contain one variable that points to attribute b of the rangetable entry for table t2. </Para> </ListItem> <ListItem> <Para> The qualification expressions compare the attributes a of both ranges for equality. </Para> </ListItem> </ItemizedList> The consequence is, that both parsetrees result in similar execution plans. They are both joins over the two tables. For the UPDATE the missing columns from t1 are added to the targetlist by the optimizer and the final parsetree will read as<ProgramListing> UPDATE t1 SET a = t1.a, b = t2.b WHERE t1.a = t2.a;</ProgramListing> and thus the executor run over the join will produce exactly the same result set as a<ProgramListing> SELECT t1.a, t2.b FROM t1, t2 WHERE t1.a = t2.a;</ProgramListing> will do. But there is a little problem in UPDATE. The executor does not care what the results from the join it is doing are meant for. It just produces a result set of rows. The difference that one is a SELECT command and the other is an UPDATE is handled in the caller of the executor. The caller still knows (looking at the parsetree) that this is an UPDATE, and he knows that this result should go into table t1. But which of the 666 rows that are there has to be replaced by the new row? The plan executed is a join with a qualification that potentially could produce any number of rows between 0 and 666 in unknown order.</Para><Para> To resolve this problem, another entry is added to the targetlist in UPDATE and DELETE statements. The current tuple ID (ctid). This is a system attribute with a special feature. It contains the block and position in the block for the row. Knowing the table, the ctid can be used to find one specific row in a 1.5GB sized table containing millions of rows by fetching one single data block. After adding the ctid to the targetlist, the final result set could be defined as<ProgramListing> SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;</ProgramListing> Now another detail of <ProductName>Postgres</ProductName> enters the stage. At this moment, table rows aren't overwritten and this is why ABORT TRANSACTION is fast. In an UPDATE, the new result row is inserted into the table (after stripping ctid) and in the tuple header of the row that ctid pointed to the cmax and xmax entries are set to the current command counter and current transaction ID. Thus the old row is hidden and after the transaction commited the vacuum cleaner can really move it out.</Para><Para> Knowing that all, we can simply apply view rules in absolutely the same way to any command. There is no difference.</Para></Sect2><Sect2><Title>The Power of Views in <ProductName>Postgres</ProductName></Title><Para> The above demonstrates how the rule system incorporates view definitions into the original parsetree. In the second example a simple SELECT from one view created a final parsetree that is a join of 4 tables (unit is used twice with different names).</Para><Sect3><Title>Benefits</Title><Para> The benefit of implementing views with the rule system is, that the optimizer has all the information about which tables have to be scanned plus the relationships between these tables plus the restrictive qualifications from the views plus the qualifications from the original query in one single parsetree. And this is still the situation when the original query is already a join over views. Now the optimizer has to decide which is the best path to execute the query. The more information the optimizer has, the better this decision can be. And the rule system as implemented in <ProductName>Postgres</ProductName> ensures, that this is all information available about the query up to now.</Para></Sect3><Sect3><Title>Concerns</Title><Para> There was a long time where the <ProductName>Postgres</ProductName> rule system was considered broken. The use of rules was not recommended and the only part working where view rules. And also these view rules made problems because the rule system wasn't able to apply them properly on other statements than a SELECT (for example an UPDATE that used data from a view didn't work).</Para><Para> During that time, development moved on and many features where added to the parser and optimizer. The rule system got more and more out of sync with their capabilities and it became harder and harder to start fixing it. Thus, noone did.</Para><Para> For 6.4, someone locked the door, took a deep breath and shuffled that damned thing up. What came out was a rule system with the capabilities described in this document. But there are still some constructs not handled and some where it fails due to things that are currently not supported by the <ProductName>Postgres</ProductName> query optimizer. <ItemizedList> <ListItem> <Para> Views with aggregate columns have bad problems. Aggregate expressions in qualifications must be used in subselects. Currently it is not possible to do a join of two views, each having an aggregate column, and compare the two aggregate values in the qualification. In the meantime it is possible to put these aggregate expressions into functions with the appropriate arguments and use them in the view definition. </Para> </ListItem> <ListItem> <Para> Views of unions are currently not supported. Well it's easy to rewrite a simple SELECT into a union. But it is a little difficult if the view is part of a join doing an update. </Para> </ListItem> <ListItem> <Para> ORDER BY clauses in view definitions aren't supported. </Para> </ListItem> <ListItem> <Para> DISTINCT isn't supported in view definitions. </Para> </ListItem> </ItemizedList> There is no good reason why the optimizer should not handle parsetree constructs that the parser could never produce due to limitations in the <Acronym>SQL</Acronym> syntax. The author hopes that these items disappear in the future.</Para></Sect3></Sect2><Sect2><Title>Implementation Side Effects</Title><Para> Using the described rule system to implement views has a funny side effect. The following does not seem to work:<ProgramListing> al_bundy=> INSERT INTO shoe (shoename, sh_avail, slcolor) al_bundy-> VALUES ('sh5', 0, 'black'); INSERT 20128 1 al_bundy=> SELECT shoename, sh_avail, slcolor FROM shoe_data; shoename |sh_avail|slcolor ----------+--------+---------- sh1 | 2|black sh3 | 4|brown sh2 | 0|black sh4 | 3|brown (4 rows)</ProgramListing> The interesting thing is that the return code for INSERT gave us an object ID and told that 1 row has been inserted. But it doesn't appear in <Filename>shoe_data</Filename>. Looking into the database directory we can see, that the database file for the view relation <Filename>shoe</Filename> seems now to have a data block. And that is definitely the case.</Para><Para> We can also issue a DELETE and if it does not have a qualification, it tells us that rows have been deleted and the next vacuum run will reset the file to zero size.</Para><Para> The reason for that behaviour is, that the parsetree for the INSERT does not reference the <Filename>shoe</Filename> relation in any variable. The targetlist contains only constant values. So there is no rule to apply and it goes down unchanged into execution and the row is inserted. And so for the DELETE.</Para><Para> To change this we can define rules that modify the behaviour of non-SELECT queries. This is the topic of the next section.</Para></Sect2></Sect1><Sect1><Title>Rules on INSERT, UPDATE and DELETE</Title><Sect2><Title>Differences to View Rules</Title><Para> Rules that are defined ON INSERT, UPDATE and DELETE are totally different from the view rules described in the previous section. First, their CREATE RULE command allows more: <ItemizedList> <ListItem> <Para> They can have no action. </Para> </ListItem> <ListItem> <Para> They can have multiple actions. </Para> </ListItem> <ListItem> <Para> The keyword INSTEAD is optional. </Para> </ListItem> <ListItem> <Para> The pseudo relations NEW and OLD become useful. </Para> </ListItem> <ListItem> <Para> They can have rule qualifications. </Para> </ListItem> </ItemizedList> Second, they don't modify the parsetree in place. Instead they create zero or many new parsetrees and can throw away the original one.</Para></sect2><Sect2><Title>How These Rules Work</Title><Para> Keep the syntax<ProgramListing> CREATE RULE rule_name AS ON event TO object [WHERE rule_qualification] DO [INSTEAD] [action | (actions) | NOTHING];</ProgramListing> in mind. In the following, "update rules" means rules that are defined ON INSERT, UPDATE or DELETE.</Para><Para> Update rules get applied by the rule system when the result relation and the commandtype of a parsetree are equal to the object and event given in the CREATE RULE command. For update rules, the rule system creates a list of parsetrees. Initially the parsetree list is empty. There can be zero (NOTHING keyword), one or multiple actions. To simplify, we look at a rule with one action. This rule can have a qualification or not and it can be INSTEAD or not.</Para><Para> What is a rule qualification? It is a restriction that tells when the actions of the rule should be done and when not. This qualification can only reference the NEW and/or OLD pseudo relations which are basically the relation given as object (but with a special meaning).</Para><Para> So we have four cases that produce the following parsetrees for a one-action rule.</Para><Para> <ItemizedList> <ListItem> <Para> No qualification and not INSTEAD: <ItemizedList> <ListItem> <Para> The parsetree from the rule action where the original parsetrees qualification has been added. </Para> </ListItem> </ItemizedList> </Para> </ListItem> <ListItem> <Para> No qualification but INSTEAD:
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -