📄 sql.sgml
字号:
of JOINs which are otherwise processed left to right. </para> </sect3> <sect3> <title id="aggregates-tutorial">Aggregate Operators</title> <para> <acronym>SQL</acronym> provides aggregate operators (e.g. AVG, COUNT, SUM, MIN, MAX) that take an expression as argument. The expression is evaluated at each row that satisfies the WHERE clause, and the aggregate operator is calculated over this set of input values. Normally, an aggregate delivers a single result for a whole <command>SELECT</command> statement. But if grouping is specified in the query, then a separate calculation is done over the rows of each group, and an aggregate result is delivered per group (see next section). <example> <title id="aggregates-example">Aggregates</title> <para> If we want to know the average cost of all parts in table PART we use the following query: <programlisting>SELECT AVG(PRICE) AS AVG_PRICE FROM PART; </programlisting> </para> <para> The result is: <programlisting> AVG_PRICE----------- 14.5 </programlisting> </para> <para> If we want to know how many parts are defined in table PART we use the statement: <programlisting>SELECT COUNT(PNO) FROM PART; </programlisting> and get: <programlisting> COUNT------- 4 </programlisting> </para> </example> </para> </sect3> <sect3> <title>Aggregation by Groups</title> <para> <acronym>SQL</acronym> allows one to partition the tuples of a table into groups. Then the aggregate operators described above can be applied to the groups — i.e. the value of the aggregate operator is no longer calculated over all the values of the specified column but over all values of a group. Thus the aggregate operator is evaluated separately for every group. </para> <para> The partitioning of the tuples into groups is done by using the keywords <command>GROUP BY</command> followed by a list of attributes that define the groups. If we have <command>GROUP BY A<subscript>1</subscript>, ⃛, A<subscript>k</subscript></command> we partition the relation into groups, such that two tuples are in the same group if and only if they agree on all the attributes A<subscript>1</subscript>, ⃛, A<subscript>k</subscript>. <example> <title id="aggregates-groupby">Aggregates</title> <para> If we want to know how many parts are sold by every supplier we formulate the query: <programlisting>SELECT S.SNO, S.SNAME, COUNT(SE.PNO) FROM SUPPLIER S, SELLS SE WHERE S.SNO = SE.SNO GROUP BY S.SNO, S.SNAME; </programlisting> and get: <programlisting> SNO | SNAME | COUNT-----+-------+------- 1 | Smith | 2 2 | Jones | 1 3 | Adams | 2 4 | Blake | 3 </programlisting> </para> <para> Now let's have a look of what is happening here. First the join of the tables SUPPLIER and SELLS is derived: <programlisting> S.SNO | S.SNAME | SE.PNO-------+---------+-------- 1 | Smith | 1 1 | Smith | 2 2 | Jones | 4 3 | Adams | 1 3 | Adams | 3 4 | Blake | 2 4 | Blake | 3 4 | Blake | 4 </programlisting> </para> <para> Next we partition the tuples into groups by putting all tuples together that agree on both attributes S.SNO and S.SNAME: <programlisting> S.SNO | S.SNAME | SE.PNO-------+---------+-------- 1 | Smith | 1 | 2-------------------------- 2 | Jones | 4-------------------------- 3 | Adams | 1 | 3-------------------------- 4 | Blake | 2 | 3 | 4 </programlisting> </para> <para> In our example we got four groups and now we can apply the aggregate operator COUNT to every group leading to the final result of the query given above. </para> </example> </para> <para> Note that for a query using GROUP BY and aggregate operators to make sense the target list can only refer directly to the attributes being grouped by. Other attributes may only be used inside the argument of an aggregate function. Otherwise there would not be a unique value to associate with the other attributes. </para> <para> Also observe that it makes no sense to ask for an aggregate of an aggregate, e.g., AVG(MAX(sno)), because a <command>SELECT</command> only does one pass of grouping and aggregation. You can get a result of this kind by using a temporary table or a sub-SELECT in the FROM clause to do the first level of aggregation. </para> </sect3> <sect3> <title>Having</title> <para> The HAVING clause works much like the WHERE clause and is used to consider only those groups satisfying the qualification given in the HAVING clause. Essentially, WHERE filters out unwanted input rows before grouping and aggregation are done, whereas HAVING filters out unwanted group rows post-GROUP. Therefore, WHERE cannot refer to the results of aggregate functions. On the other hand, there's no point in writing a HAVING condition that doesn't involve an aggregate function! If your condition doesn't involve aggregates, you might as well write it in WHERE, and thereby avoid the computation of aggregates for groups that you're just going to throw away anyway. <example> <title id="having-example">Having</title> <para> If we want only those suppliers selling more than one part we use the query: <programlisting>SELECT S.SNO, S.SNAME, COUNT(SE.PNO) FROM SUPPLIER S, SELLS SE WHERE S.SNO = SE.SNO GROUP BY S.SNO, S.SNAME HAVING COUNT(SE.PNO) > 1; </programlisting> and get: <programlisting> SNO | SNAME | COUNT-----+-------+------- 1 | Smith | 2 3 | Adams | 2 4 | Blake | 3 </programlisting> </para> </example> </para> </sect3> <sect3> <title>Subqueries</title> <para> In the WHERE and HAVING clauses the use of subqueries (subselects) is allowed in every place where a value is expected. In this case the value must be derived by evaluating the subquery first. The usage of subqueries extends the expressive power of <acronym>SQL</acronym>. <example> <title id="subselect-example">Subselect</title> <para> If we want to know all parts having a greater price than the part named 'Screw' we use the query: <programlisting>SELECT * FROM PART WHERE PRICE > (SELECT PRICE FROM PART WHERE PNAME='Screw'); </programlisting> </para> <para> The result is: <programlisting> PNO | PNAME | PRICE-----+---------+-------- 3 | Bolt | 15 4 | Cam | 25 </programlisting> </para> <para> When we look at the above query we can see the keyword <command>SELECT</command> two times. The first one at the beginning of the query - we will refer to it as outer <command>SELECT</command> - and the one in the WHERE clause which begins a nested query - we will refer to it as inner <command>SELECT</command>. For every tuple of the outer <command>SELECT</command> the inner <command>SELECT</command> has to be evaluated. After every evaluation we know the price of the tuple named 'Screw' and we can check if the price of the actual tuple is greater. (Actually, in this example the inner query need only be evaluated once, since it does not depend on the state of the outer query.) </para> <para> If we want to know all suppliers that do not sell any part (e.g. to be able to remove these suppliers from the database) we use: <programlisting>SELECT * FROM SUPPLIER S WHERE NOT EXISTS (SELECT * FROM SELLS SE WHERE SE.SNO = S.SNO); </programlisting> </para> <para> In our example the result will be empty because every supplier sells at least one part. Note that we use S.SNO from the outer <command>SELECT</command> within the WHERE clause of the inner <command>SELECT</command>. Here the subquery must be evaluated afresh for each tuple from the outer query, i.e. the value for S.SNO is always taken from the current tuple of the outer <command>SELECT</command>. </para> </example> </para> </sect3> <sect3> <title>Subqueries in FROM</title> <para> A somewhat different way of using subqueries is to put them in the FROM clause. This is a useful feature because a subquery of this kind can output multiple columns and rows, whereas a subquery used in an expression must deliver just a single result. It also lets us get more than one round of grouping/aggregation without resorting to a temporary table. <example> <title id="subselect-in-from-example">Subselect in FROM</title> <para> If we want to know the highest average part price among all our suppliers, we can't write MAX(AVG(PRICE)), but we can write: <programlisting>SELECT MAX(subtable.avgprice) FROM (SELECT AVG(P.PRICE) AS avgprice FROM SUPPLIER S, PART P, SELLS SE WHERE S.SNO = SE.SNO AND P.PNO = SE.PNO GROUP BY S.SNO) subtable; </programlisting> The subquery returns one row per supplier (because of its GROUP BY) and then we aggregate over those rows in the outer query. </para> </example> </para> </sect3> <sect3> <title>Union, Intersect, Except</title> <para> These operations calculate the union, intersection and set theoretic difference of the tuples derived by two subqueries. <example> <title id="union-example">Union, Intersect, Except</title> <para> The following query is an example for UNION: <programlisting>SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNAME = 'Jones'UNION SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNAME = 'Adams'; </programlisting>gives the result: <programlisting> SNO | SNAME | CITY-----+-------+-------- 2 | Jones | Paris 3 | Adams | Vienna </programlisting> </para> <para> Here is an example for INTERSECT: <programlisting>SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNO > 1INTERSECT SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNO < 3; </programlisting> gives the result: <programlisting> SNO | SNAME | CITY-----+-------+-------- 2 | Jones | Paris </programlisting> The only tuple returned by both parts of the query is the one having SNO=2. </para> <para> Finally an example for EXCEPT: <programlisting>SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNO > 1EXCEPT SELECT S.SNO, S.SNAME, S.CITY FROM SUPPLIER S WHERE S.SNO > 3; </programlisting> gives the result: <programlisting> SNO | SNAME | CITY-----+-------+-------- 2 | Jones | Paris 3 | Adams | Vienna </programlisting> </para> </example> </para>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -