📄 queries.sgml
字号:
The actual column set must be specified in the calling query so that the parser knows, for example, what <literal>*</> should expand to. </para> </sect3> </sect2> <sect2 id="queries-where"> <title>The <literal>WHERE</literal> Clause</title> <indexterm zone="queries-where"> <primary>WHERE</primary> </indexterm> <para> The syntax of the <literal>WHERE</> clause is<synopsis>WHERE <replaceable>search_condition</replaceable></synopsis> where <replaceable>search_condition</replaceable> is any value expression as defined in <xref linkend="sql-expressions"> that returns a value of type <type>boolean</type>. </para> <para> After the processing of the <literal>FROM</> clause is done, each row of the derived virtual table is checked against the search condition. If the result of the condition is true, the row is kept in the output table, otherwise (that is, if the result is false or null) it is discarded. The search condition typically references at least some column in the table generated in the <literal>FROM</> clause; this is not required, but otherwise the <literal>WHERE</> clause will be fairly useless. </para> <note> <para> Before the implementation of the <literal>JOIN</> syntax, it was necessary to put the join condition of an inner join in the <literal>WHERE</> clause. For example, these table expressions are equivalent:<programlisting>FROM a, b WHERE a.id = b.id AND b.val > 5</programlisting> and<programlisting>FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5</programlisting> or perhaps even<programlisting>FROM a NATURAL JOIN b WHERE b.val > 5</programlisting> Which one of these you use is mainly a matter of style. The <literal>JOIN</> syntax in the <literal>FROM</> clause is probably not as portable to other SQL database management systems. For outer joins there is no choice in any case: they must be done in the <literal>FROM</> clause. An <literal>ON</>/<literal>USING</> clause of an outer join is <emphasis>not</> equivalent to a <literal>WHERE</> condition, because it determines the addition of rows (for unmatched input rows) as well as the removal of rows from the final result. </para> </note> <para> Here are some examples of <literal>WHERE</literal> clauses:<programlisting>SELECT ... FROM fdt WHERE c1 > 5SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2)SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)</programlisting> <literal>fdt</literal> is the table derived in the <literal>FROM</> clause. Rows that do not meet the search condition of the <literal>WHERE</> clause are eliminated from <literal>fdt</literal>. Notice the use of scalar subqueries as value expressions. Just like any other query, the subqueries can employ complex table expressions. Notice also how <literal>fdt</literal> is referenced in the subqueries. Qualifying <literal>c1</> as <literal>fdt.c1</> is only necessary if <literal>c1</> is also the name of a column in the derived input table of the subquery. But qualifying the column name adds clarity even when it is not needed. This example shows how the column naming scope of an outer query extends into its inner queries. </para> </sect2> <sect2 id="queries-group"> <title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title> <indexterm zone="queries-group"> <primary>GROUP BY</primary> </indexterm> <indexterm zone="queries-group"> <primary>grouping</primary> </indexterm> <para> After passing the <literal>WHERE</> filter, the derived input table may be subject to grouping, using the <literal>GROUP BY</> clause, and elimination of group rows using the <literal>HAVING</> clause. </para><synopsis>SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...</synopsis> <para> The <literal>GROUP BY</> clause is used to group together those rows in a table that share the same values in all the columns listed. The order in which the columns are listed does not matter. The purpose is to reduce each group of rows sharing common values into one group row that is representative of all rows in the group. This is done to eliminate redundancy in the output and/or compute aggregates that apply to these groups. For instance:<screen><prompt>=></> <userinput>SELECT * FROM test1;</> x | y---+--- a | 3 c | 2 b | 5 a | 1(4 rows)<prompt>=></> <userinput>SELECT x FROM test1 GROUP BY x;</> x--- a b c(3 rows)</screen> </para> <para> In the second query, we could not have written <literal>SELECT * FROM test1 GROUP BY x</literal>, because there is no single value for the column <literal>y</> that could be associated with each group. The grouped-by columns can be referenced in the select list since they have a known constant value per group. </para> <para> In general, if a table is grouped, columns that are not used in the grouping cannot be referenced except in aggregate expressions. An example with aggregate expressions is:<screen><prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</> x | sum---+----- a | 4 b | 5 c | 2(3 rows)</screen> Here <literal>sum</literal> is an aggregate function that computes a single value over the entire group. More information about the available aggregate functions can be found in <xref linkend="functions-aggregate">. </para> <tip> <para> Grouping without aggregate expressions effectively calculates the set of distinct values in a column. This can also be achieved using the <literal>DISTINCT</> clause (see <xref linkend="queries-distinct">). </para> </tip> <para> Here is another example: it calculates the total sales for each product (rather than the total sales on all products).<programlisting>SELECT product_id, p.name, (sum(s.units) * p.price) AS sales FROM products p LEFT JOIN sales s USING (product_id) GROUP BY product_id, p.name, p.price;</programlisting> In this example, the columns <literal>product_id</literal>, <literal>p.name</literal>, and <literal>p.price</literal> must be in the <literal>GROUP BY</> clause since they are referenced in the query select list. (Depending on how exactly the products table is set up, name and price may be fully dependent on the product ID, so the additional groupings could theoretically be unnecessary, but this is not implemented yet.) The column <literal>s.units</> does not have to be in the <literal>GROUP BY</> list since it is only used in an aggregate expression (<literal>sum(...)</literal>), which represents the sales of a product. For each product, the query returns a summary row about all sales of the product. </para> <para> In strict SQL, <literal>GROUP BY</> can only group by columns of the source table but <productname>PostgreSQL</productname> extends this to also allow <literal>GROUP BY</> to group by columns in the select list. Grouping by value expressions instead of simple column names is also allowed. </para> <indexterm> <primary>HAVING</primary> </indexterm> <para> If a table has been grouped using a <literal>GROUP BY</literal> clause, but then only certain groups are of interest, the <literal>HAVING</literal> clause can be used, much like a <literal>WHERE</> clause, to eliminate groups from a grouped table. The syntax is:<synopsis>SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable></synopsis> Expressions in the <literal>HAVING</> clause can refer both to grouped expressions and to ungrouped expressions (which necessarily involve an aggregate function). </para> <para> Example:<screen><prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;</> x | sum---+----- a | 4 b | 5(2 rows)<prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x < 'c';</> x | sum---+----- a | 4 b | 5(2 rows)</screen> </para> <para> Again, a more realistic example:<programlisting>SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit FROM products p LEFT JOIN sales s USING (product_id) WHERE s.date > CURRENT_DATE - INTERVAL '4 weeks' GROUP BY product_id, p.name, p.price, p.cost HAVING sum(p.price * s.units) > 5000;</programlisting> In the example above, the <literal>WHERE</> clause is selecting rows by a column that is not grouped (the expression is only true for sales during the last four weeks), while the <literal>HAVING</> clause restricts the output to groups with total gross sales over 5000. Note that the aggregate expressions do not necessarily need to be the same in all parts of the query. </para> </sect2> </sect1> <sect1 id="queries-select-lists"> <title>Select Lists</title> <indexterm> <primary>SELECT</primary> <secondary>select list</secondary> </indexterm> <para> As shown in the previous section, the table expression in the <command>SELECT</command> command constructs an intermediate virtual table by possibly combining tables, views, eliminating rows, grouping, etc. This table is finally passed on to processing by the <firstterm>select list</firstterm>. The select list determines which <emphasis>columns</emphasis> of the intermediate table are actually output. </para> <sect2 id="queries-select-list-items"> <title>Select-List Items</title> <indexterm> <primary>*</primary> </indexterm> <para> The simplest kind of select list is <literal>*</literal> which emits all columns that the table expression produces. Otherwise, a select list is a comma-separated list of value expressions (as defined in <xref linkend="sql-expressions">). For instance, it could be a list of column names:<programlisting>SELECT a, b, c FROM ...</programlisting> The columns names <literal>a</>, <literal>b</>, and <literal>c</> are either the actual names of the columns of tables referenced in the <literal>FROM</> clause, or the aliases given to them as explained in <xref linkend="queries-table-aliases">. The name space available in the select list is the same as in the <literal>WHERE</> clause, unless grouping is used, in which case it is the same as in the <literal>HAVING</> clause. </para> <para> If more than one table has a column of the same name, the table name must also be given, as in<programlisting>SELECT tbl1.a, tbl2.a, tbl1.b FROM ...</programlisting> When working with multiple tables, it can also be useful to ask for all the columns of a particular table:<programlisting>SELECT tbl1.*, tbl2.a FROM ...</programlisting> (See also <xref linkend="queries-where">.) </para> <para> If an arbitrary value expression is used in the select list, it conceptually adds a new virtual column to the returned table. The value expression is evaluated once for each result row, with
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -