📄 sql.sgml
字号:
<itemizedlist> <listitem> <para> SELECT (σ): extracts <firstterm>tuples</firstterm> from a relation that satisfy a given restriction. Let <parameter>R</parameter> be a table that contains an attribute <parameter>A</parameter>.σ<subscript>A=a</subscript>(R) = {t ∈ R ∣ t(A) = a} where <literal>t</literal> denotes a tuple of <parameter>R</parameter> and <literal>t(A)</literal> denotes the value of attribute <parameter>A</parameter> of tuple <literal>t</literal>. </para> </listitem> <listitem> <para> PROJECT (π): extracts specified <firstterm>attributes</firstterm> (columns) from a relation. Let <classname>R</classname> be a relation that contains an attribute <classname>X</classname>. π<subscript>X</subscript>(<classname>R</classname>) = {t(X) ∣ t ∈ <classname>R</classname>}, where <literal>t</literal>(<classname>X</classname>) denotes the value of attribute <classname>X</classname> of tuple <literal>t</literal>. </para> </listitem> <listitem> <para> PRODUCT (×): builds the Cartesian product of two relations. Let <classname>R</classname> be a table with arity <literal>k</literal><subscript>1</subscript> and let <classname>S</classname> be a table with arity <literal>k</literal><subscript>2</subscript>. <classname>R</classname> × <classname>S</classname> is the set of all <literal>k</literal><subscript>1</subscript> + <literal>k</literal><subscript>2</subscript>-tuples whose first <literal>k</literal><subscript>1</subscript> components form a tuple in <classname>R</classname> and whose last <literal>k</literal><subscript>2</subscript> components form a tuple in <classname>S</classname>. </para> </listitem> <listitem> <para> UNION (∪): builds the set-theoretic union of two tables. Given the tables <classname>R</classname> and <classname>S</classname> (both must have the same arity), the union <classname>R</classname> ∪ <classname>S</classname> is the set of tuples that are in <classname>R</classname> or <classname>S</classname> or both. </para> </listitem> <listitem> <para> INTERSECT (∩): builds the set-theoretic intersection of two tables. Given the tables <classname>R</classname> and <classname>S</classname>, <classname>R</classname> ∩ <classname>S</classname> is the set of tuples that are in <classname>R</classname> and in <classname>S</classname>. We again require that <classname>R</classname> and <classname>S</classname> have the same arity. </para> </listitem> <listitem> <para> DIFFERENCE (− or ∖): builds the set difference of two tables. Let <classname>R</classname> and <classname>S</classname> again be two tables with the same arity. <classname>R</classname> - <classname>S</classname> is the set of tuples in <classname>R</classname> but not in <classname>S</classname>. </para> </listitem> <listitem> <para> JOIN (∏): connects two tables by their common attributes. Let <classname>R</classname> be a table with the attributes <classname>A</classname>,<classname>B</classname> and <classname>C</classname> and let <classname>S</classname> be a table with the attributes <classname>C</classname>,<classname>D</classname> and <classname>E</classname>. There is one attribute common to both relations, the attribute <classname>C</classname>. <!-- <classname>R</classname> ∏ <classname>S</classname> = π<subscript><classname>R</classname>.<classname>A</classname>,<classname>R</classname>.<classname>B</classname>,<classname>R</classname>.<classname>C</classname>,<classname>S</classname>.<classname>D</classname>,<classname>S</classname>.<classname>E</classname></subscript>(σ<subscript><classname>R</classname>.<classname>C</classname>=<classname>S</classname>.<classname>C</classname></subscript>(<classname>R</classname> × <classname>S</classname>)).--> R ∏ S = π<subscript>R.A,R.B,R.C,S.D,S.E</subscript>(σ<subscript>R.C=S.C</subscript>(R × S)). What are we doing here? We first calculate the Cartesian product <classname>R</classname> × <classname>S</classname>. Then we select those tuples whose values for the common attribute <classname>C</classname> are equal (σ<subscript>R.C = S.C</subscript>). Now we have a table that contains the attribute <classname>C</classname> two times and we correct this by projecting out the duplicate column. </para> <example> <title id="join-example">An Inner Join</title> <para> Let's have a look at the tables that are produced by evaluating the steps necessary for a join. Let the following two tables be given: <programlisting>R: S: A | B | C C | D | E---+---+--- ---+---+--- 1 | 2 | 3 3 | a | b 4 | 5 | 6 6 | c | d 7 | 8 | 9 </programlisting> </para> </example> <para> First we calculate the Cartesian product <classname>R</classname> × <classname>S</classname> and get: <programlisting>R x S: A | B | R.C | S.C | D | E---+---+-----+-----+---+--- 1 | 2 | 3 | 3 | a | b 1 | 2 | 3 | 6 | c | d 4 | 5 | 6 | 3 | a | b 4 | 5 | 6 | 6 | c | d 7 | 8 | 9 | 3 | a | b 7 | 8 | 9 | 6 | c | d </programlisting> </para> <para> After the selection σ<subscript>R.C=S.C</subscript>(R × S) we get: <programlisting> A | B | R.C | S.C | D | E---+---+-----+-----+---+--- 1 | 2 | 3 | 3 | a | b 4 | 5 | 6 | 6 | c | d </programlisting> </para> <para> To remove the duplicate column <classname>S</classname>.<classname>C</classname> we project it out by the following operation: π<subscript>R.A,R.B,R.C,S.D,S.E</subscript>(σ<subscript>R.C=S.C</subscript>(R × S)) and get: <programlisting> A | B | C | D | E---+---+---+---+--- 1 | 2 | 3 | a | b 4 | 5 | 6 | c | d </programlisting> </para> </listitem> <listitem> <para> DIVIDE (÷): Let <classname>R</classname> be a table with the attributes A, B, C, and D and let <classname>S</classname> be a table with the attributes C and D. Then we define the division as: <programlisting>R ÷ S = {t ∣ ∀ t<subscript>s</subscript> ∈ S ∃ t<subscript>r</subscript> ∈ R </programlisting> such thatt<subscript>r</subscript>(A,B)=t∧t<subscript>r</subscript>(C,D)=t<subscript>s</subscript>} where t<subscript>r</subscript>(x,y) denotes a tuple of table <classname>R</classname> that consists only of the components <literal>x</literal> and <literal>y</literal>. Note that the tuple <literal>t</literal> only consists of the components <classname>A</classname> and <classname>B</classname> of relation <classname>R</classname>. </para> <para id="divide-example"> Given the following tables <programlisting>R: S: A | B | C | D C | D---+---+---+--- ---+--- a | b | c | d c | d a | b | e | f e | f b | c | e | f e | d | c | d e | d | e | f a | b | d | e </programlisting> R ÷ S is derived as <programlisting> A | B---+--- a | b e | d </programlisting> </para> </listitem> </itemizedlist> </para> <para> For a more detailed description and definition of the relational algebra refer to [<xref linkend="ULL88" endterm="ULL88">] or [<xref linkend="DATE04" endterm="DATE04">]. </para> <example> <title id="suppl-rel-alg">A Query Using Relational Algebra</title> <para> Recall that we formulated all those relational operators to be able to retrieve data from the database. Let's return to our example from the previous section (<xref linkend="operations" endterm="operations">) where someone wanted to know the names of all suppliers that sell the part <literal>Screw</literal>. This question can be answered using relational algebra by the following operation: <programlisting>π<subscript>SUPPLIER.SNAME</subscript>(σ<subscript>PART.PNAME='Screw'</subscript>(SUPPLIER ∏ SELLS ∏ PART)) </programlisting> </para> <para> We call such an operation a query. If we evaluate the above query against the our example tables (<xref linkend="supplier-fig" endterm="supplier-fig">) we will obtain the following result: <programlisting> SNAME------- Smith Adams </programlisting> </para> </example> </sect2> <sect2 id="rel-calc"> <title>Relational Calculus</title> <para> The relational calculus is based on the <firstterm>first order logic</firstterm>. There are two variants of the relational calculus: <itemizedlist> <listitem> <para> The <firstterm>Domain Relational Calculus</firstterm> (<acronym>DRC</acronym>), where variables stand for components (attributes) of the tuples. </para> </listitem> <listitem> <para> The <firstterm>Tuple Relational Calculus</firstterm> (<acronym>TRC</acronym>), where variables stand for tuples. </para> </listitem> </itemizedlist> </para> <para> We want to discuss the tuple relational calculus only because it is the one underlying the most relational languages. For a detailed discussion on <acronym>DRC</acronym> (and also <acronym>TRC</acronym>) see <xref linkend="DATE04" endterm="DATE04"> or <xref linkend="ULL88" endterm="ULL88">. </para> </sect2> <sect2> <title>Tuple Relational Calculus</title> <para> The queries used in <acronym>TRC</acronym> are of the following form: <programlisting>x(A) ∣ F(x) </programlisting> where <literal>x</literal> is a tuple variable <classname>A</classname> is a set of attributes and <literal>F</literal> is a formula. The resulting relation consists of all tuples <literal>t(A)</literal> that satisfy <literal>F(t)</literal>. </para> <para> If we want to answer the question from example <xref linkend="suppl-rel-alg" endterm="suppl-rel-alg"> using <acronym>TRC</acronym> we formulate the following query: <programlisting>{x(SNAME) ∣ x ∈ SUPPLIER ∧ ∃ y ∈ SELLS ∃ z ∈ PART (y(SNO)=x(SNO) ∧ z(PNO)=y(PNO) ∧ z(PNAME)='Screw')} </programlisting> </para> <para> Evaluating the query against the tables from <xref linkend="supplier-fig" endterm="supplier-fig"> again leads to the same result as in <xref linkend="suppl-rel-alg" endterm="suppl-rel-alg">. </para> </sect2> <sect2 id="alg-vs-calc"> <title>Relational Algebra vs. Relational Calculus</title> <para> The relational algebra and the relational calculus have the same <firstterm>expressive power</firstterm>; i.e. all queries that can be formulated using relational algebra can also be formulated using the relational calculus and vice versa. This was first proved by E. F. Codd in 1972. This proof is based on an algorithm (<quote>Codd's reduction algorithm</quote>) by which an arbitrary expression of the relational calculus can be reduced to a semantically equivalent expression of relational algebra. For a more detailed discussion on that refer to <xref linkend="DATE04" endterm="DATE04"> and <xref linkend="ULL88" endterm="ULL88">. </para> <para> It is sometimes said that languages based on the relational calculus are <quote>higher level</quote> or <quote>more declarative</quote> than languages based on relational algebra because the algebra (partially) specifies the order of operations while the calculus leaves it to a compiler or interpreter to determine the most efficient order of evaluation. </para> </sect2> </sect1> <sect1 id="sql-language"> <title>The <acronym>SQL</acronym> Language</title> <para> As is the case with most modern relational languages, <acronym>SQL</acronym> is based on the tuple relational calculus. As a result every query that can be formulated using the tuple relational calculus (or equivalently, relational algebra) can also be formulated using <acronym>SQL</acronym>. There are, however, capabilities beyond the scope of relational algebra or calculus. Here is a list of some additional features provided by <acronym>SQL</acronym> that are not part of relational algebra or calculus: <itemizedlist> <listitem> <para> Commands for insertion, deletion or modification of data. </para> </listitem> <listitem> <para> Arithmetic capability: In <acronym>SQL</acronym> it is possible to involve arithmetic operations as well as comparisons, e.g. <programlisting>A < B + 3. </programlisting> Note that + or other arithmetic operators appear neither in relational algebra nor in relational calculus. </para> </listitem> <listitem> <para> Assignment and Print Commands: It is possible to print a
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -