📄 sql.sgml

📁 PostgreSQL 8.1.4的源码适用于Linux下的开源数据库系统
💻 SGML
📖 第 1 页 / 共 5 页
字号:
12 3 4 5 下一页
<!--$PostgreSQL: pgsql/doc/src/sgml/sql.sgml,v 1.38 2005/08/01 20:31:05 tgl Exp $--> <chapter id="sql-intro">  <title>SQL</title>  <abstract>   <para>    This chapter introduces the mathematical concepts behind    relational databases. It is not required reading, so if you bog    down or want to get straight to some simple examples feel free to    jump ahead to the next chapter and come back when you have more    time and patience. This stuff is supposed to be fun!   </para>   <para>    This material originally appeared as a part of    Stefan Simkovics' Master's Thesis    (<xref linkend="SIM98" endterm="SIM98">).   </para>  </abstract>  <para>   <acronym>SQL</acronym> has become the most popular relational query    language.   The name <quote><acronym>SQL</acronym></quote> is an abbreviation for   <firstterm>Structured Query Language</firstterm>.    In 1974 Donald Chamberlin and others defined the   language SEQUEL (<firstterm>Structured English Query    Language</firstterm>) at IBM   Research. This language was first implemented in an IBM   prototype called SEQUEL-XRM in 1974-75. In 1976-77 a revised version   of SEQUEL called SEQUEL/2 was defined and the name was changed to   <acronym>SQL</acronym>   subsequently.  </para>  <para>   A new prototype called System R was developed by IBM in 1977. System R   implemented a large subset of SEQUEL/2 (now <acronym>SQL</acronym>)   and a number of   changes were made to <acronym>SQL</acronym> during the project.   System R was installed in   a number of user sites, both internal IBM sites and also some selected   customer sites. Thanks to the success and acceptance of System R at   those user sites IBM started to develop commercial products that   implemented the <acronym>SQL</acronym> language based on the System   R technology.  </para>  <para>   Over the next years IBM and also a number of other vendors announced   <acronym>SQL</acronym> products such as   <productname>SQL/DS</productname> (IBM),   <productname>DB2</productname> (IBM),   <productname>ORACLE</productname> (Oracle Corp.),   <productname>DG/SQL</productname> (Data General Corp.),   and <productname>SYBASE</productname> (Sybase Inc.).  </para>  <para>   <acronym>SQL</acronym> is also an official standard now. In 1982   the American National   Standards Institute (<acronym>ANSI</acronym>) chartered its   Database Committee X3H2 to   develop a proposal for a standard relational language. This proposal   was ratified in 1986 and consisted essentially of the IBM dialect of   <acronym>SQL</acronym>. In 1987 this <acronym>ANSI</acronym>    standard was also accepted as an international   standard by the International Organization for Standardization   (<acronym>ISO</acronym>).   This original standard version of <acronym>SQL</acronym> is often   referred to,   informally, as <quote><abbrev>SQL/86</abbrev></quote>. In 1989 the original   standard was extended   and this new standard is often, again informally, referred to as   <quote><abbrev>SQL/89</abbrev></quote>. Also in 1989, a related standard called   <firstterm>Database Language Embedded <acronym>SQL</acronym></firstterm>   (<acronym>ESQL</acronym>) was developed.  </para>  <para>   The <acronym>ISO</acronym> and <acronym>ANSI</acronym> committees   have been working for many years on the   definition of a greatly expanded version of the original standard,   referred to informally as <firstterm><acronym>SQL2</acronym></firstterm>   or <firstterm><acronym>SQL/92</acronym></firstterm>. This version became a   ratified standard - <quote>International Standard ISO/IEC 9075:1992,   Database Language <acronym>SQL</acronym></quote> - in late 1992.   <acronym>SQL/92</acronym> is the version   normally meant when people refer to <quote>the <acronym>SQL</acronym>   standard</quote>. A detailed   description of <acronym>SQL/92</acronym> is given in    <xref linkend="DATE97" endterm="DATE97">. At the time of   writing this document a new standard informally referred to   as <firstterm><acronym>SQL3</acronym></firstterm>   is under development. It is planned to make <acronym>SQL</acronym>   a Turing-complete   language, i.e. all computable queries (e.g. recursive queries) will be   possible. This has now been completed as SQL:2003.  </para>  <sect1 id="rel-model">   <title>The Relational Data Model</title>  <para>    As mentioned before, <acronym>SQL</acronym> is a relational    language. That means it is    based on the <firstterm>relational data model</firstterm>    first published by E.F. Codd in    1970. We will give a formal description of the relational model    later (in    <xref linkend="formal-notion" endterm="formal-notion">)    but first we want to have a look at it from a more intuitive    point of view.  </para>  <para>    A <firstterm>relational database</firstterm> is a database that is     perceived by its    users as a <firstterm>collection of tables</firstterm> (and    nothing else but tables).    A table consists of rows and columns where each row represents a    record and each column represents an attribute of the records    contained in the table.    <xref linkend="supplier-fig" endterm="supplier-fig">    shows an example of a database consisting of three tables:    <itemizedlist>     <listitem>      <para>       SUPPLIER is a table storing the number       (SNO), the name (SNAME) and the city (CITY) of a supplier.      </para>     </listitem>     <listitem>      <para>       PART is a table storing the number (PNO) the name (PNAME) and       the price (PRICE) of a part.      </para>     </listitem>     <listitem>      <para>       SELLS stores information about which part (PNO) is sold by which       supplier (SNO).        It serves in a sense to connect the other two tables together.      </para>     </listitem>    </itemizedlist>    <example>     <title id="supplier-fig">The Suppliers and Parts Database</title>     <programlisting>SUPPLIER:                   SELLS: SNO |  SNAME  |  CITY       SNO | PNO----+---------+--------     -----+----- 1  |  Smith  | London        1  |  1 2  |  Jones  | Paris         1  |  2 3  |  Adams  | Vienna        2  |  4 4  |  Blake  | Rome          3  |  1                              3  |  3                              4  |  2PART:                         4  |  3 PNO |  PNAME  |  PRICE       4  |  4----+---------+--------- 1  |  Screw  |   10 2  |  Nut    |    8 3  |  Bolt   |   15 4  |  Cam    |   25     </programlisting>    </example>   </para>   <para>    The tables PART and SUPPLIER may be regarded as    <firstterm>entities</firstterm> and    SELLS may be regarded as a <firstterm>relationship</firstterm>    between a particular    part and a particular supplier.    </para>   <para>    As we will see later, <acronym>SQL</acronym> operates on tables    like the ones just    defined but before that we will study the theory of the relational    model.   </para>  </sect1>  <sect1 id="relmodel-formal">   <title id="formal-notion">Relational Data Model Formalities</title>   <para>    The mathematical concept underlying the relational model is the    set-theoretic <firstterm>relation</firstterm> which is a subset of    the Cartesian    product of a list of domains. This set-theoretic relation gives    the model its name (do not confuse it with the relationship from the    <firstterm>Entity-Relationship model</firstterm>).    Formally a domain is simply a set of    values. For example the set of integers is a domain. Also the set of    character strings of length 20 and the real numbers are examples of    domains.   </para>   <para><!--\begin{definition}The <firstterm>Cartesian product</firstterm> of domains $D_{1},    D_{2},\ldots, D_{k}$ written\mbox{$D_{1} \times D_{2} \times \ldots \times D_{k}$} is the set ofall $k$-tuples $(v_{1},v_{2},\ldots,v_{k})$ such that \mbox{$v_{1} \inD_{1}, v_{2} \in D_{2}, \ldots, v_{k} \in D_{k}$}.\end{definition}-->    The <firstterm>Cartesian product</firstterm> of domains    <parameter>D<subscript>1</subscript></parameter>,    <parameter>D<subscript>2</subscript></parameter>,    ...    <parameter>D<subscript>k</subscript></parameter>,    written    <parameter>D<subscript>1</subscript></parameter> &times;    <parameter>D<subscript>2</subscript></parameter> &times;    ... &times;    <parameter>D<subscript>k</subscript></parameter>    is the set of all k-tuples    <parameter>v<subscript>1</subscript></parameter>,    <parameter>v<subscript>2</subscript></parameter>,    ...    <parameter>v<subscript>k</subscript></parameter>,    such that    <parameter>v<subscript>1</subscript></parameter> &isin;     <parameter>D<subscript>1</subscript></parameter>,    <parameter>v<subscript>2</subscript></parameter> &isin;     <parameter>D<subscript>2</subscript></parameter>,    ...    <parameter>v<subscript>k</subscript></parameter> &isin;     <parameter>D<subscript>k</subscript></parameter>.   </para>   <para>    For example, when we have<!-- $k=2$, $D_{1}=\{0,1\}$ and$D_{2}=\{a,b,c\}$, then $D_{1} \times D_{2}$ is$\{(0,a),(0,b),(0,c),(1,a),(1,b),(1,c)\}$.-->    <parameter>k</parameter>=2,    <parameter>D<subscript>1</subscript></parameter>=<literal>{0,1}</literal> and    <parameter>D<subscript>2</subscript></parameter>=<literal>{a,b,c}</literal> then    <parameter>D<subscript>1</subscript></parameter> &times;    <parameter>D<subscript>2</subscript></parameter> is    <literal>{(0,a),(0,b),(0,c),(1,a),(1,b),(1,c)}</literal>.   </para>   <para><!--\begin{definition}A Relation is any subset of the Cartesian product of one or moredomains: $R \subseteq$ \mbox{$D_{1} \times D_{2} \times \ldots \times D_{k}$}\end{definition}-->    A Relation is any subset of the Cartesian product of one or more    domains: <parameter>R</parameter> &sube;    <parameter>D<subscript>1</subscript></parameter> &times;    <parameter>D<subscript>2</subscript></parameter> &times;    ... &times;    <parameter>D<subscript>k</subscript></parameter>.   </para>   <para>    For example <literal>{(0,a),(0,b),(1,a)}</literal> is a relation;    it is in fact a subset of    <parameter>D<subscript>1</subscript></parameter> &times;    <parameter>D<subscript>2</subscript></parameter>    mentioned above.   </para>   <para>    The members of a relation are called tuples. Each relation of some    Cartesian product    <parameter>D<subscript>1</subscript></parameter> &times;    <parameter>D<subscript>2</subscript></parameter> &times;    ... &times;    <parameter>D<subscript>k</subscript></parameter>    is said to have arity <literal>k</literal> and is therefore a set    of <literal>k</literal>-tuples.   </para>   <para>    A relation can be viewed as a table (as we already did, remember    <xref linkend="supplier-fig" endterm="supplier-fig"> where    every tuple is represented by a row and every column corresponds to    one component of a tuple. Giving names (called attributes) to the    columns leads to the definition of a     <firstterm>relation scheme</firstterm>.   </para>   <para><!--\begin{definition}A {\it relation scheme} $R$ is a finite set of attributes\mbox{$\{A_{1},A_{2},\ldots,A_{k}\}$}. There is a domain $D_{i}$ foreach attribute $A_{i}, 1 \le i \le k$ where the values of theattributes are taken from. We often write a relation scheme as\mbox{$R(A_{1},A_{2},\ldots,A_{k})$}.\end{definition}-->    A <firstterm>relation scheme</firstterm> <literal>R</literal> is a     finite set of attributes    <parameter>A<subscript>1</subscript></parameter>,    <parameter>A<subscript>2</subscript></parameter>,    ...    <parameter>A<subscript>k</subscript></parameter>.    There is a domain    <parameter>D<subscript>i</subscript></parameter>,    for each attribute    <parameter>A<subscript>i</subscript></parameter>,    1 &lt;= <literal>i</literal> &lt;= <literal>k</literal>,    where the values of the attributes are taken from. We often write    a relation scheme as    <literal>R(<parameter>A<subscript>1</subscript></parameter>,    <parameter>A<subscript>2</subscript></parameter>,    ...    <parameter>A<subscript>k</subscript></parameter>)</literal>.    <note>     <para>      A <firstterm>relation scheme</firstterm> is just a kind of template      whereas a <firstterm>relation</firstterm> is an instance of a      <firstterm>relation       scheme</firstterm>. The relation consists of tuples (and can      therefore be      viewed as a table); not so the relation scheme.     </para>    </note>   </para>   <sect2>    <title id="domains">Domains vs. Data Types</title>    <para>     We often talked about <firstterm>domains</firstterm>     in the last section. Recall that a     domain is, formally, just a set of values (e.g., the set of integers or     the real numbers). In terms of database systems we often talk of     <firstterm>data types</firstterm> instead of domains.     When we define a table we have to make     a decision about which attributes to include. Additionally we     have to decide which kind of data is going to be stored as     attribute values. For example the values of     <classname>SNAME</classname> from the table     <classname>SUPPLIER</classname> will be character strings,     whereas <classname>SNO</classname> will store     integers. We define this by assigning a data type to each     attribute. The type of <classname>SNAME</classname> will be     <type>VARCHAR(20)</type> (this is the <acronym>SQL</acronym> type     for character strings of length &lt;= 20),     the type of <classname>SNO</classname> will be     <type>INTEGER</type>. With the assignment of a data type we also     have selected     a domain for an attribute. The domain of     <classname>SNAME</classname> is the set of all     character strings of length &lt;= 20,     the domain of <classname>SNO</classname> is the set of     all integer numbers.    </para>   </sect2>  </sect1>  <sect1 id="relmodel-oper">   <title id="operations">Operations in the Relational Data Model</title>   <para>    In the previous section    (<xref linkend="formal-notion" endterm="formal-notion">)    we defined the mathematical notion of    the relational model. Now we know how the data can be stored using a    relational data model but we do not know what to do with all these    tables to retrieve something from the database yet. For example somebody    could ask for the names of all suppliers that sell the part    'Screw'. Therefore two rather different kinds of notations for    expressing operations on relations have been defined:    <itemizedlist>     <listitem>      <para>       The <firstterm>Relational Algebra</firstterm> which is an       algebraic notation,       where queries are expressed by applying specialized operators to the       relations.      </para>     </listitem>     <listitem>      <para>       The <firstterm>Relational Calculus</firstterm> which is a       logical notation,       where queries are expressed by formulating some logical restrictions       that the tuples in the answer must satisfy.      </para>    </listitem>    </itemizedlist>   </para>   <sect2>    <title id="rel-alg">Relational Algebra</title>    <para>     The <firstterm>Relational Algebra</firstterm> was introduced by     E. F. Codd in 1972. It consists of a set of operations on relations:
12 3 4 5 下一页
💿 文件大小 14179 K
👤 上传用户 babydog00
📂 所属分类其他数据库
🏷️ 相关标签

#PostgreSQL #Linux #源码 #开源
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -