📄 indices.sgml
字号:
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.45.2.1 2003/11/06 22:21:55 tgl Exp $ --><chapter id="indexes"> <title id="indexes-title">Indexes</title> <indexterm zone="indexes"> <primary>index</primary> </indexterm> <para> Indexes are a common way to enhance database performance. An index allows the database server to find and retrieve specific rows much faster than it could do without an index. But indexes also add overhead to the database system as a whole, so they should be used sensibly. </para> <sect1 id="indexes-intro"> <title>Introduction</title> <para> Suppose we have a table similar to this:<programlisting>CREATE TABLE test1 ( id integer, content varchar);</programlisting> and the application requires a lot of queries of the form<programlisting>SELECT content FROM test1 WHERE id = <replaceable>constant</replaceable>;</programlisting> With no advance preparation, the system would have to scan the entire <structname>test1</structname> table, row by row, to find all matching entries. If there are a lot of rows in <structname>test1</structname> and only a few rows (perhaps only zero or one) that would be returned by such a query, then this is clearly an inefficient method. But if the system has been instructed to maintain an index on the <structfield>id</structfield> column, then it can use a more efficient method for locating matching rows. For instance, it might only have to walk a few levels deep into a search tree. </para> <para> A similar approach is used in most books of non-fiction: terms and concepts that are frequently looked up by readers are collected in an alphabetic index at the end of the book. The interested reader can scan the index relatively quickly and flip to the appropriate page(s), rather than having to read the entire book to find the material of interest. Just as it is the task of the author to anticipate the items that the readers are most likely to look up, it is the task of the database programmer to foresee which indexes would be of advantage. </para> <para> The following command would be used to create the index on the <structfield>id</structfield> column, as discussed:<programlisting>CREATE INDEX test1_id_index ON test1 (id);</programlisting> The name <structname>test1_id_index</structname> can be chosen freely, but you should pick something that enables you to remember later what the index was for. </para> <para> To remove an index, use the <command>DROP INDEX</command> command. Indexes can be added to and removed from tables at any time. </para> <para> Once the index is created, no further intervention is required: the system will update the index when the table is modified, and it will use the index in queries when it thinks this would be more efficient than a sequential table scan. But you may have to run the <command>ANALYZE</command> command regularly to update statistics to allow the query planner to make educated decisions. See <xref linkend="performance-tips"> for information about how to find out whether an index is used and when and why the planner may choose <emphasis>not</emphasis> to use an index. </para> <para> Indexes can also benefit <command>UPDATE</command> and <command>DELETE</command> commands with search conditions. Indexes can moreover be used in join queries. Thus, an index defined on a column that is part of a join condition can significantly speed up queries with joins. </para> <para> When an index is created, the system has to keep it synchronized with the table. This adds overhead to data manipulation operations. Therefore indexes that are non-essential or do not get used at all should be removed. Note that a query or data manipulation command can use at most one index per table. </para> </sect1> <sect1 id="indexes-types"> <title>Index Types</title> <para> <productname>PostgreSQL</productname> provides several index types: B-tree, R-tree, GiST, and Hash. Each index type uses a different algorithm that is best suited to different types of queries. <indexterm> <primary>index</primary> <secondary>B-tree</secondary> </indexterm> <indexterm> <primary>B-tree</primary> <see>index</see> </indexterm> By default, the <command>CREATE INDEX</command> command will create a B-tree index, which fits the most common situations. B-trees can handle equality and range queries on data that can be sorted into some ordering. In particular, the <productname>PostgreSQL</productname> query planner will consider using a B-tree index whenever an indexed column is involved in a comparison using one of these operators: <simplelist type="inline"> <member><literal><</literal></member> <member><literal><=</literal></member> <member><literal>=</literal></member> <member><literal>>=</literal></member> <member><literal>></literal></member> </simplelist> </para> <para> The optimizer can also use a B-tree index for queries involving the pattern matching operators <literal>LIKE</>, <literal>ILIKE</literal>, <literal>~</literal>, and <literal>~*</literal>, <emphasis>if</emphasis> the pattern is anchored to the beginning of the string, e.g., <literal>col LIKE 'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not <literal>col LIKE '%bar'</literal>. However, if your server does not use the C locale you will need to create the index with a special operator class. See <xref linkend="indexes-opclass"> below. </para> <para> <indexterm> <primary>index</primary> <secondary>R-tree</secondary> </indexterm> <indexterm> <primary>R-tree</primary> <see>index</see> </indexterm> R-tree indexes are suited for queries on spatial data. To create an R-tree index, use a command of the form<synopsis>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING RTREE (<replaceable>column</replaceable>);</synopsis> The <productname>PostgreSQL</productname> query planner will consider using an R-tree index whenever an indexed column is involved in a comparison using one of these operators: <simplelist type="inline"> <member><literal><<</literal></member> <member><literal>&<</literal></member> <member><literal>&></literal></member> <member><literal>>></literal></member> <member><literal>@</literal></member> <member><literal>~=</literal></member> <member><literal>&&</literal></member> </simplelist> (Refer to <xref linkend="functions-geometry"> about the meaning of these operators.) </para> <para> <indexterm> <primary>index</primary> <secondary>hash</secondary> </indexterm> <indexterm> <primary>hash</primary> <see>index</see> </indexterm> Hash indexes can only handle simple equality comparisons. The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the <literal>=</literal> operator. The following command is used to create a hash index:<synopsis>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING HASH (<replaceable>column</replaceable>);</synopsis> <note> <para> Testing has shown <productname>PostgreSQL</productname>'s hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. For these reasons, hash index use is presently discouraged. </para> </note> </para> <para> The B-tree index method is an implementation of Lehman-Yao high-concurrency B-trees. The R-tree index method implements standard R-trees using Guttman's quadratic split algorithm. The hash index method is an implementation of Litwin's linear hashing. We mention the algorithms used solely to indicate that all of these index methods are fully dynamic and do not have to be optimized periodically (as is the case with, for example, static hash methods). </para> </sect1> <sect1 id="indexes-multicolumn"> <title>Multicolumn Indexes</title> <indexterm zone="indexes-multicolumn"> <primary>index</primary> <secondary>multicolumn</secondary> </indexterm> <para> An index can be defined on more than one column. For example, if you have a table of this form:<programlisting>CREATE TABLE test2 ( major int, minor int, name varchar);</programlisting> (say, you keep your <filename class="directory">/dev</filename> directory in a database...) and you frequently make queries like<programlisting>SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> AND minor = <replaceable>constant</replaceable>;</programlisting> then it may be appropriate to define an index on the columns <structfield>major</structfield> and <structfield>minor</structfield> together, e.g.,<programlisting>CREATE INDEX test2_mm_idx ON test2 (major, minor);</programlisting> </para> <para> Currently, only the B-tree and GiST implementations support multicolumn indexes. Up to 32 columns may be specified. (This limit can be altered when building <productname>PostgreSQL</productname>; see the file <filename>pg_config_manual.h</filename>.) </para> <para> The query planner can use a multicolumn index for queries that involve the leftmost column in the index definition plus any number of columns listed to the right of it, without a gap. For example, an index on <literal>(a, b, c)</literal> can be used in queries involving all of <literal>a</literal>, <literal>b</literal>, and <literal>c</literal>, or in queries involving both <literal>a</literal> and <literal>b</literal>, or in queries involving only <literal>a</literal>, but not in other combinations. (In a query involving <literal>a</literal> and <literal>c</literal> the planner could choose to use the index for <literal>a</literal>, while treating <literal>c</literal> like an ordinary unindexed column.) Of course, each column must be used with operators appropriate to the index type; clauses that involve other operators will not be considered. </para> <para> Multicolumn indexes can only be used if the clauses involving the indexed columns are joined with <literal>AND</literal>. For instance,<programlisting>SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> OR minor = <replaceable>constant</replaceable>;</programlisting> cannot make use of the index <structname>test2_mm_idx</structname> defined above to look up both columns. (It can be used to look up only the <structfield>major</structfield> column, however.) </para>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -