📄 indices.sgml
字号:
<para> Multicolumn indexes should be used sparingly. Most of the time, an index on a single column is sufficient and saves space and time. Indexes with more than three columns are unlikely to be helpful unless the usage of the table is extremely stylized. </para> </sect1> <sect1 id="indexes-unique"> <title>Unique Indexes</title> <indexterm zone="indexes-unique"> <primary>index</primary> <secondary>unique</secondary> </indexterm> <para> Indexes may also be used to enforce uniqueness of a column's value, or the uniqueness of the combined values of more than one column.<synopsis>CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <optional>, ...</optional>);</synopsis> Currently, only B-tree indexes can be declared unique. </para> <para> When an index is declared unique, multiple table rows with equal indexed values will not be allowed. Null values are not considered equal. A multicolumn unique index will only reject cases where all of the indexed columns are equal in two rows. </para> <para> <productname>PostgreSQL</productname> automatically creates a unique index when a unique constraint or a primary key is defined for a table. The index covers the columns that make up the primary key or unique columns (a multicolumn index, if appropriate), and is the mechanism that enforces the constraint. </para> <note> <para> The preferred way to add a unique constraint to a table is <literal>ALTER TABLE ... ADD CONSTRAINT</literal>. The use of indexes to enforce unique constraints could be considered an implementation detail that should not be accessed directly. One should, however, be aware that there's no need to manually create indexes on unique columns; doing so would just duplicate the automatically-created index. </para> </note> </sect1> <sect1 id="indexes-expressional"> <title>Indexes on Expressions</title> <indexterm zone="indexes-expressional"> <primary>index</primary> <secondary sortas="expressions">on expressions</secondary> </indexterm> <para> An index column need not be just a column of the underlying table, but can be a function or scalar expression computed from one or more columns of the table. This feature is useful to obtain fast access to tables based on the results of computations. </para> <para> For example, a common way to do case-insensitive comparisons is to use the <function>lower</function> function:<programlisting>SELECT * FROM test1 WHERE lower(col1) = 'value';</programlisting> This query can use an index, if one has been defined on the result of the <literal>lower(col1)</literal> operation:<programlisting>CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));</programlisting> </para> <para> If we were to declare this index <literal>UNIQUE</>, it would prevent creation of rows whose <literal>col1</> values differ only in case, as well as rows whose <literal>col1</> values are actually identical. Thus, indexes on expressions can be used to enforce constraints that are not definable as simple unique constraints. </para> <para> As another example, if one often does queries like this:<programlisting>SELECT * FROM people WHERE (first_name || ' ' || last_name) = 'John Smith';</programlisting> then it might be worth creating an index like this:<programlisting>CREATE INDEX people_names ON people ((first_name || ' ' || last_name));</programlisting> </para> <para> The syntax of the <command>CREATE INDEX</> command normally requires writing parentheses around index expressions, as shown in the second example. The parentheses may be omitted when the expression is just a function call, as in the first example. </para> <para> Index expressions are relatively expensive to maintain, since the derived expression(s) must be computed for each row upon insertion or whenever it is updated. Therefore they should be used only when queries that can use the index are very frequent. </para> </sect1> <sect1 id="indexes-opclass"> <title>Operator Classes</title> <indexterm zone="indexes-opclass"> <primary>operator class</primary> </indexterm> <para> An index definition may specify an <firstterm>operator class</firstterm> for each column of an index.<synopsis>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <replaceable>opclass</replaceable> <optional>, ...</optional>);</synopsis> The operator class identifies the operators to be used by the index for that column. For example, a B-tree index on the type <type>int4</type> would use the <literal>int4_ops</literal> class; this operator class includes comparison functions for values of type <type>int4</type>. In practice the default operator class for the column's data type is usually sufficient. The main point of having operator classes is that for some data types, there could be more than one meaningful index behavior. For example, we might want to sort a complex-number data type either by absolute value or by real part. We could do this by defining two operator classes for the data type and then selecting the proper class when making an index. </para> <para> There are also some built-in operator classes besides the default ones: <itemizedlist> <listitem> <para> The operator classes <literal>text_pattern_ops</literal>, <literal>varchar_pattern_ops</literal>, <literal>bpchar_pattern_ops</literal>, and <literal>name_pattern_ops</literal> support B-tree indexes on the types <type>text</type>, <type>varchar</type>, <type>char</type>, and <type>name</type>, respectively. The difference from the ordinary operator classes is that the values are compared strictly character by character rather than according to the locale-specific collation rules. This makes these operator classes suitable for use by queries involving pattern matching expressions (<literal>LIKE</literal> or POSIX regular expressions) if the server does not use the standard <quote>C</quote> locale. As an example, you might index a <type>varchar</type> column like this:<programlisting>CREATE INDEX test_index ON test_table (col varchar_pattern_ops);</programlisting> If you do use the C locale, you may instead create an index with the default operator class, and it will still be useful for pattern-matching queries. Also note that you should create an index with the default operator class if you want queries involving ordinary comparisons to use an index. Such queries cannot use the <literal><replaceable>xxx</replaceable>_pattern_ops</literal> operator classes. It is allowed to create multiple indexes on the same column with different operator classes. </para> </listitem> </itemizedlist> </para> <para> The following query shows all defined operator classes:<programlisting>SELECT am.amname AS index_method, opc.opcname AS opclass_name FROM pg_am am, pg_opclass opc WHERE opc.opcamid = am.oid ORDER BY index_method, opclass_name;</programlisting> It can be extended to show all the operators included in each class:<programlisting>SELECT am.amname AS index_method, opc.opcname AS opclass_name, opr.oprname AS opclass_operator FROM pg_am am, pg_opclass opc, pg_amop amop, pg_operator opr WHERE opc.opcamid = am.oid AND amop.amopclaid = opc.oid AND amop.amopopr = opr.oid ORDER BY index_method, opclass_name, opclass_operator;</programlisting> </para> </sect1> <sect1 id="indexes-partial"> <title>Partial Indexes</title> <indexterm zone="indexes-partial"> <primary>index</primary> <secondary>partial</secondary> </indexterm> <para> A <firstterm>partial index</firstterm> is an index built over a subset of a table; the subset is defined by a conditional expression (called the <firstterm>predicate</firstterm> of the partial index). The index contains entries for only those table rows that satisfy the predicate. </para> <para> A major motivation for partial indexes is to avoid indexing common values. Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up queries that do use the index. It will also speed up many table update operations because the index does not need to be updated in all cases. <xref linkend="indexes-partial-ex1"> shows a possible application of this idea. </para> <example id="indexes-partial-ex1"> <title>Setting up a Partial Index to Exclude Common Values</title> <para> Suppose you are storing web server access logs in a database. Most accesses originate from the IP address range of your organization but some are from elsewhere (say, employees on dial-up connections). If your searches by IP are primarily for outside accesses, you probably do not need to index the IP range that corresponds to your organization's subnet. </para> <para> Assume a table like this:<programlisting>CREATE TABLE access_log ( url varchar, client_ip inet, ...);</programlisting> </para> <para> To create a partial index that suits our example, use a command such as this:<programlisting>CREATE INDEX access_log_client_ip_ix ON access_log (client_ip) WHERE NOT (client_ip > inet '192.168.100.0' AND client_ip < inet '192.168.100.255');</programlisting> </para> <para> A typical query that can use this index would be:<programlisting>SELECT * FROM access_log WHERE url = '/index.html' AND client_ip = inet '212.78.10.32';</programlisting> A query that cannot use this index is:<programlisting>SELECT * FROM access_log WHERE client_ip = inet '192.168.100.23';</programlisting> </para> <para> Observe that this kind of partial index requires that the common values be predetermined. If the distribution of values is
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -