  <para>
   For each table, two conditions are used to determine which operation(s)
   to apply.  If the number of obsolete tuples since the last
   <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
   table is vacuumed.  The vacuum threshold is defined as:
<programlisting>
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
</programlisting>
   where the vacuum base threshold is
   <xref linkend="guc-autovacuum-vacuum-threshold">,
   the vacuum scale factor is <xref linkend="guc-autovacuum-vacuum-scale-factor">,
   and the number of tuples is
   <structname>pg_class</structname>.<structfield>reltuples</structfield>.
   The number of obsolete tuples is obtained from the statistics collector;
   it is a semi-accurate count updated by each <command>UPDATE</command> and
   <command>DELETE</command> operation.  (It is only semi-accurate because
   some information may be lost under heavy load.)  For analyze, a similar
   condition is used: the threshold, defined as
<programlisting>
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
</programlisting>
   is compared to the total number of tuples inserted, updated, or deleted
   since the last <command>ANALYZE</command>.
  </para>

  <para>
   The default thresholds and scale factors are taken from
   <filename>postgresql.conf</filename>, but it is possible to override them
   on a table-by-table basis by making entries in the system catalog
   <link linkend="catalog-pg-autovacuum"><structname>pg_autovacuum</></link>.
   If a <structname>pg_autovacuum</structname> row exists for a particular
   table, the settings it specifies are applied; otherwise the global
   settings are used.  See <xref linkend="runtime-config-autovacuum"> for
   more details on the global settings.
  </para>

  <para>
   Besides the base threshold values and scale factors, there are three
   more parameters that can be set for each table in
   <structname>pg_autovacuum</structname>.  The first,
   <structname>pg_autovacuum</>.<structfield>enabled</>, can be set to
   <literal>false</literal> to instruct the autovacuum daemon to skip that
   particular table entirely.  In this case autovacuum will only touch the
   table when it vacuums the entire database to prevent transaction ID
   wraparound.  The other two parameters, the vacuum cost delay
   (<structname>pg_autovacuum</structname>.<structfield>vac_cost_delay</structfield>)
   and the vacuum cost limit
   (<structname>pg_autovacuum</structname>.<structfield>vac_cost_limit</structfield>),
   are used to set table-specific values for the
   <xref linkend="runtime-config-resource-vacuum-cost"
   endterm="runtime-config-resource-vacuum-cost-title"> feature.
  </para>

  <para>
   If any of the values in <structname>pg_autovacuum</structname> are set to
   a negative number, or if a row is not present at all in
   <structname>pg_autovacuum</structname> for any particular table, the
   corresponding values from <filename>postgresql.conf</filename> are used.
  </para>

  <para>
   There is not currently any support for making
   <structname>pg_autovacuum</structname> entries, except by doing manual
   <command>INSERT</>s into the catalog.  This feature will be improved in
   future releases, and it is likely that the catalog definition will
   change.
  </para>
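  <para>
   As a rough illustration only: an entry that disables autovacuum for a
   hypothetical table <literal>mytable</literal>, while leaving every other
   setting at its <filename>postgresql.conf</filename> value, might look
   like the following.  The column list shown here is an assumption about
   the catalog definition (only <structfield>enabled</structfield>,
   <structfield>vac_cost_delay</structfield>, and
   <structfield>vac_cost_limit</structfield> are described above), so check
   the actual columns of <structname>pg_autovacuum</structname> in your
   release before using it.
<programlisting>
INSERT INTO pg_autovacuum
    (vacrelid, enabled,
     vac_base_thresh, vac_scale_factor,
     anl_base_thresh, anl_scale_factor,
     vac_cost_delay, vac_cost_limit)
VALUES
    ('mytable'::regclass, false, -1, -1, -1, -1, -1, -1);
</programlisting>
   The negative values mean <quote>use the setting from
   <filename>postgresql.conf</filename></quote>, as explained above.
  </para>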
  <caution>
   <para>
    The contents of the <structname>pg_autovacuum</structname> system
    catalog are currently not saved in database dumps created by the tools
    <command>pg_dump</command> and <command>pg_dumpall</command>.  If you
    want to preserve them across a dump/reload cycle, make sure you dump
    the catalog manually.
   </para>
  </caution>
 </sect2>
 </sect1>


 <sect1 id="routine-reindex">
  <title>Routine Reindexing</title>

  <indexterm zone="routine-reindex">
   <primary>reindex</primary>
  </indexterm>

  <para>
   In some situations it is worthwhile to rebuild indexes periodically
   with the <xref linkend="sql-reindex" endterm="sql-reindex-title"> command.
  </para>

  <para>
   In <productname>PostgreSQL</> releases before 7.4, periodic reindexing
   was frequently necessary to avoid <quote>index bloat</>, due to lack of
   internal space reclamation in B-tree indexes.  Any situation in which the
   range of index keys changed over time &mdash; for example, an index on
   timestamps in a table where old entries are eventually deleted &mdash;
   would result in bloat, because index pages for no-longer-needed portions
   of the key range were not reclaimed for re-use.  Over time, the index
   size could become indefinitely much larger than the amount of useful
   data in it.
  </para>

  <para>
   In <productname>PostgreSQL</> 7.4 and later, index pages that have become
   completely empty are reclaimed for re-use.  There is still a possibility
   for inefficient use of space: if all but a few index keys on a page have
   been deleted, the page remains allocated.  So a usage pattern in which
   all but a few keys in each range are eventually deleted will see poor
   use of space.  The potential for bloat is not indefinite &mdash; at worst
   there will be one key per page &mdash; but it may still be worthwhile to
   schedule periodic reindexing for indexes that have such usage patterns.
  </para>

  <para>
   The potential for bloat in non-B-tree indexes has not been well
   characterized.  It is a good idea to keep an eye on the index's physical
   size when using any non-B-tree index type.
  </para>

  <para>
   Also, for B-tree indexes a freshly-constructed index is somewhat faster
   to access than one that has been updated many times, because logically
   adjacent pages are usually also physically adjacent in a newly built
   index.  (This consideration does not currently apply to non-B-tree
   indexes.)  It might be worthwhile to reindex periodically just to
   improve access speed.
  </para>
 </sect1>


 <sect1 id="logfile-maintenance">
  <title>Log File Maintenance</title>

  <indexterm zone="logfile-maintenance">
   <primary>server log</primary>
   <secondary>log file maintenance</secondary>
  </indexterm>

  <para>
   It is a good idea to save the database server's log output somewhere,
   rather than just routing it to <filename>/dev/null</>.  The log output is
   invaluable when it comes time to diagnose problems.  However, the log
   output tends to be voluminous (especially at higher debug levels) and you
   won't want to save it indefinitely.  You need to <quote>rotate</> the log
   files so that new log files are started and old ones removed after a
   reasonable period of time.
  </para>

  <para>
   If you simply direct the <systemitem>stderr</> of the
   <command>postmaster</command> into a file, you will have log output, but
   the only way to truncate the log file is to stop and restart the
   <command>postmaster</command>.  This may be OK if you are using
   <productname>PostgreSQL</productname> in a development environment, but
   few production servers would find this behavior acceptable.
  </para>

  <para>
   A better approach is to send the <command>postmaster</>'s
   <systemitem>stderr</> output to some type of log rotation program.  There
   is a built-in log rotation program, which you can use by setting the
   configuration parameter <literal>redirect_stderr</> to <literal>true</>
   in <filename>postgresql.conf</>.  The control parameters for this program
   are described in <xref linkend="runtime-config-logging-where">.
  </para>
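  <para>
   For example, a minimal setup that starts a new log file once a day, or
   sooner if the current file grows large, might look like the following
   sketch (the parameter names and units shown here are illustrative; check
   them against <xref linkend="runtime-config-logging-where"> before relying
   on them):
<programlisting>
redirect_stderr = true
log_directory = 'pg_log'          # relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1440           # minutes
log_rotation_size = 10240         # kilobytes
</programlisting>
  </para>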
  <para>
   Alternatively, you might prefer to use an external log rotation program,
   if you have one that you are already using with other server software.
   For example, the <application>rotatelogs</application> tool included in
   the <productname>Apache</productname> distribution can be used with
   <productname>PostgreSQL</productname>.  To do this, just pipe the
   <command>postmaster</>'s <systemitem>stderr</> output to the desired
   program.  If you start the server with <command>pg_ctl</>, then
   <systemitem>stderr</> is already redirected to <systemitem>stdout</>, so
   you just need a pipe command, for example:
<programlisting>
pg_ctl start | rotatelogs /var/log/pgsql_log 86400
</programlisting>
  </para>

  <para>
   Another production-grade approach to managing log output is to send it
   all to <application>syslog</> and let <application>syslog</> deal with
   file rotation.  To do this, set the configuration parameter
   <literal>log_destination</> to <literal>syslog</> (to log to
   <application>syslog</> only) in <filename>postgresql.conf</>.  Then you
   can send a <literal>SIGHUP</literal> signal to the <application>syslog</>
   daemon whenever you want to force it to start writing a new log file.
   If you want to automate log rotation, the
   <application>logrotate</application> program can be configured to work
   with log files from <application>syslog</application>.
  </para>

  <para>
   On many systems, however, <application>syslog</> is not very reliable,
   particularly with large log messages; it may truncate or drop messages
   just when you need them the most.  Also, on <productname>Linux</>,
   <application>syslog</> will sync each message to disk, yielding poor
   performance.  (You can use a <literal>-</> at the start of the file name
   in the <application>syslog</> configuration file to disable this
   behavior.)
  </para>

  <para>
   Note that all the solutions described above take care of starting new
   log files at configurable intervals, but they do not handle deletion of
   old, no-longer-interesting log files.  You will probably want to set up
   a batch job to periodically delete old log files.  Another possibility
   is to configure the rotation program so that old log files are
   overwritten cyclically.
  </para>
 </sect1>
</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->