⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 maintenance.sgml

📁 PostgreSQL7.4.6 for Linux
💻 SGML
📖 第 1 页 / 共 2 页
字号:
<!--$Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.27.2.1 2003/11/04 09:45:29 petere Exp $--><chapter id="maintenance"> <title>Routine Database Maintenance Tasks</title> <indexterm zone="maintenance">  <primary>maintenance</primary> </indexterm>  <para>   There are a few routine maintenance chores that must be performed on   a regular basis to keep a <productname>PostgreSQL</productname>   server running smoothly.  The tasks discussed here are repetitive   in nature and can easily be automated using standard Unix tools such   as <application>cron</application> scripts.  But it is the database   administrator's responsibility to set up appropriate scripts, and to   check that they execute successfully.  </para>  <para>   One obvious maintenance task is creation of backup copies of the data on a   regular schedule.  Without a recent backup, you have no chance of recovery   after a catastrophe (disk failure, fire, mistakenly dropping a critical   table, etc.).  The backup and recovery mechanisms available in   <productname>PostgreSQL</productname> are discussed at length in   <xref linkend="backup">.  </para>  <para>   The other main category of maintenance task is periodic <quote>vacuuming</>   of the database.  This activity is discussed in   <xref linkend="routine-vacuuming">.  </para>  <para>   Something else that might need periodic attention is log file management.   This is discussed in <xref linkend="logfile-maintenance">.  </para>  <para>   <productname>PostgreSQL</productname> is low-maintenance compared   to some other database management systems.  Nonetheless,   appropriate attention to these tasks will go far towards ensuring a   pleasant and productive experience with the system.  </para> <sect1 id="routine-vacuuming">  <title>Routine Vacuuming</title>  <indexterm zone="routine-vacuuming">   <primary>vacuum</primary>  </indexterm>  <para>   <productname>PostgreSQL</productname>'s <command>VACUUM</> command   must be run on a regular basis for several reasons:    <orderedlist>     <listitem>      <simpara>To recover disk space occupied by updated or deleted      rows.</simpara>     </listitem>     <listitem>      <simpara>To update data statistics used by the      <productname>PostgreSQL</productname> query planner.</simpara>     </listitem>     <listitem>      <simpara>To protect against loss of very old data due to      <firstterm>transaction ID wraparound</>.</simpara>     </listitem>    </orderedlist>   The frequency and scope of the <command>VACUUM</> operations performed for each of   these reasons will vary depending on the needs of each site.   Therefore, database administrators must understand these issues and   develop an appropriate maintenance strategy.  This section concentrates   on explaining the high-level issues; for details about command syntax   and so on, see the <command>VACUUM</> command reference page.  </para>  <para>   Beginning in <productname>PostgreSQL</productname> 7.2, the standard form   of <command>VACUUM</> can run in parallel with normal database operations   (selects, inserts, updates, deletes, but not changes to table definitions).   Routine vacuuming is therefore not nearly as intrusive as it was in prior   releases, and it's not as critical to try to schedule it at low-usage   times of day.  </para>  <sect2 id="vacuum-for-space-recovery">   <title>Recovering disk space</title>   <indexterm zone="vacuum-for-space-recovery">    <primary>disk space</primary>   </indexterm>   <para>    In normal <productname>PostgreSQL</productname> operation, an    <command>UPDATE</> or <command>DELETE</> of a row does not    immediately remove the old version of the row.    This approach is necessary to gain the benefits of multiversion    concurrency control (see <xref linkend="mvcc">): the row version    must not be deleted while it is still potentially visible to other    transactions. But eventually, an outdated or deleted row version is no    longer of interest to any transaction. The space it occupies must be    reclaimed for reuse by new rows, to avoid infinite growth of disk    space requirements. This is done by running <command>VACUUM</>.   </para>   <para>    Clearly, a table that receives frequent updates or deletes will need    to be vacuumed more often than tables that are seldom updated. It    may be useful to set up periodic <application>cron</> tasks that    vacuum only selected tables, skipping tables that are known not to    change often. This is only likely to be helpful if you have both    large heavily-updated tables and large seldom-updated tables --- the    extra cost of vacuuming a small table isn't enough to be worth    worrying about.   </para>   <para>    The standard form of <command>VACUUM</> is best used with the goal of    maintaining a fairly level steady-state usage of disk space.  The standard    form finds old row versions and makes their space available for re-use within    the table, but it does not try very hard to shorten the table file and    return disk space to the operating system.  If you need to return disk    space to the operating system you can use <command>VACUUM FULL</> ---    but what's the point of releasing disk space that will only have to be    allocated again soon?  Moderately frequent standard <command>VACUUM</> runs    are a better approach than infrequent <command>VACUUM FULL</> runs for    maintaining heavily-updated tables.   </para>   <para>    Recommended practice for most sites is to schedule a database-wide    <command>VACUUM</> once a day at a low-usage time of day, supplemented    by more frequent vacuuming of heavily-updated tables if necessary.    (If you have multiple databases in a cluster, don't forget to    vacuum each one; the program <filename>vacuumdb</> may be helpful.)    Use plain <command>VACUUM</>, not <command>VACUUM FULL</>, for routine    vacuuming for space recovery.   </para>   <para>    <command>VACUUM FULL</> is recommended for cases where you know you have    deleted the majority of rows in a table, so that the steady-state size    of the table can be shrunk substantially with <command>VACUUM FULL</>'s    more aggressive approach.   </para>   <para>    If you have a table whose contents are deleted completely every so often,    consider doing it with <command>TRUNCATE</> rather than using    <command>DELETE</> followed by <command>VACUUM</>.   </para>  </sect2>  <sect2 id="vacuum-for-statistics">   <title>Updating planner statistics</title>   <indexterm zone="vacuum-for-statistics">    <primary>statistics</primary>    <secondary>of the planner</secondary>   </indexterm>   <indexterm zone="vacuum-for-statistics">    <primary>ANALYZE</primary>   </indexterm>   <para>    The <productname>PostgreSQL</productname> query planner relies on    statistical information about the contents of tables in order to    generate good plans for queries.  These statistics are gathered by    the <command>ANALYZE</> command, which can be invoked by itself or    as an optional step in <command>VACUUM</>.  It is important to have    reasonably accurate statistics, otherwise poor choices of plans may    degrade database performance.   </para>   <para>    As with vacuuming for space recovery, frequent updates of statistics    are more useful for heavily-updated tables than for seldom-updated    ones. But even for a heavily-updated table, there may be no need for    statistics updates if the statistical distribution of the data is    not changing much. A simple rule of thumb is to think about how much    the minimum and maximum values of the columns in the table change.    For example, a <type>timestamp</type> column that contains the time    of row update will have a constantly-increasing maximum value as    rows are added and updated; such a column will probably need more    frequent statistics updates than, say, a column containing URLs for    pages accessed on a website. The URL column may receive changes just    as often, but the statistical distribution of its values probably    changes relatively slowly.   </para>   <para>    It is possible to run <command>ANALYZE</> on specific tables and even    just specific columns of a table, so the flexibility exists to update some    statistics more frequently than others if your application requires it.    In practice, however, the usefulness of this feature is doubtful.    Beginning in <productname>PostgreSQL</productname> 7.2,    <command>ANALYZE</> is a fairly fast operation even on large tables,    because it uses a statistical random sampling of the rows of a table    rather than reading every single row.  So it's probably much simpler    to just run it over the whole database every so often.   </para>   <tip>    <para>     Although per-column tweaking of <command>ANALYZE</> frequency may not be     very productive, you may well find it worthwhile to do per-column     adjustment of the level of detail of the statistics collected by     <command>ANALYZE</>.  Columns that are heavily used in <literal>WHERE</> clauses     and have highly irregular data distributions may require a finer-grain     data histogram than other columns.  See <command>ALTER TABLE SET     STATISTICS</>.    </para>   </tip>   <para>    Recommended practice for most sites is to schedule a database-wide    <command>ANALYZE</> once a day at a low-usage time of day; this can    usefully be combined with a nightly <command>VACUUM</>.  However,    sites with relatively slowly changing table statistics may find that    this is overkill, and that less-frequent <command>ANALYZE</> runs    are sufficient.   </para>  </sect2>  <sect2 id="vacuum-for-wraparound">   <title>Preventing transaction ID wraparound failures</title>   <indexterm zone="vacuum-for-wraparound">    <primary>transaction ID</primary>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -