📄 mvcc.sgml
字号:
<!--$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.52 2005/10/21 01:41:28 tgl Exp $--> <chapter id="mvcc"> <title>Concurrency Control</title> <indexterm> <primary>concurrency</primary> </indexterm> <para> This chapter describes the behavior of the <productname>PostgreSQL</productname> database system when two or more sessions try to access the same data at the same time. The goals in that situation are to allow efficient access for all sessions while maintaining strict data integrity. Every developer of database applications should be familiar with the topics covered in this chapter. </para> <sect1 id="mvcc-intro"> <title>Introduction</title> <indexterm> <primary>MVCC</primary> </indexterm> <para> Unlike traditional database systems which use locks for concurrency control, <productname>PostgreSQL</productname> maintains data consistency by using a multiversion model (Multiversion Concurrency Control, <acronym>MVCC</acronym>). This means that while querying a database each transaction sees a snapshot of data (a <firstterm>database version</firstterm>) as it was some time ago, regardless of the current state of the underlying data. This protects the transaction from viewing inconsistent data that could be caused by (other) concurrent transaction updates on the same data rows, providing <firstterm>transaction isolation</firstterm> for each database session. </para> <para> The main advantage to using the <acronym>MVCC</acronym> model of concurrency control rather than locking is that in <acronym>MVCC</acronym> locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading. </para> <para> Table- and row-level locking facilities are also available in <productname>PostgreSQL</productname> for applications that cannot adapt easily to <acronym>MVCC</acronym> behavior. However, proper use of <acronym>MVCC</acronym> will generally provide better performance than locks. </para> </sect1> <sect1 id="transaction-iso"> <title>Transaction Isolation</title> <indexterm> <primary>transaction isolation</primary> </indexterm> <para> The <acronym>SQL</acronym> standard defines four levels of transaction isolation in terms of three phenomena that must be prevented between concurrent transactions. These undesirable phenomena are: <variablelist> <varlistentry> <term> dirty read <indexterm><primary>dirty read</primary></indexterm> </term> <listitem> <para> A transaction reads data written by a concurrent uncommitted transaction. </para> </listitem> </varlistentry> <varlistentry> <term> nonrepeatable read <indexterm><primary>nonrepeatable read</primary></indexterm> </term> <listitem> <para> A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read). </para> </listitem> </varlistentry> <varlistentry> <term> phantom read <indexterm><primary>phantom read</primary></indexterm> </term> <listitem> <para> A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction. </para> </listitem> </varlistentry> </variablelist> </para> <para> <indexterm> <primary>transaction isolation level</primary> </indexterm> The four transaction isolation levels and the corresponding behaviors are described in <xref linkend="mvcc-isolevel-table">. </para> <table tocentry="1" id="mvcc-isolevel-table"> <title><acronym>SQL</acronym> Transaction Isolation Levels</title> <tgroup cols="4"> <thead> <row> <entry> Isolation Level </entry> <entry> Dirty Read </entry> <entry> Nonrepeatable Read </entry> <entry> Phantom Read </entry> </row> </thead> <tbody> <row> <entry> Read uncommitted </entry> <entry> Possible </entry> <entry> Possible </entry> <entry> Possible </entry> </row> <row> <entry> Read committed </entry> <entry> Not possible </entry> <entry> Possible </entry> <entry> Possible </entry> </row> <row> <entry> Repeatable read </entry> <entry> Not possible </entry> <entry> Not possible </entry> <entry> Possible </entry> </row> <row> <entry> Serializable </entry> <entry> Not possible </entry> <entry> Not possible </entry> <entry> Not possible </entry> </row> </tbody> </tgroup> </table> <para> In <productname>PostgreSQL</productname>, you can request any of the four standard transaction isolation levels. But internally, there are only two distinct isolation levels, which correspond to the levels Read Committed and Serializable. When you select the level Read Uncommitted you really get Read Committed, and when you select Repeatable Read you really get Serializable, so the actual isolation level may be stricter than what you select. This is permitted by the SQL standard: the four isolation levels only define which phenomena must not happen, they do not define which phenomena must happen. The reason that <productname>PostgreSQL</> only provides two isolation levels is that this is the only sensible way to map the standard isolation levels to the multiversion concurrency control architecture. The behavior of the available isolation levels is detailed in the following subsections. </para> <para> To set the transaction isolation level of a transaction, use the command <xref linkend="sql-set-transaction" endterm="sql-set-transaction-title">. </para> <sect2 id="xact-read-committed"> <title>Read Committed Isolation Level</title> <indexterm> <primary>transaction isolation level</primary> <secondary>read committed</secondary> </indexterm> <para> <firstterm>Read Committed</firstterm> is the default isolation level in <productname>PostgreSQL</productname>. When a transaction runs on this isolation level, a <command>SELECT</command> query sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. (However, the <command>SELECT</command> does see the effects of previous updates executed within its own transaction, even though they are not yet committed.) In effect, a <command>SELECT</command> query sees a snapshot of the database as of the instant that that query begins to run. Notice that two successive <command>SELECT</command> commands can see different data, even though they are within a single transaction, if other transactions commit changes during execution of the first <command>SELECT</command>. </para> <para> <command>UPDATE</command>, <command>DELETE</command>, <command>SELECT FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands behave the same as <command>SELECT</command> in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row may have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the <literal>WHERE</> clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation, starting from the updated version of the row. (In the case of <command>SELECT FOR UPDATE</command> and <command>SELECT FOR SHARE</command>, that means it is the updated version of the row that is locked and returned to the client.) </para> <para> Because of the above rule, it is possible for an updating command to see an inconsistent snapshot: it can see the effects of concurrent updating commands that affected the same rows it is trying to update, but it does not see effects of those commands on other rows in the database. This behavior makes Read Committed mode unsuitable for commands that involve complex search conditions. However, it is just right for simpler cases. For example, consider updating bank balances with transactions like<screen>BEGIN;UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;COMMIT;</screen> If two such transactions concurrently try to change the balance of account 12345, we clearly want the second transaction to start from the updated version of the account's row. Because each command is affecting only a predetermined row, letting it see the updated version of the row does not create any troublesome inconsistency. </para> <para> Since in Read Committed mode each new command starts with a new snapshot that includes all transactions committed up to that instant, subsequent commands in the same transaction will see the effects of the committed concurrent transaction in any case. The point at issue here is whether or not within a <emphasis>single</> command we see an absolutely consistent view of the database. </para> <para> The partial transaction isolation provided by Read Committed mode is adequate for many applications, and this mode is fast and simple to use. However, for applications that do complex queries and updates, it may be necessary to guarantee a more rigorously consistent view of the database than the Read Committed mode provides. </para> </sect2> <sect2 id="xact-serializable"> <title>Serializable Isolation Level</title> <indexterm> <primary>transaction isolation level</primary> <secondary>serializable</secondary> </indexterm> <para> The level <firstterm>Serializable</firstterm> provides the strictest transaction isolation. This level emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. However, applications using this level must be prepared to retry transactions due to serialization failures. </para> <para> When a transaction is on the serializable level, a <command>SELECT</command> query sees only data committed before the transaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrent transactions. (However, the <command>SELECT</command> does see the effects of previous updates executed within its own transaction, even though they are not yet committed.) This is different from Read Committed in that the <command>SELECT</command> sees a snapshot as of the start of the transaction, not as of the start
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -