📄 tempfiles.html
字号:
<p>Queries that contain subqueries must sometime evaluatethe subqueries separately and store the results in a temporarytable, then use the content of the temporary table to evaluatethe outer query.We call this "materializing" the subquery.The query optimizer in SQLite attempts to avoid materializing,but sometimes it is not easily avoidable.The temporary tables created by materialization are each storedin their own separate temporary file, which is automaticallydeleted at the conclusion of the query.The size of these temporary tables depends on the amount ofdata in the materialization of the subquery, of course.</p><p>A subquery on the right-hand side of IN operator must oftenbe materialized. For example:</p><blockquote><pre>SELECT * FROM ex1 WHERE ex1.a IN (SELECT b FROM ex2);</pre></blockquote><p>In the query above, the subquery "SELECT b FROM ex2" is evaluatedand its results are stored in a temporary table (actually a temporaryindex) that allows one to determine whether or not a value ex2.bexists using a simple binary search. Once this table is constructed,the outer query is run and for each prospective result row a checkis made to see if ex1.a is contained within the temporary table.The row is output only if the check is true.</p><p>To avoid creating the temporary table, the query might be rewrittenas follows:</p><blockquote><pre>SELECT * FROM ex1 WHERE EXISTS(SELECT 1 FROM ex2 WHERE ex2.b=ex1.a);</pre></blockquote><p>Recent versions of SQLite (version 3.5.4 and later)will do this rewrite automaticallyif an index exists on the column ex2.b.</p><p>If the right-hand side of an IN operator can be list of valuesas in the following:</p><blockquote><pre>SELECT * FROM ex1 WHERE a IN (1,2,3);</pre></blockquote><p>List values on the right-hand side of IN are treated as a subquery that must be materialized. In other words, theprevious statement acts as if it were:</p><blockquote><pre>SELECT * FROM ex1 WHERE a IN (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3);</pre></blockquote><p>A temporary index is always used to hold the values of theright-hand side of an IN operator when that right-hand sideis a list of values.</p><p>Subqueries might also need to be materialized when they appearin the FROM clause of a SELECT statement. For example:</p><blockquote><pre>SELECT * FROM ex1 JOIN (SELECT b FROM ex2) AS t ON t.b=ex1.a;</pre></blockquote><p>Depending on the query, SQLite might need to materialize the "(SELECT b FROM ex2)" subquery into a temporary table, thenperform the join between ex1 and the temporary table. Thequery optimizer tries to avoid this by "flattening" thequery. In the previous example the query can be flattened,and SQLite will automatically transform the query into</p><blockquote><pre>SELECT ex1.*, ex2.b FROM ex1 JOIN ex2 ON ex2.b=ex1.a;</blockquote></pre><p>More complex queries may or may not be able to employ queryflattening to avoid the temporary table. Whether or notthe query can be flattened depends on such factors as whetheror not the subquery or outer query contain aggregate functions,ORDER BY or GROUP BY clauses, LIMIT clauses, and so forth.The rules for when a query and an cannot be flattened arevery complex and are beyond the scope of this document.</p><h3>2.6 Transient Indices</h3><p>SQLite may make use of transient indices toimplement SQL language features such as:</p><ul><li>An ORDER BY or GROUP BY clause</li><li>The DISTINCT keyword in an aggregate query</li><li>Compound SELECT statements joined by UNION, EXCEPT, or INTERSECT</li></ul><p>Each transient index is stored in its own temporary file.The temporary file for a transient index is automatically deletedat the end of the statement that uses it.</p><p>SQLite strives to implement ORDER BY clauses using a preexistingindex. If an appropriate index already exists, SQLite will walkthe index, rather than the underlying table, to extract the requested information, and thus cause the rows to come out inthe desired order. But if SQLite cannot find an appropriate indexit will evaluate the query and store each row in a transient indexwhose data is the row data and whose key is the ORDER BY terms.After the query is evaluated, SQLite goes back and walks thetransient index from beginning to end in order to output therows in the desired order.</p><p>SQLite implements GROUP BY by ordering the output rows in theorder suggested by the GROUP BY terms. Each output row iscompared to the previous to see if it starts a new "group".The ordering by GROUP BY terms is done in exactly the same wayas the ordering by ORDER BY terms. A preexisting index is usedif possible, but if no suitable index is available, a transientindex is created.</p><p>The previous two paragraphs describe the implementation of SQLiteas of version 3.5.8. There are known problems with this approachfor very large results sets - result sets that are larger than theavailable disk cache. Future versions of SQLite will likely addressthis deficiency by completely reworking the sort algorithm for cases when no suitable preexisting sort index is available. Thenew sort algorithm will also use temporary files, but not in thesame way as the current implementation, the the temporary filesfor the new implementation will probably not be index files.</p><p>The DISTINCT keyword on an aggregate query is implemented bycreating an transient index in a temporary file and storingeach result row in that index. As new result rows are computeda check is made to see if they already exist in the transientindex and if they do the new result row is discarded.</p><p>The UNION operator for compound queries is implemented by creatinga transient index in a temporary file and storing the resultsof the left and right subquery in the transient index, discardingduplicates. After both subqueries have been evaluated, thetransient index is walked from beginning to end to generate the final output.</p><p>The EXCEPT operator for compound queries is implemented by creatinga transient index in a temporary file, storing the results of theleft subquery in this transient index, then removing the result from right subquery from the transient index, and finally walkingthe index from beginning to end to obtain the final output.</p><p>The INTERSECT operator for compound queries is implemented bycreating two separate transient indices, each in a separatetemporary file. The left and right subqueries are evaluatedeach into a separate transient index. Then the two indicesare walked together and entries that appear in both indicesare output.</p><p>Note that the UNION ALL operator for compound queries does notuse transient indices by itself (though of course the rightand left subqueries of the UNION ALL might use transient indicesdepending on how they are composed.)<h3>2.7 Transient Database Used By <a href="lang_vacuum.html">VACUUM</a></h3><p>The <a href="lang_vacuum.html">VACUUM</a> command works by creating a temporary fileand then rebuilding the entire database into that temporaryfile. Then the content of the temporary file is copied backinto the original database file and the temporary file isdeleted.</p><p>The temporary file created by the <a href="lang_vacuum.html">VACUUM</a> command exists onlyfor the duration of the command itself. The size of the temporaryfile will be no larger than the original database.</p><h2>3.0 The TEMP_STORE Compile-Time Parameter and Pragma</h2><p>The rollback journal, master journal,and statement journal files are always writtento disk.But the other kinds of temporary files might be stored in memoryonly and never written to disk.Whether or not temporary files other than the rollback,master, and statement journals are written to disk or stored only in memorydepends on the TEMP_STORE compile-time parameter, the<a href="pragma.html#pragma_temp_store">temp_store</a> <a href="pragma.html#syntax">PRAGMA</a>,and on the size of the temporary file.</p><p>The TEMP_STORE compile-time parameter is a #define whose value isan integer between 0 and 3, inclusive. The meaning of theTEMP_STORE compile-time parameter is as follows:</p><ol type="1"><li value="0">Temporary files are always stored on disk regardless of the settingof the <a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</li><li value="1">Temporary files are stored on disk by default but this can beoverridden by the <a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</li><li value="2">Temporary files are stored in memory by default but this can beoverridden by the <a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</li><li value="3">Temporary files are always stored in memory regardless of the settingof the <a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</li></ol><p>The default value of the TEMP_STORE compile-time parameter is 1,which means to store temporary files on disk but provide the optionof overriding the behavior using the<a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</p><p>The <a href="pragma.html#pragma_temp_store">temp_store pragma</a> hasan integer value which also influences the decision of where to storetemporary files. The values of the the temp_store pragma have thefollowing meanings:</p><ol type="1"><li value="0">Use either disk or memory storage for temporary files as determinedby the TEMP_STORE compile-time parameter.</li><li value="1">If the TEMP_STORE compile-time parameter specifies memory storage fortemporary files, then override that decision and use disk storage instead.Otherwise follow the recommendation of the TEMP_STORE compile-timeparameter.</li><li value="2">If the TEMP_STORE compile-time parameter specifies disk storage fortemporary files, then override that decision and use memory storage instead.Otherwise follow the recommendation of the TEMP_STORE compile-timeparameter.</li></ol><p>The default setting for the <a href="pragma.html#pragma_temp_store">temp_store pragma</a> is 0,which means to following the recommendation of TEMP_STORE compile-timeparameter.</p><p>To reiterate, the TEMP_STORE compile-time parameter an the <a href="pragma.html#pragma_temp_store">temp_store pragma</a> onlyinfluence the temporary files other than the rollback journaland the master journal. The rollback journal and the masterjournal are always written to disk regardless of the settings ofthe TEMP_STORE compile-time parameter and the<a href="pragma.html#pragma_temp_store">temp_store pragma</a>.</p><h2>4.0 Other Temporary File Optimizations</h2><p>SQLite uses a page cache of recently read and written databasepages. This page cache is used not just for the main databasefile but also for transient indices and tables stored in temporaryfiles. If SQLite needs to use a temporary index or table andthe TEMP_STORE compile-time parameter and the<a href="pragma.html#pragma_temp_store">temp_store pragma</a> areset to store temporary tables and index on disk, the informationis still initially stored in memory in the page cache. The temporary file is not opened and the information is not trulywritten to disk until the page cache is full.</p><p>This means that for many common cases where the temporary tablesand indices are small (small enough to fit into the page cache)no temporary files are created and no disk I/O occurs. Onlywhen the temporary data becomes too large to fit in RAM doesthe information spill to disk.</p><p>Each temporary table and index is given its own page cachewhich can store a maximum number of database pages determinedby the SQLITE_DEFAULT_TEMP_CACHE_SIZE compile-time parameter.(The default value is 500 pages.)The maximum number of database pages in the page cache is thesame for every temporary table and index. The value cannotbe changed at run-time or on a per-table or per-index basis.Each temporary file gets its own private page cache with itsown SQLITE_DEFAULT_TEMP_CACHE_SIZE page limit.</p><hr><small><i>This page last modified 2008/04/26 11:22:07 UTC</i></small></div></body></html>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -