cluster.7

来自「PostgreSQL 8.2中增加了很多企业用户所需要的功能和性能上的提高,其开」· 7 代码 · 共 149 行

7
149
字号
.\\" auto-generated by docbook2man-spec $Revision: 1.1.1.1 $.TH "CLUSTER" "" "2008-01-03" "SQL - Language Statements" "SQL Commands".SH NAMECLUSTER \- cluster a table according to an index.SH SYNOPSIS.sp.nfCLUSTER \fIindexname\fR ON \fItablename\fRCLUSTER \fItablename\fRCLUSTER.sp.fi.SH "DESCRIPTION".PP\fBCLUSTER\fR instructs PostgreSQL to cluster the table specifiedby \fItablename\fRbased on the index specified by\fIindexname\fR. The index mustalready have been defined on \fItablename\fR..PPWhen a table is clustered, it is physically reorderedbased on the index information. Clustering is a one-time operation:when the table is subsequently updated, the changes arenot clustered. That is, no attempt is made to store new orupdated rows according to their index order. If one wishes, one canperiodically recluster by issuing the command again..PPWhen a table is clustered, PostgreSQLremembers on which index it was clustered. The form\fBCLUSTER \fItablename\fB\fRreclusters the table on the same index that it was clustered before..PP\fBCLUSTER\fR without any parameter reclusters all the tablesin thecurrent database that the calling user owns, or all tables if calledby a superuser. (Never-clustered tables are not included.) Thisform of \fBCLUSTER\fR cannot be executed inside a transactionblock..PPWhen a table is being clustered, an ACCESSEXCLUSIVE lock is acquired on it. This prevents any otherdatabase operations (both reads and writes) from operating on thetable until the \fBCLUSTER\fR is finished..SH "PARAMETERS".TP\fB\fIindexname\fB\fRThe name of an index..TP\fB\fItablename\fB\fRThe name (possibly schema-qualified) of a table..SH "NOTES".PP\fBCLUSTER\fR loses all visibility information of tuples,which makes the table look empty to any snapshot that was takenbefore the \fBCLUSTER\fR command finished. That makes\fBCLUSTER\fR unsuitable for applications wheretransactions that access the table being clustered are run concurrentlywith \fBCLUSTER\fR. This is most visible with serializabletransactions, because they take only one snapshot at the beginning of thetransaction, but read-committed transactions are also affected..PPIn cases where you are accessing single rows randomlywithin a table, the actual order of the data in thetable is unimportant. However, if you tend to access somedata more than others, and there is an index that groupsthem together, you will benefit from using \fBCLUSTER\fR.If you are requesting a range of indexed values from a table, or asingle indexed value that has multiple rows that match,\fBCLUSTER\fR will help because once the index identifies thetable page for the first row that matches, all other rowsthat match are probably already on the same table page,and so you save disk accesses and speed up the query..PPDuring the cluster operation, a temporary copy of the table is createdthat contains the table data in the index order. Temporary copies ofeach index on the table are created as well. Therefore, you need freespace on disk at least equal to the sum of the table size and the indexsizes..PPBecause \fBCLUSTER\fR remembers the clustering information,one can cluster the tables one wants clustered manually the first time, andsetup a timed event similar to \fBVACUUM\fR so that the tablesare periodically reclustered..PPBecause the planner records statistics about the ordering oftables, it is advisable to run ANALYZE [\fBanalyze\fR(7)] on the newly clustered table.Otherwise, the planner may make poor choices of query plans..PPThere is another way to cluster data. The\fBCLUSTER\fR command reorders the original table byscanning it using the index you specify. This can be slowon large tables because the rows are fetched from the tablein index order, and if the table is disordered, theentries are on random pages, so there is one disk pageretrieved for every row moved. (PostgreSQL hasa cache, but the majority of a big table will not fit in the cache.)The other way to cluster a table is to use.sp.nfCREATE TABLE \fInewtable\fR AS    SELECT * FROM \fItable\fR ORDER BY \fIcolumnlist\fR;.sp.fiwhich uses the PostgreSQL sorting codeto produce the desired order;this is usually much faster than an index scan for disordered data.Then you drop the old table, use\fBALTER TABLE ... RENAME\fRto rename \fInewtable\fR to theold name, and recreate the table's indexes.The big disadvantage of this approach is that it does not preserveOIDs, constraints, foreign key relationships, granted privileges, andother ancillary properties of the table \(em all such items must bemanually recreated. Another disadvantage is that this way requires a sorttemporary file about the same size as the table itself, so peak disk usageis about three times the table size instead of twice the table size..SH "EXAMPLES".PPCluster the table employees on the basis ofits index emp_ind:.sp.nfCLUSTER emp_ind ON emp;.sp.fi.PPCluster the employees table using the sameindex that was used before:.sp.nfCLUSTER emp;.sp.fi.PPCluster all tables in the database that have previously been clustered:.sp.nfCLUSTER;.sp.fi.SH "COMPATIBILITY".PPThere is no \fBCLUSTER\fR statement in the SQL standard..SH "SEE ALSO"clusterdb [\fBclusterdb\fR(1)]

⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?