<TD ALIGN="LEFT" VALIGN="TOP">10</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT" VALIGN="TOP"><TT>status_cd = 'A'</TT></TD>
<TD ALIGN="LEFT" VALIGN="TOP">4</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT" VALIGN="TOP"><TT>function_cd = '060'</TT></TD>
<TD VALIGN="TOP">6</TD>
</TR>
</TABLE>
<BLOCKQUOTE>
<P>
<HR>
<FONT COLOR="#000077"><B>NOTE:</B></FONT><B> </B>The most restrictive condition is
also the condition with the most distinct values.
<HR>
</BLOCKQUOTE>
<P>The next example places the most restrictive conditions first in the <TT>WHERE</TT>
clause:
<H5>INPUT:</H5>
<PRE><FONT COLOR="#0066FF">SQL> <B>SET TIMING ON</B>
SQL> <B>SELECT COUNT(*)</B>
2 <B>FROM FACT_TABLE</B>
3 <B>WHERE CALC_YTD = '-2109490.8'</B>
4 <B>  AND DT_STMP = '01-SEP-96'</B>
5 <B>  AND OUTPUT_CD = '001'</B>
6 <B>  AND ACTIVITY_CD = 'IN'</B>
7 <B>  AND STATUS_CD = 'A'</B>
8 <B>  AND FUNCTION_CD = '060';</B></FONT></PRE>
<H5>OUTPUT:</H5>
<PRE><FONT COLOR="#0066FF">COUNT(*)
--------
8
1 row selected.
Elapsed: 00:00:15.37
</FONT></PRE>
<P>This example places the most restrictive conditions last in the <TT>WHERE</TT>
clause:
<H5>INPUT/OUTPUT:</H5>
<PRE><FONT COLOR="#0066FF">SQL> <B>SET TIMING ON</B>
SQL> <B>SELECT COUNT(*)</B>
2 <B>FROM FACT_TABLE</B>
3 <B>WHERE FUNCTION_CD = '060'</B>
4 <B>  AND STATUS_CD = 'A'</B>
5 <B>  AND ACTIVITY_CD = 'IN'</B>
6 <B>  AND OUTPUT_CD = '001'</B>
7 <B>  AND DT_STMP = '01-SEP-96'</B>
8 <B>  AND CALC_YTD = '-2109490.8';</B>
COUNT(*)
--------
8
1 row selected.
Elapsed: 00:00:01.80
</FONT></PRE>
<H5>ANALYSIS:</H5>
<P>Notice the difference in elapsed time. Simply by reordering the conditions
according to the table statistics shown earlier, the second query ran almost 14 seconds
faster than the first. Imagine the difference this could make in a poorly structured
query that runs for three hours!
<H3><FONT COLOR="#000077"><B>Procedures</B></FONT></H3>
<P>For queries that are executed on a regular basis, try to use procedures. A <I>procedure</I>
is a potentially large group of SQL statements. (Refer to Day 13, "Advanced
SQL Topics.")</P>
<P>Procedures are compiled by the database engine and then executed. Unlike an individual
SQL statement, a procedure does not have to be parsed and optimized each time it is
executed. Procedures, as opposed to numerous individual queries, may also be easier
for the user to maintain and more efficient for the database.</P>
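<P>Here is a minimal sketch of such a procedure, written as an Oracle PL/SQL script.
The procedure name is hypothetical; the table and column come from this chapter's
examples, and the exact procedural language and syntax vary among implementations.
<PRE><FONT COLOR="#0066FF">-- Compiled once when created; later calls reuse the compiled form.
CREATE OR REPLACE PROCEDURE COUNT_ACTIVE_ROWS AS
  v_count NUMBER;
BEGIN
  -- Count the FACT_TABLE rows with STATUS_CD = 'A' and display the result.
  SELECT COUNT(*)
    INTO v_count
    FROM FACT_TABLE
   WHERE STATUS_CD = 'A';
  DBMS_OUTPUT.PUT_LINE('Active rows: ' || v_count);
END;
/

-- From SQL*Plus, with SET SERVEROUTPUT ON, the procedure is run with:
EXECUTE COUNT_ACTIVE_ROWS
</FONT></PRE>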
<H3><FONT COLOR="#000077"><B>Avoiding </B><TT>OR</TT></FONT></H3>
<P>Avoid using the logical operator <TT>OR</TT> in a query if possible. <TT>OR</TT>
tends to slow down nearly any query against a table of substantial size. In our experience,
<TT>IN</TT> is generally much quicker than <TT>OR</TT>, even though vendor documentation
often states that the optimizer simply converts <TT>IN</TT> arguments into <TT>OR</TT>
conditions. Here is an example of a query using multiple <TT>OR</TT>s:
<H5>INPUT:</H5>
<PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B>
2 <B>FROM FACT_TABLE</B>
3 <B> WHERE STATUS_CD = 'A'</B>
4 <B> OR STATUS_CD = 'B'</B>
5 <B> OR STATUS_CD = 'C'</B>
6 <B> OR STATUS_CD = 'D'</B>
7 <B> OR STATUS_CD = 'E'</B>
8 <B> OR STATUS_CD = 'F'</B>
9 <B>ORDER BY STATUS_CD;</B>
</FONT></PRE>
<P>Here is the same query using <TT>IN</TT>:
<H5>INPUT:</H5>
<PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B>
2 <B>FROM FACT_TABLE</B>
3 <B>WHERE STATUS_CD IN ('A','B','C','D','E','F')</B>
4 <B> ORDER BY STATUS_CD;</B>
</FONT></PRE>
<H5>ANALYSIS:</H5>
<P>Try testing something similar for yourself. Although books are excellent sources
for standards and direction, you will find it is often useful to come to your own
conclusions on certain things, such as performance.</P>
<P>Here is another example using <TT>SUBSTR</TT> and <TT>IN.</TT> Notice that the
first query combines <TT>LIKE</TT> with <TT>OR</TT>.</P>
<H5>INPUT:</H5>
<PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B>
2 <B>FROM FACT_TABLE</B>
3 <B>WHERE PROD_CD LIKE 'AB%'</B>
4 <B> OR PROD_CD LIKE 'AC%'</B>
5 <B> OR PROD_CD LIKE 'BB%'</B>
6 <B> OR PROD_CD LIKE 'BC%'</B>
7 <B> OR PROD_CD LIKE 'CC%'</B>
8 <B>ORDER BY PROD_CD;</B>
</FONT></PRE>
<P>Here is the same query rewritten with <TT>SUBSTR</TT> and <TT>IN</TT>:
<PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B>
2 <B>FROM FACT_TABLE</B>
3 <B>WHERE SUBSTR(PROD_CD,1,2) IN ('AB','AC','BB','BC','CC')</B>
4 <B>ORDER BY PROD_CD;</B>
</FONT></PRE>
<H5>ANALYSIS:</H5>
<P>The second example not only avoids the <TT>OR</TT> but also eliminates the combination
of the <TT>OR</TT> and <TT>LIKE</TT> operators. You may want to try this example
to see what the real-time performance difference is for your data.
<H2><FONT COLOR="#000077"><B>OLAP Versus OLTP</B></FONT></H2>
<P>When tuning a database, you must first determine what the database is being used
for. An online analytical processing (OLAP) database is a system whose function is
to provide query capabilities to the end user for statistical and general informational
purposes. The data retrieved in this type of environment is often used for statistical
reports that aid in the corporate decision-making process. These types of systems
are also referred to as decision support systems (DSS). An online transaction processing
(OLTP) database is a system whose main function is to provide an environment for
end-user input; it may also support queries against day-to-day information. OLTP
systems are used to manipulate information within the database on a daily basis.
Data warehouses and DSSs get their data from online transactional databases and sometimes
from other OLAP systems.
<H3><FONT COLOR="#000077"><B>OLTP Tuning</B></FONT></H3>
<P>A transactional database is a delicate system that is heavily accessed in the
form of transactions and queries against day-to-day information. However, an OLTP
system does not usually require a vast sort area, at least not to the extent required
in an OLAP environment. Most OLTP transactions are quick and do not involve
much sorting.</P>
<P>One of the biggest issues in a transactional database is rollback segments. The
number and size of rollback segments depend heavily on how many users are concurrently
accessing the database, as well as the amount of work in each transaction. The best
approach is to have several rollback segments in a transactional environment.</P>
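<P>As a rough illustration, additional rollback segments might be created in an Oracle
database that still uses manually managed rollback segments, roughly as follows; the
segment name, tablespace, and storage sizes here are hypothetical and should come from
your DBA.
<PRE><FONT COLOR="#0066FF">-- Create another rollback segment for concurrent OLTP users,
-- then bring it online so transactions can use it.
CREATE ROLLBACK SEGMENT rbs_users01
  TABLESPACE rbs
  STORAGE (INITIAL 512K NEXT 512K MINEXTENTS 20 MAXEXTENTS 100);

ALTER ROLLBACK SEGMENT rbs_users01 ONLINE;
</FONT></PRE>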
<P>Another concern in a transactional environment is the integrity of the <I>transaction
logs, </I>which are written to after each transaction. These logs exist for the sole
purpose of recovery. Therefore, each SQL implementation needs a way to back up the
logs for use in point-in-time recovery. SQL Server uses dump devices;
Oracle uses a database mode known as ARCHIVELOG. Transaction logs also involve
a performance consideration, because backing up the logs requires additional overhead.
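<P>In Oracle, for example, archiving of the redo (transaction) logs might be enabled
roughly as follows; this is only a sketch, and the exact steps, initialization parameters,
and archive destination depend on your version and configuration.
<PRE><FONT COLOR="#0066FF">-- Run as a privileged user; the database must be mounted
-- but not open in order to change the archiving mode.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
</FONT></PRE>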
<H3><FONT COLOR="#000077"><B>OLAP Tuning</B></FONT></H3>
<P>Tuning OLAP systems, such as a data warehouse or decision support system, is much
different from tuning a transaction database. Normally, more space is needed for
sorting.</P>
<P>Because the purpose of this type of system is to retrieve useful decision-making
data, you can expect many complex queries, which normally involve grouping and sorting
of data. Compared to a transactional database, OLAP systems typically take more space
for the sort area but less space for the rollback area.</P>
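<P>For instance, in older Oracle releases the sort memory available to a reporting
session could be raised with a session-level parameter, roughly as shown below; the
parameter and value are assumptions for illustration, and newer releases manage sort
memory automatically.
<PRE><FONT COLOR="#0066FF">-- Allow this session to sort up to about 10MB in memory
-- before spilling the sort to disk.
ALTER SESSION SET SORT_AREA_SIZE = 10485760;
</FONT></PRE>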
<P>Most transactions in an OLAP system take place as part of a batch process. Instead
of having several rollback areas for user input, you may use one large rollback
area for the loads, which can be taken offline during daily activity to reduce overhead.
<H2><FONT COLOR="#000077"><B>Batch Loads Versus Transactional Processing</B></FONT></H2>
<P>A major factor in the performance of a database and SQL statements is the type
of processing that takes place within a database. One type of processing is OLTP,
discussed earlier today. When we talk about transactional processing, we are going
to refer to two types: user input and batch loads.</P>
<P>Regular user input usually consists of SQL statements such as <TT>INSERT</TT>,
<TT>UPDATE</TT>, and <TT>DELETE</TT>. These types of transactions are often performed
by the end user, or the customer. End users normally use a front-end application
such as PowerBuilder to interface with the database, so they seldom issue
visible SQL statements; instead, the SQL code is generated for them by the front-end
application.</P>
<P>Your main focus when optimizing the performance of a database should be the end-user
transactions. After all, "no customer" equates to "no database,"
which in turn means that you are out of a job. Always try to keep your customers
happy, even though their expectations of system/database performance may sometimes
be unreasonable. One consideration with end-user input is the number of concurrent
users. The more concurrent database users you have, the greater the possibilities
of performance degradation.</P>
<P>What is a batch load? A <I>batch load</I> performs heaps of transactions against
the database at once. For example, suppose you are archiving last year's data into
a massive history table. You may need to insert thousands, or even millions, of rows
of data into your history table. You probably wouldn't want to do this task manually,
so you are likely to create a batch job or script to automate the process. (Numerous
techniques are available for loading data in a batch.) Batch loads are notorious
for taxing system and database resources. These database resources may include table
access, system catalog access, the database rollback segment, and sort area space;
system resources may include available CPU and shared memory. Many other factors
are involved, depending on your operating system and database server.</P>
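<P>As a simple sketch of one such technique, last year's rows might be copied into the
history table with a single bulk <TT>INSERT ... SELECT</TT>; the history table name and
cutoff date are hypothetical, and the example assumes the history table has the same
structure as <TT>FACT_TABLE</TT>.
<PRE><FONT COLOR="#0066FF">-- Copy last year's rows into the history table in one batch statement.
INSERT INTO FACT_HISTORY
  SELECT *
    FROM FACT_TABLE
   WHERE DT_STMP &lt; '01-JAN-96';

COMMIT;
</FONT></PRE>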
<P>Both end-user transactions and batch loads are necessary for most databases to
be successful, but your system could experience serious performance problems if these
two types of processing lock horns. Therefore, you should know the difference between
them and keep them segregated as much as possible. For example, you would not want
to load massive amounts of data into the database when user activity is high. The
database response may already be slow because of the number of concurrent users.
Always try to run batch loads when user activity is at a minimum. Many shops reserve
times in the evenings or early morning to load data in batch to avoid interfering
with daily processing.</P>
<P>You should always plan the timing for massive batch loads, being careful to avoid
scheduling them when the database is expected to be available for normal use. Figure
15.1 depicts heavy batch updates running concurrently with several user processes,
all contending for system resources.</P>
<P><A NAME="01"></A><A HREF="01-4.htm"><B>Figure 15.1.</B></A><B><BR>
</B><I>System resource contention.</I></P>
<P>As you can see, many processes are contending for system resources. The heavy
batch updates that are being done throw a monkey wrench into the equation. Instead
of the system resources being dispersed somewhat evenly among the users, the batch
updates appear to be hogging them. This situation is just the beginning of resource
contention. As the batch transactions proceed, the user processes may eventually
be forced out of the picture, which is no way to do business. Even if the system
has only one active user, that user could still experience significant contention.</P>
<P>Another problem with batch processes is that they may hold locks on a table
that a user is trying to access. If a table is locked, the user is refused access
until the batch process releases the lock, which could take hours. If possible,
schedule batch processes for times when the most system resources are available.
Don't make the users' transactions compete with batch processing. Nobody wins that game.
<H2><FONT COLOR="#000077"><B>Optimizing Data Loads by Dropping Indexes</B></FONT></H2>
<P>One way to expedite batch updates is by dropping indexes. Imagine the history
table with many thousands of rows. That history table is also likely to have one
or more indexes. When you think of an index, you normally think of faster table access,
but in the case of batch loads, you can benefit by dropping the index(es).</P>
<P>When you load data into a table with an index, you can usually expect a great
deal of index use, especially if you are updating a high percentage of rows in the
table. Look at it this way. If you are studying a book and highlighting key points
for future reference, you may find it quicker to browse through the book from beginning
to end rather than using the index to locate your key points. (Using the index would
be efficient if you were highlighting only a small portion of the book.)</P>
<P>To maximize the efficiency of batch loads/updates that affect a high percentage
of rows in a table, you can take these three basic steps (a sketch follows the list):
<DL>
<DD><B>1. </B>Drop the appropriate index(es).<BR>
<BR>
<B>2. </B>Load/update the table's data.<BR>
<BR>
<B>3.</B> Rebuild the table's index(es).
</DL>
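<P>Here is a minimal sketch of those three steps; the index name, index columns, and
staging table are hypothetical, and the exact <TT>DROP INDEX</TT> and <TT>CREATE INDEX</TT>
syntax varies slightly among implementations.
<PRE><FONT COLOR="#0066FF">-- Step 1: drop the index before the heavy load.
DROP INDEX FACT_TABLE_IDX1;

-- Step 2: load the data (one of many possible loading techniques).
INSERT INTO FACT_TABLE
  SELECT * FROM FACT_TABLE_STAGE;

COMMIT;

-- Step 3: rebuild the index after the load completes.
CREATE INDEX FACT_TABLE_IDX1
  ON FACT_TABLE (STATUS_CD, FUNCTION_CD);
</FONT></PRE>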
<H2><FONT COLOR="#000077"><B>A Frequent </B><TT>COMMIT</TT><B> Keeps the DBA Away</B></FONT></H2>
<P>When performing batch transactions, you must know how often to perform a "commit."
As you learned on Day 11, "Controlling Transactions," a <TT>COMMIT</TT>
statement finalizes a transaction. A <TT>COMMIT</TT> saves a transaction by writing
any changes to the applicable table(s). Behind the scenes, however, much more is
going on. Some areas in the database are reserved to store the work of uncommitted
transactions before the changes are actually written to the target table. Oracle calls
these areas <I>rollback segments</I>. When you issue a <TT>COMMIT</TT> statement, the
transactions associated with your SQL session in the rollback segment are applied to
the target table, and the corresponding contents of the rollback segment are then freed.
A <TT>ROLLBACK</TT> command, on the other hand, discards the contents of the rollback
segment without updating the target table.</P>
<P>As you can guess, if you never issue a <TT>COMMIT</TT> or <TT>ROLLBACK</TT> command,
transactions keep building within the rollback segments. Consequently, if the data
you are loading is larger than the available space in the rollback segments,
the database will essentially come to a halt and prohibit further transactional activity.
Not issuing <TT>COMMIT</TT> commands is a common programming pitfall; regular <TT>COMMIT</TT>s
help to ensure stable performance of the entire database system.</P>
<P>The management of rollback segments is a complex and vital database administrator
(DBA) responsibility because transactions dynamically affect the rollback segments,
and in turn, affect the overall performance of the database as well as individual
SQL statements. So when you are loading large amounts of data, be sure to issue the
<TT>COMMIT</TT> command on a regular basis. Check with your DBA for advice on how
often to commit during batch transactions. (See Figure 15.2.)</P>
<P><A NAME="02"></A><A HREF="02-3.htm"><B>Figure 15.2.</B></A><B><BR>
</B><I>The rollback area.</I></P>
<P>As you can see in Figure 15.2, when a user performs a transaction, the changes
are retained in the rollback area.
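<P>As a rough sketch of the idea, a batch load written in Oracle's PL/SQL might commit
after every few thousand rows rather than holding the entire load in the rollback segment.
The staging table, the column list, and the 1,000-row interval below are hypothetical;
the appropriate interval is something to work out with your DBA.
<PRE><FONT COLOR="#0066FF">DECLARE
  v_rows NUMBER := 0;   -- rows inserted since the last COMMIT
BEGIN
  FOR rec IN (SELECT STATUS_CD, FUNCTION_CD, DT_STMP, CALC_YTD
                FROM FACT_TABLE_STAGE) LOOP
    INSERT INTO FACT_TABLE (STATUS_CD, FUNCTION_CD, DT_STMP, CALC_YTD)
    VALUES (rec.STATUS_CD, rec.FUNCTION_CD, rec.DT_STMP, rec.CALC_YTD);

    v_rows := v_rows + 1;
    IF MOD(v_rows, 1000) = 0 THEN
      COMMIT;           -- release rollback space every 1,000 rows
    END IF;
  END LOOP;

  COMMIT;               -- commit any remaining rows
END;
/
</FONT></PRE>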
<H2><FONT COLOR="#000077"><B>Rebuilding Tables and Indexes in a Dynamic Environment</B></FONT></H2>
<P>The term <I>dynamic database environment </I>refers to a large database that is