<P>The total number of query servers in use is the number of queries executing in
parallel multiplied by their degrees of parallelism. If your queries request more query
servers than <TT>PARALLEL_MAX_SERVERS</TT> allows, the additional operations cannot be
fully parallelized and run at a reduced degree of parallelism or serially.</P>
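<P>For example, if four queries run in parallel, each with a degree of parallelism of 4,
then 16 query servers are in use (more if any of the operations also require a second set
of servers for sorting). A minimal <TT>init.ora</TT> sketch follows; the values are
illustrative only, and you can monitor query server usage through the dynamic performance
view <TT>V$PQ_SYSSTAT</TT>:</P>
<PRE>
# init.ora sketch (illustrative values only)
PARALLEL_MIN_SERVERS = 4       # query servers started at instance startup
PARALLEL_MAX_SERVERS = 16      # upper limit on query servers for the instance
</PRE>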
<H3><FONT COLOR="#000077"><B>Direct-Write Sorts</B></FONT></H3>
<P>You can use the Direct Write Sort option with the Parallel Query option and have
each query server perform its own direct writes.</P>
<P>As you saw earlier, using direct writes causes the server processes to write the
output of sort operations directly to disk, bypassing the buffer cache. Because of this,
sort operations do not eject large numbers of block buffers from the buffer cache, leaving
the cache available for normal queries and updates. When you use direct-write sorts with
the Parallel Query option, each query server gets its own set of direct-write buffers.</P>
<P>Remember, direct-write sorts require more memory than normal sorts. The amount
of memory that these sorts use with the Parallel Query option can be determined with
the following formula:
<DL>
<DD>Direct write sort memory = (number of query servers) * <TT>SORT_WRITE_BUFFERS</TT>
* <TT>SORT_WRITE_BUFFER_SIZE</TT>
</DL>
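<P>For example, with eight query servers, <TT>SORT_WRITE_BUFFERS</TT> set to <TT>4</TT>,
and <TT>SORT_WRITE_BUFFER_SIZE</TT> set to <TT>65536</TT> bytes (64KB), the direct-write
sort buffers consume 8 * 4 * 64KB = 2MB of memory. A minimal <TT>init.ora</TT> sketch
follows; the values are illustrative only:</P>
<PRE>
# init.ora sketch (illustrative values only)
SORT_DIRECT_WRITES     = TRUE      # enable direct-write sorts
SORT_WRITE_BUFFERS     = 4         # direct-write buffers per sort
SORT_WRITE_BUFFER_SIZE = 65536     # size of each buffer, in bytes
</PRE>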
<P>Use direct-write sorts only if you have sufficient memory and temporary disk space.
The temporary disk space should have sufficient I/O bandwidth to handle the load.
<H3><FONT COLOR="#000077"><B>Parallel Index Creation</B></FONT></H3>
<P>Another feature of the Parallel Query option is its ability to create indexes
in parallel. With the parallel index creation feature, the time it takes to create
an index can be greatly reduced.</P>
<P>As with parallel query processing, a coordinator process dispatches two sets of
query servers. One set of query servers scans the table to be indexed to obtain the
<TT>ROWID</TT>s and column values needed for the index. Another set performs the
sorting on those values and passes the results to the coordinator process. The coordinator
process then assembles the B<SUP>*</SUP>-tree index from these sorted items.</P>
<P>When creating an index, the degree of parallelism follows the same precedence
as it does in parallel query processing. The first value used is an optional <TT>PARALLEL</TT>
clause in the <TT>CREATE INDEX</TT> statement, followed by the table definition,
and finally the initialization parameters.</P>
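<P>As a quick sketch, the following statement specifies a degree of parallelism of 4
explicitly in the <TT>PARALLEL</TT> clause of <TT>CREATE INDEX</TT>; the table, column,
and tablespace names are hypothetical and only illustrate the syntax:</P>
<PRE>
CREATE INDEX emp_deptno_idx
    ON emp (deptno)
    TABLESPACE index_ts
    PARALLEL (DEGREE 4);
</PRE>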
<P>Creating an index in parallel can be several times faster than creating an index
by normal means. The same conditions apply for index creation as were given for parallel
query processing. A system that has been configured to take advantage of parallel
query processing will also see good performance from parallel index creation.
<H3><FONT COLOR="#000077"><B>Parallel Loading</B></FONT></H3>
<P>You can load in parallel by having multiple concurrent sessions perform a direct
path load into the same table. Depending on the configuration of the system, you
can see excellent load performance by loading in parallel. Because loading is both
CPU and I/O intensive, you should see good results in an SMP or MPP environment with
a high-bandwidth I/O subsystem.</P>
<P>Parallel loads are performed by multiple direct loader processes, each using the
<TT>PARALLEL=TRUE</TT> and <TT>DIRECT=TRUE</TT> options. When you specify <TT>PARALLEL=TRUE</TT>,
the loader does not place an exclusive lock on the table being loaded as it would
otherwise. During the parallel load, the loader creates temporary segments for each
of the concurrent processes and merges them on completion.</P>
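<P>As a sketch, two concurrent SQL*Loader sessions loading the same table might be
started as follows; the user name and control files are hypothetical, and each control
file names its own input data file:</P>
<PRE>
sqlldr userid=scott/tiger control=load1.ctl direct=true parallel=true
sqlldr userid=scott/tiger control=load2.ctl direct=true parallel=true
</PRE>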
<P>Although parallel loading performs best when each temporary segment is located on
a separate disk, the increased load performance does not usually justify the complexity
of the manual striping needed to achieve this. I still recommend striping the tables at
the OS level--or, preferably, in hardware with a disk array. You can also improve
performance by putting each of the input files on a separate volume to take advantage of
the sequential nature of the reads.</P>
<P>Parallel loading can be beneficial, especially if load time is critical in your
environment. By putting each of the input files on separate disk volumes, you can
increase performance. Overall, the general tuning principles used in parallel query
processing are also valid in parallel loading.
<H3><FONT COLOR="#000077"><B>Parallel Recovery</B></FONT></H3>
<P>Parallel recovery is probably my favorite feature of the Parallel Query option.
When benchmarking Oracle and testing hardware and software, it is often necessary
to intentionally crash the system to prove recoverability. With the Parallel Recovery
option, the time it takes to perform an instance recovery can be dramatically reduced.</P>
<P>Recovery time is significantly reduced when the system being recovered has many
disks and supports asynchronous I/O. For a small system that has few drives or for
an operating system that does not support asynchronous I/O, it might not be wise
to enable parallel recovery.</P>
<P>In traditional recovery, one process reads from the redo log files and applies
changes to the datafiles, as shown in Figure 19.4. This operation can take a significant
amount of time because the recovery process must wait for disk I/Os to complete.</P>
<P><A NAME="04"></A><A HREF="04.htm"><B>Figure 19.4.</B></A></P>
<P><I>Traditional instance recovery without parallel query.</I></P>
<P>With the Parallel Recovery option, one process is responsible for reading and
dispatching redo entries from the redo log files and passing those entries to the
recovery processes that apply the changes to the datafiles, as shown in Figure 19.5.</P>
<P>Because the dispatcher process reads sequentially from the redo log files, the
I/O performance is much higher than that of the recovery processes that are writing
random data throughout the datafiles. Because writing the data is very seek intensive,
it is a good idea to have one or two recovery processes for each data disk in the
system.</P>
<P>By having more recovery processes, you can have more outstanding I/Os and thus
use all the data drives simultaneously. Because recovery occurs at instance startup,
this arrangement reduces dead time when no other database processing can be performed.</P>
<P><A NAME="05"></A><A HREF="05.htm"><B>Figure 19.5.</B></A></P>
<P><I>Instance recovery with parallel recovery.</I></P>
<P>The number of concurrent recovery processes is set with the initialization parameter
<TT>RECOVERY_PARALLELISM</TT>. The value of this parameter cannot exceed the value specified
in the initialization parameter <TT>PARALLEL_MAX_SERVERS</TT>.</P>
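<P>A minimal <TT>init.ora</TT> sketch follows; the values are illustrative only and should
be sized to the number of data disks in your system:</P>
<PRE>
# init.ora sketch (illustrative values only)
RECOVERY_PARALLELISM = 8       # concurrent recovery processes
PARALLEL_MAX_SERVERS = 16      # must be at least RECOVERY_PARALLELISM
</PRE>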
<P>If you specify a sufficient number of recovery servers, you will see an immediate
improvement in instance recovery time. Do not use parallel recovery if your system
does not support asynchronous I/O or if you are limited to a few disk drives. If
your I/O subsystem is high bandwidth and your data is properly striped (either through
software or hardware), you should see very good improvement.</P>
<P>In summary, the Parallel Query option is useful for distributing processing loads
so that CPUs stay busy processing while other processes wait for I/Os to complete. The
option can be quite beneficial on multiprocessor machines, but it can also pay off on
uniprocessor machines.
<BLOCKQUOTE>
<P>
<HR>
<FONT COLOR="#000077"><B>NOTE:</B></FONT><B> </B>Probably the biggest performance
problem I have come across is a lack of disk drives. As larger and larger disks are
produced at lower and lower prices, many installations end up with I/O problems because
a few large disks can provide the space of many smaller disks. The larger disks provide
more disk space but not much more performance (per disk). The Parallel Query option
can help only in systems where I/O is not a bottleneck. When I/O is not a problem,
you will see significant gains from parallel queries.<BR>
<BR>
If you have processes waiting for queries to complete and a sufficient number of
disk drives, you will see an improvement with parallel queries, regardless of whether
you are on a multiprocessor or uniprocessor system.
<HR>
</BLOCKQUOTE>
<H2><FONT COLOR="#000077"><B>Summary</B></FONT></H2>
<P>Today you learned about several features available in Oracle, including the Oracle
Parallel Query and Oracle Parallel Server options. These features are quite different
from each other and serve completely different purposes.</P>
<P>The Oracle Parallel Server (OPS) option is a clustering feature that can be used both
for performance and for fault tolerance. Because multiple systems can access the same
database, a significant performance enhancement can be seen in some situations.
<BLOCKQUOTE>
<P>
<HR>
<FONT COLOR="#000077"><B>NOTE:</B></FONT><B> </B>Under some conditions, the Oracle
Parallel Server option can provide a significant performance increase. The performance
improvement you see is very dependent on your application and data layout. Without
careful planning, you might not see any performance improvement, and you might even
see performance degradation.
<HR>
</BLOCKQUOTE>
<P>The Oracle Parallel Query option allows certain SQL operations to be parallelized,
thus reducing the idle time spent waiting for I/Os to complete. Under the right conditions,
the Parallel Query option can deliver very good performance enhancements. It is probably my
favorite feature of the Oracle RDBMS because the right degree of parallelism can produce
incredible results.
<H2><FONT COLOR="#000077"><B>What's Next?</B></FONT></H2>
<P>In tomorrow's lesson, "Effectively Tuning and Optimizing the Database,"
you will learn to tune the Oracle RDBMS, and how to use OS and RDBMS information
to make changes that will improve your system's performance. You will learn how to
use the Oracle administration scripts <TT>UTLBSTAT.SQL</TT> and <TT>UTLESTAT.SQL</TT>
to obtain Oracle performance information, and you will learn how to interpret some
of that information.
<H2><FONT COLOR="#000077"><B>Q&A</B></FONT></H2>
<DL>
<DD><B>Q What is the Oracle Parallel Server option?</B>
<P><B>A</B> The Oracle Parallel Server option is a feature of Oracle that allows
you to cluster two or more systems sharing the same database.</P>
<P><B>Q Does the Parallel Server option require any special hardware?</B></P>
<P><B>A</B> Yes. The Parallel Server option requires a server interconnect (which might
be as simple as a fast network card) and a shared-disk subsystem. The shared-disk
subsystem is provided by your hardware vendor.</P>
<P><B>Q What is the Parallel Query option?</B></P>
<P><B>A</B> The Parallel Query option allows certain SQL operations to be parallelized.
When an operation is parallelized, the time one thread of the operation spends waiting
for I/O can be used to process another thread of the operation.</P>
<P><B>Q Do single-processor systems benefit from the Parallel Query option?</B></P>
<P><B>A</B> Definitely. Because the greatest benefit of the Parallel Query option is
keeping the system busy while it waits for I/Os to complete, it works well even on a
single-CPU system.
</DL>
<H2><FONT COLOR="#000077"><B>Workshop</B></FONT></H2>
<P>The workshop provides quiz questions to help you solidify your understanding of
the material covered and exercises to provide you with experience in using what you've
learned. Find answers to the quiz questions in Appendix A, "Answers."
<H3><FONT COLOR="#000077"><B>Quiz</B></FONT></H3>
<DL>
<DD><B>1.</B> What is the Oracle Parallel Server option used for?
<P><B>2.</B> What is the server interconnect for?</P>
<P><B>3. </B>What is a shared-disk subsystem?</P>
<P><B>4.</B> What does DLM stand for?</P>
<P><B>5.</B> What does PCM stand for?</P>
<P><B>6. </B>What is the Parallel Query option used for?</P>
<P><B>7.</B> How does the Parallel Query option improve performance?</P>
<P><B>8.</B> Name the primary way to set the degree of parallelism.</P>
<P><B>9.</B> Name the secondary way to set the degree of parallelism.</P>
<P><B>10.</B> Name some operations that can be parallelized.
</DL>
<H3><FONT COLOR="#000077"><B>Exercises</B></FONT></H3>
<DL>
<DD><B>1.</B> Run a query using traditional methods while timing it.
<P><B>2. </B>Run the same query using a parallel degree of <TT>2</TT> set in a hint,
as shown in the example following this list. Compare the time.</P>
<P><B>3. </B>Run the same query using a parallel degree of <TT>4</TT> set in a hint.
Compare the time.</P>
<P><B>4. </B>Run the same query using a parallel degree of <TT>8</TT> set in a hint.
Compare the time.
</DL>
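<P>For exercises 2 through 4, the degree of parallelism can be set in a hint, as in the
following sketch; the table and query are hypothetical, so substitute your own:</P>
<PRE>
SELECT /*+ FULL(sales) PARALLEL(sales, 2) */ COUNT(*)
  FROM sales;
</PRE>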
<CENTER>
<P>
<HR>
<A HREF="../ch18/ch18.htm"><IMG SRC="../buttonart/previous.gif" WIDTH="128" HEIGHT="28"
ALIGN="BOTTOM" ALT="Previous chapter" BORDER="0"></A><A HREF="../ch20/ch20.htm"><IMG
SRC="../buttonart/next.gif" WIDTH="128" HEIGHT="28" ALIGN="BOTTOM" ALT="Next chapter"
BORDER="0"></A><A HREF="../index.htm"><IMG SRC="../buttonart/contents.gif" WIDTH="128"
HEIGHT="28" ALIGN="BOTTOM" ALT="Contents" BORDER="0"></A> <BR>
<BR>
<BR>
<IMG SRC="../buttonart/corp.gif" WIDTH="284" HEIGHT="45" ALIGN="BOTTOM" ALT="Macmillan Computer Publishing USA"
BORDER="0"></P>
<P>© <A HREF="../copy.htm">Copyright</A>, Macmillan Computer Publishing. All
rights reserved.
</CENTER>
</BODY>
</HTML>