<html>
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<head>
<title>Diagnosing Performance Problems</title>
<link rel="STYLESHEET" type="text/css" href="images/xpolecat.css">
<link rel="STYLESHEET" type="text/css" href="images/ie.content.css">
</head>
<body>
<div class="chapter">
<a name="ch19"></a>
<div class="section">
<h2 class="first-section-title"><a name="609"></a><a name="ch19lev1sec2"></a>Diagnosing Performance Problems</h2><p class="first-para">
<b class="bold">Use profiler tools to diagnose performance problems.</b> A profiler samples the activity of a JVM at a configurable interval (typically every five milliseconds) and records the call stack in use for every thread. The methods taking the most time will most likely show up in the most observations, which gives you leads as to where you should tune.</p>
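<p class="para">The sampling idea can be sketched in a few lines of Java. This is only an illustration of the technique, not how HPROF or any commercial profiler is implemented. The class <span class="fixed">SamplingSketch</span> and its methods are hypothetical names, and the sketch watches a single thread rather than every thread in the JVM.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of stack sampling; not HPROF's implementation.
class SamplingSketch {

    // Take up to `samples` snapshots of `target`, `intervalMillis` apart, and
    // count how often each method is found at the top of the call stack.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the flag and stop sampling early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counts;
    }

    // A deliberately busy method so the sampler has something to observe.
    static void busyLoop() {
        long x = 0;
        while (!Thread.currentThread().isInterrupted()) {
            x += System.nanoTime() % 7;
        }
    }

    public static void main(String[] args) {
        Thread worker = new Thread(SamplingSketch::busyLoop, "worker");
        worker.setDaemon(true);
        worker.start();
        Map<String, Integer> counts = sample(worker, 50, 5);
        worker.interrupt();
        // Methods observed most often are the likeliest tuning candidates.
        counts.forEach((method, n) -> System.out.println(n + "  " + method));
    }
}
```

<p class="para">The methods with the highest counts are the ones the worker thread was most often caught executing, which is the same reasoning you apply when reading a real profiler's rankings.</p>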
<p class="para">Some J2EE containers use multiple JVMs, making it difficult to profile the entire container. In that case, directly profile test cases that use the underlying business objects instead. You'll skip profiling the deployment <a name="610"></a><a name="IDX-257"></a>and presentation layers in their entirety, but tuning is most likely to be needed at the business logic layer or lower anyway, because in a layered architecture the deployment and presentation layers don't perform much of the processing. If your container uses only one JVM, you can profile the entire container and run your JMeter test script against it with a small load.</p>
<p class="para">
<b class="bold">Do not attempt to profile in a clustered environment.</b> For those of you in a clustered architecture, I recommend profiling in one instance only, not in clustered mode. Your goal is to tune your application, not wade through the work your container does to implement clustering.</p>
<p class="para">Profilers tell you where (in which class or method) CPU time is being spent and where (in which classes) memory is being allocated. The default profiler that comes with the JVM (HPROF) produces output that's not intuitive and is hard to read, but that output contains much of the same information as commercial profilers. The advantage of commercial profilers is that they make performance information easier to read and interpret. If your organization has a license for a commercial profiler, use it instead of HPROF.</p>
<p class="para">If you don't have access to a commercial profiler, you'll probably have to spend a few more minutes interpreting the output than your colleagues with commercial profilers. In the <a href="LiB0125.html#625" target="_parent" class="chapterjump">next section</a>, I provide a usage cheat sheet for HPROF. The default profiler measures both CPU time and memory allocation, but I recommend measuring these separately to avoid contaminating the test. Methods for debugging memory leaks are also included in the <a href="LiB0125.html#625" target="_parent" class="chapterjump">next section</a>.</p>
<div class="section">
<h3 class="sect3-title">
<a name="611"></a><a name="ch19lev2sec3"></a>Using HPROF to Measure CPU Usage</h3>
<p class="first-para">HPROF is invoked by including arguments when the JVM is started. To measure CPU usage, include the following arguments in the Java command invocation:</p>
<div class="informalexample">
<pre class="literallayout">
-Xrunhprof:cpu=samples,thread=y,file=cpu.hprof.txt,depth=32
</pre>
</div>
<p class="para">HPROF will place results in the file named in the <span class="fixed">file</span> argument. The <span class="fixed">cpu</span> argument indicates that HPROF will measure CPU consumption, not memory. The <span class="fixed">thread</span> argument tells HPROF to identify the thread in the stack trace details. And the <span class="fixed">depth</span> argument indicates how many levels of calls to record in the stack trace details.</p>
<p class="para">The JVM should be shut down cleanly for HPROF to have an opportunity to record its observations. Don't be surprised if HPROF takes a few minutes to record its data. Likewise, don't be alarmed at the sluggishness of your application while HPROF is running; that's to be expected.</p>
<a name="612"></a><a name="IDX-258"></a>
<p class="para">With the CPU options turned on, HPROF produces a text file. The first place to look is the last part of the file, in a section detailing CPU stack trace rankings. This section provides a call stack ID (the <span class="fixed">trace</span> column) and the percentage of samples in which that call stack was observed. HPROF works by recording the call stack of each thread every few milliseconds. Odds are high that HPROF will most frequently observe the places where your application is spending the most time. <a class="internaljump" href="#ch19list02">Listing 19.2</a> illustrates the CPU stack trace rankings produced by HPROF.</p>
<div class="example">
<span class="example-title"><span class="example-titlelabel">Listing 19.2: </span>Sample CPU Stack Trace Rankings</span><a name="613"></a><a name="ch19list02"></a>
<div class="formalbody">
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="Start example" border="0"></b></font></td>
</tr>
</table>
<pre class="literallayout">
CPU SAMPLES BEGIN (total = 52) Sun Jul 13 10:48:18 2003
rank self accum count trace method
1 30.77% 30.77% 16 6 java.lang.StringBuffer.&lt;init&gt;
2 25.00% 55.77% 13 5 java.lang.StringBuffer.&lt;init&gt;
3 9.62% 65.38% 5 11 java.lang.Class.getName
4 7.69% 73.08% 4 7 java.lang.Class.isAssignableFrom
5 7.69% 80.77% 4 8 java.lang.Class.getName
6 5.77% 86.54% 3 10 java.lang.Class.isAssignableFrom
7 3.85% 90.38% 2 13 java.lang.reflect.Field.copy
8 1.92% 92.31% 1 12 java.lang.Class.isAssignableFrom
9 1.92% 94.23% 1 4 java.io.FileOutputStream.writeBytes
10 1.92% 96.15% 1 1 sun.misc.URLClassPath$3.run
11 1.92% 98.08% 1 9 java.lang.Class.getName
12 1.92% 100.00% 1 14 java.lang.Class.copyFields
CPU SAMPLES END
</pre>
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="End example" border="0"></b></font></td>
</tr>
</table>
<table class="BlankSpace" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td height="16"></td>
</tr>
</table>
</div>
</div>
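<p class="para">If you read these rankings often, a small parser can pull out the trace IDs for you. The following is a minimal sketch, assuming the column layout shown in Listing 19.2 (rank, self, accum, count, trace, method). <span class="fixed">CpuSampleParser</span> and <span class="fixed">Entry</span> are hypothetical names and are not part of HPROF.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HPROF. Assumes the column layout shown in
// Listing 19.2: rank, self, accum, count, trace, method.
class CpuSampleParser {

    static final class Entry {
        final int rank;
        final double selfPercent;
        final int count;
        final int traceId;
        final String method;

        Entry(int rank, double selfPercent, int count, int traceId, String method) {
            this.rank = rank;
            this.selfPercent = selfPercent;
            this.count = count;
            this.traceId = traceId;
            this.method = method;
        }
    }

    // Returns one Entry per ranking line between the BEGIN and END markers.
    static List<Entry> parse(String report) {
        List<Entry> entries = new ArrayList<>();
        boolean inSection = false;
        for (String line : report.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("CPU SAMPLES BEGIN")) { inSection = true; continue; }
            if (trimmed.startsWith("CPU SAMPLES END")) break;
            if (!inSection || trimmed.isEmpty() || trimmed.startsWith("rank")) continue;
            String[] f = trimmed.split("\\s+");
            if (f.length < 6) continue; // skip anything that is not a ranking line
            entries.add(new Entry(
                    Integer.parseInt(f[0]),
                    Double.parseDouble(f[1].replace("%", "")),
                    Integer.parseInt(f[3]),
                    Integer.parseInt(f[4]),
                    f[5]));
        }
        return entries;
    }

    public static void main(String[] args) {
        // The first three ranking lines of Listing 19.2.
        String report = String.join("\n",
                "CPU SAMPLES BEGIN (total = 52) Sun Jul 13 10:48:18 2003",
                "rank self accum count trace method",
                "1 30.77% 30.77% 16 6 java.lang.StringBuffer.<init>",
                "2 25.00% 55.77% 13 5 java.lang.StringBuffer.<init>",
                "3 9.62% 65.38% 5 11 java.lang.Class.getName",
                "CPU SAMPLES END");
        for (Entry e : parse(report)) {
            System.out.println("TRACE " + e.traceId + ": " + e.selfPercent + "% in " + e.method);
        }
    }
}
```

<p class="para">Each <span class="fixed">traceId</span> is the number to search for in the stack trace section of the same file.</p>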
<p class="para">Once you have the call stack ID, you can get details of what's in that stack in the preceding section of the HPROF-produced file. <a class="internaljump" href="#ch19list03">Listing 19.3</a> shows the stack corresponding to trace 6, which accounted for 30.77% of the CPU time.</p>
<div class="example">
<span class="example-title"><span class="example-titlelabel">Listing 19.3: </span>Stack Trace Details</span><a name="614"></a><a name="ch19list03"></a>
<div class="formalbody">
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="Start example" border="0"></b></font></td>
</tr>
</table>
<pre class="literallayout">
TRACE 6: (thread=3)
java.lang.StringBuffer.&lt;init&gt;(StringBuffer.java:115)
<b class="bold">org.cementj.base.ValueObject.getConcantonatedObjectValue(ValueObject.java:219)</b>
org.cementj.base.ValueObject.equals(ValueObject.java:49)
book.sample.dto.cementj.TestCustomerDTO.main(TestCustomerDTO.java:28)
</pre>
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="End example" border="0"></b></font></td>
</tr>
</table>
<table class="BlankSpace" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td height="16"></td>
</tr>
</table>
</div>
</div>
<p class="para">The stack trace description will tell you what class or method in your code is using up the time. You want to look at the first class in the trace that <a name="615"></a><a name="IDX-259"></a>is a part of your application (the bold line in <a class="internaljump" href="#ch19list03">listing 19.3</a>). This is where you need to look for tuning opportunities.</p>
<p class="para">
<a class="internaljump" href="#ch19list04">Listing 19.4</a> is the section of code highlighted in the trace. Lines 3 and 4 in the listing are taking the largest amount of CPU time. The only piece of this that can be tuned is the initial size of the buffer, given by <span class="fixed">_startBufferSize</span>. A higher number means that it will take longer to instantiate the buffer, but the <span class="fixed">append()</span> operations later in this method won't take as long because the memory is already allocated.</p>
<div class="example">
<span class="example-title"><span class="example-titlelabel">Listing 19.4: </span>Extract from CementJ</span><a name="616"></a><a name="ch19list04"></a>
<div class="formalbody">
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="Start example" border="0"></b></font></td>
</tr>
</table>
<pre class="literallayout">
1: private String getConcantonatedObjectValue()
2: {
3: StringBuffer buffer =
4: new StringBuffer(_startBufferSize);
5: Object tempObj = null;
6: Object[] tempArray = null;
7: // Some code omitted.
8: }
</pre>
<table class="BlueLine" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td bgcolor="000080" class="bluecell"><font size="2" face="Arial" color="010100"><b><img src="_.gif" width="1" height="2" alt="End example" border="0"></b></font></td>
</tr>
</table>
<table class="BlankSpace" border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<td height="16"></td>
</tr>
</table>
</div>
</div>
<p class="para">I've found that taking more time in the initial allocation is a better practice than causing the <span class="fixed">append()</span> to reallocate larger and larger chunks of memory. After validating that the <span class="fixed">_startBufferSize</span> is being estimated appropriately and isn't much larger than the memory needed, there isn't any way to tune this code. I would move on to the other hot spots listed in the trace.</p>
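<p class="para">The trade-off between initial size and later reallocation can be observed directly. The following is a minimal sketch, not code from CementJ. <span class="fixed">BufferSizingSketch</span> and <span class="fixed">countReallocations</span> are hypothetical names, and the sketch counts capacity changes rather than measuring time.</p>

```java
// Hypothetical illustration; not part of CementJ.
class BufferSizingSketch {

    // Appends `pieces` copies of `piece` to a StringBuffer created with the
    // given initial capacity and counts how many times the capacity changed.
    // Each change means the buffer allocated a larger array and copied into it.
    static int countReallocations(int initialCapacity, String piece, int pieces) {
        StringBuffer buffer = new StringBuffer(initialCapacity);
        int reallocations = 0;
        int lastCapacity = buffer.capacity();
        for (int i = 0; i < pieces; i++) {
            buffer.append(piece);
            if (buffer.capacity() != lastCapacity) {
                reallocations++;
                lastCapacity = buffer.capacity();
            }
        }
        return reallocations;
    }

    public static void main(String[] args) {
        String piece = "0123456789"; // 10 characters, appended 100 times
        System.out.println("initial capacity 16:   "
                + countReallocations(16, piece, 100) + " reallocations");
        System.out.println("initial capacity 1000: "
                + countReallocations(1000, piece, 100) + " reallocations");
    }
}
```

<p class="para">When the initial capacity already covers the final length, the count is zero. An undersized buffer grows several times along the way, and each growth copies everything appended so far.</p>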
<p class="para">Entire books have been written about coding for better performance. The first book I consult is <a href="LiB0125.html#626" target="_parent" class="chapterjump">Bulka (2000)</a>, which has a wide range of code-level tuning suggestions that are supported by performance test data. I can't recommend this book highly enough.</p>
<p class="para">
<b class="bold">Look only at stacks using 5 percent or more of the CPU.</b> The rest is too small to worry about. Suppose you're able to tune a method that uses only 1 percent of your CPU, and suppose you get a 20 percent performance improvement for that method. Since it uses only 1 percent of the CPU anyway, your effort will improve performance by just 0.2 percent overall, which is usually not considered a material improvement. The corollary to this suggestion is that if all stacks are using less than 5 percent of your CPU, you can stop tuning.</p>
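<p class="para">The arithmetic behind this rule is simple: the overall gain is the stack's share of CPU multiplied by the improvement achieved within it. A minimal sketch follows. <span class="fixed">TuningPayoff</span> is a hypothetical name, and the second call uses the 30.77 percent figure from Listing 19.2 with an assumed 20 percent improvement purely for comparison.</p>

```java
// Hypothetical helper for estimating whether a hot spot is worth tuning.
class TuningPayoff {

    // Overall improvement (in percent) from speeding up one stack:
    // share of CPU used by the stack times the improvement within it.
    static double overallImprovementPercent(double cpuSharePercent,
                                            double stackImprovementPercent) {
        return cpuSharePercent * stackImprovementPercent / 100.0;
    }

    public static void main(String[] args) {
        // A stack using 1 percent of CPU, tuned to be 20 percent faster.
        System.out.println(overallImprovementPercent(1.0, 20.0));
        // A stack using 30.77 percent of CPU with the same assumed improvement.
        System.out.println(overallImprovementPercent(30.77, 20.0));
    }
}
```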
<p class="last-para">If you find that most of your time is being spent in JDBC, you should enlist the aid of a database administrator and tune your database and/or application SQL. It's entirely possible that the stack trace indicates a specific method in a DAO. That usually will limit the SQL being executed to one or two statements. You can then execute these statements via an online query <a name="617"></a><a name="IDX-260"></a>tool to diagnose query performance. Within the online query tool, you can try out alternative ways of writing the query to get your sample working faster.</p>
</div>
<div class="section">
<h3 class="sect3-title">
<a name="618"></a><a name="ch19lev2sec4"></a>Using HPROF to Measure Memory Usage</h3>
<p class="first-para">To have HPROF measure memory usage, invoke the JVM with the following arguments:</p>
<div class="informalexample">
<pre class="literallayout">