<h2>Expected value of a random variable</h2><P>
<a name="074c_12d0"><a name="074c_12d1">The simplest and most useful summary of the distribution of a random variable is the "average" of the values it takes on. The <I><B>expected</I></B> <B>value</B> (or, synonymously, <I><B>expectation</I></B> or <I><B>mean</I></B>) of a discrete random variable <I>X</I> is<P>
<img src="112_b.gif"><P>
<h4><a name="074c_12d3">(6.23)<a name="074c_12d3"></sub></sup></h4><P>
which is well defined if the sum is finite or converges absolutely. Sometimes the expectation of <I>X</I> is denoted by <img src="../images/mu12.gif"><I>x</I> or, when the random variable is apparent from context, simply by <img src="../images/mu12.gif"><I>.</I><P>
Consider a game in which you flip two fair coins. You earn $3 for each head but lose $2 for each tail. The expected value of the random variable <I>X</I> representing your earnings is<P>
<pre>E[<I>X</I>] = 6 <img src="../images/dot10.gif"> Pr{2 H's} + 1 <img src="../images/dot10.gif"> Pr{1 H, 1 T} - 4 <img src="../images/dot10.gif"> Pr{2 T's}</pre><P>
<pre>= 6(1/4) + 1(1/2) - 4(1/4)</pre><P>
<pre>= 1 .</pre><P>
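As a quick check, a minimal Python sketch (illustrative only) can enumerate the four equally likely outcomes of the two flips and average the payoffs directly:<P>
<pre>
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips.
# Each head earns $3 and each tail loses $2.
outcomes = list(product(['H', 'T'], repeat=2))
payoffs = [3 * flips.count('H') - 2 * flips.count('T') for flips in outcomes]
expected = sum(payoffs) / len(outcomes)
print(expected)   # prints 1.0, matching E[X] = 1
</pre><P>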
The expectation of the sum of two random variables is the sum of their expectations, that is,<P>
<pre>E[<I>X</I> + <I>Y</I>] = E[<I>X</I>] + E[<I>Y</I>] ,</pre><P>
<h4><a name="074c_12d4">(6.24)<a name="074c_12d4"></h4><P>
whenever E[<I>X</I>] and E[<I>Y</I>] are defined. This property extends to finite and absolutely convergent summations of expectations.<P>
If <I>X</I> is any random variable, any function <I>g</I>(<I>x</I>) defines a new random variable <I>g</I>(<I>X</I>). If the expectation of <I>g</I>(<I>X</I>) is defined, then<P>
<img src="112_c.gif"><P>
Letting <I>g</I>(<I>x</I>) = <I>ax</I>, we have for any constant <I>a</I>,<P>
<pre>E[<I>aX</I>] = <I>a</I>E[<I>X</I>] .</pre><P>
<h4><a name="074c_12d5">(6.25)<a name="074c_12d5"></h4><P>
<a name="074c_12d2">Consequently, expectations are linear: for any two random variables <I>X </I>and <I>Y</I> and any constant <I>a</I>,<P>
<pre>E[<I>aX</I> + <I>Y</I>] = <I>a</I>E[<I>X</I>] + E[<I>Y</I>] .</pre><P>
<h4><a name="074c_12d6">(6.26)<a name="074c_12d6"></h4><P>
When two random variables <I>X</I> and <I>Y</I> are independent and each has a defined expectation,<P>
<img src="113_a.gif"><P>
In general, when <I>n</I> random variables <I>X</I><SUB>1</SUB>, <I>X</I><SUB>2</SUB>, . . . , <I>X<SUB>n</I></SUB> are mutually independent,<P>
<pre>E[<I>X</I><SUB>1</SUB><I>X</I><SUB>2</SUB> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <I>X<SUB>n</I></SUB>] = E[<I>X</I><SUB>1</SUB>]E[<I>X</I><SUB>2</SUB>] <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> E[<I>X<SUB>n</I></SUB>] .</pre><P>
<h4><a name="074c_12d7">(6.27)<a name="074c_12d7"></h4><P>
When a random variable <I>X</I> takes on values from the natural numbers N = {0,1, 2, . . .}, there is a nice formula for its expectation:<P>
<img src="113_b.gif"><P>
<h4><a name="074c_12d8">(6.28)<a name="074c_12d8"></sub></sup></h4><P>
since each term Pr{<I>X</I> <img src="../images/gteq.gif"> <I>i</I>} is added in <I>i</I> times and subtracted out<I> i</I> - 1 times (except Pr{<I>X</I> <img src="../images/gteq.gif"> 0}, which is added in 0 times and not subtracted out at all).<P>
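The small Python sketch below (illustrative only; the distribution on {0, 1, 2, 3} is chosen arbitrarily) verifies equation (6.28) with exact rational arithmetic:<P>
<pre>
from fractions import Fraction as F

# An arbitrary distribution on {0, 1, 2, 3}; exact fractions avoid rounding error.
pmf = {0: F(1, 10), 1: F(2, 10), 2: F(3, 10), 3: F(4, 10)}

direct = sum(i * p for i, p in pmf.items())                                   # sum of i * Pr{X = i}
tails = sum(sum(p for j, p in pmf.items() if j >= i) for i in range(1, 4))    # sum of Pr{X >= i}
print(direct, tails)   # both print 2
</pre><P>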
<P>
<h2>Variance and standard deviation</h2><P>
<a name="074d_12d3"><a name="074d_12d4">The <I><B>variance</I></B> of a random variable <I>X</I> with mean E [<I>X</I>] is<P>
<pre>Var[<I>X</I>] = E[(<I>X</I> - E[<I>X</I>])<SUP>2</SUP>]</pre><P>
<pre>= E[<I>X</I><SUP>2</SUP> - 2<I>X</I>E[<I>X</I>] + E<SUP>2</SUP>[<I>X</I>]]</pre><P>
<pre>= E[<I>X</I><SUP>2</SUP>] - 2E[<I>X</I>E[<I>X</I>]] + E<SUP>2</SUP>[<I>X</I>]</pre><P>
<pre>= E[<I>X</I><SUP>2</SUP>] - 2E<SUP>2</SUP>[<I>X</I>] + E<SUP>2</SUP>[<I>X</I>]</pre><P>
<pre>= E[<I>X</I><SUP>2</SUP>] - E<SUP>2</SUP>[<I>X</I>] .</pre><P>
<h4><a name="074d_12d6">(6.29)<a name="074d_12d6"></h4><P>
The justification for the equalities E [E<SUP>2</SUP> [<I>X</I>]] = E<SUP>2</SUP> [<I>X</I>] and E [<I>X</I>E [<I>X</I>]] = E<SUP>2</SUP> [<I>X</I>] is that E [<I>X</I>] is not a random variable but simply a real number, which means that equation (6.25) applies (with <I>a</I> = E[<I>X</I>]). Equation (6.29) can be rewritten to obtain an expression for the expectation of the square of a random variable:<P>
<pre>E[<I>X</I><SUP>2</SUP>] = Var[<I>X</I>] + E<SUP>2</SUP>[<I>X</I>] .</pre><P>
<h4><a name="074d_12d7">(6.30)<a name="074d_12d7"></h4><P>
The variance of a random variable <I>X</I> and the variance of <I>aX</I> are related:<P>
<pre>Var[<I>aX</I>] = <I>a</I><SUP>2</SUP>Var[<I>X</I>].</pre><P>
When <I>X</I> and <I>Y</I> are independent random variables,<P>
<pre>Var[<I>X</I> + <I>Y</I>] = Var[<I>X</I>] + Var[<I>Y</I>].</pre><P>
In general, if <I>n</I> random variables <I>X</I><SUB>1</SUB>, <I>X</I><SUB>2</SUB>, . . . , <I>X<SUB>n</I></SUB> are pairwise independent, then<P>
<img src="114_a.gif"><P>
<h4><a name="074d_12d8">(6.31)<a name="074d_12d8"></sub></sup></h4><P>
<a name="074d_12d5">The <I><B>standard deviation</I></B> of a random variable <I>X</I> is the positive square root of the variance of <I>X</I>. The standard deviation of a random variable <I>X</I> is sometimes denoted <img src="../images/sum14.gif"><I><SUB>x</I></SUB> or simply <img src="../images/sum14.gif"> when the random variable <I>X</I> is understood from context. With this notation, the variance of <I>X</I> is denoted <img src="../images/sum14.gif"><SUP>2</SUP>.<P>
<P>
<h2><a name="074e_12d7">Exercises<a name="074e_12d7"></h2><P>
<a name="074e_12d8">6.3-1<a name="074e_12d8"><P>
Two ordinary, 6-sided dice are rolled. What is the expectation of the sum of the two values showing? What is the expectation of the maximum of the two values showing?<P>
<a name="074e_12d9">6.3-2<a name="074e_12d9"><P>
An array A[1 . . <I>n</I>] contains <I>n</I> distinct numbers that are randomly ordered, with each permutation of the <I>n</I> numbers being equally likely. What is the expectation of the index of the maximum element in the array? What is the expectation of the index of the minimum element in the array?<P>
<a name="074e_12da">6.3-3<a name="074e_12da"><P>
A carnival game consists of three dice in a cage. A player can bet a dollar on any of the numbers 1 through 6. The cage is shaken, and the payoff is as follows. If the player's number doesn't appear on any of the dice, he loses his dollar. Otherwise, if his number appears on exactly <I>k</I> of the three dice, for <I>k</I> = 1, 2, 3, he keeps his dollar and wins <I>k</I> more dollars. What is his expected gain from playing the carnival game once?<P>
<a name="074e_12db">6.3-4<a name="074e_12db"><P>
Let <I>X</I> and <I>Y</I> be independent random variables. Prove that <I>f</I>(<I>X</I>) and <I>g</I>(<I>Y</I>) are independent for any choice of functions <I>f</I> and <I>g</I>.<P>
<a name="074e_12dc">6.3-5<a name="074e_12dc"><P>
<a name="074e_12d6">Let <I>X</I> be a nonnegative random variable, and suppose that E (<I>X</I>) is well defined. Prove <I><B>Markov's inequality:</I></B><P>
<pre>Pr{<I>X</I> <img src="../images/gteq.gif"> <I>t</I>} <img src="../images/lteq12.gif"> E[<I>X</I>]/<I>t</I></pre><P>
<h4><a name="074e_12dd">(6.32)<a name="074e_12dd"></h4><P>
for all <I>t</I> > 0.<P>
<a name="074e_12de">6.3-6<a name="074e_12de"><P>
Let <I>S</I> be a sample space, and let <I>X</I> and <I>X</I>' be random variables such that <I>X</I>(<I>s</I>) <img src="../images/gteq.gif"> <I>X</I>'(<I>s</I>) for all <I>s</I> <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> <I>S</I>. Prove that for any real constant<I> t</I>,<P>
<pre>Pr{<I>X</I> <img src="../images/gteq.gif"> <I>t</I>} <img src="../images/gteq.gif"> Pr{<I>X</I>' <img src="../images/gteq.gif"> <I>t</I>} .</pre><P>
<a name="074e_12df">6.3-7<a name="074e_12df"><P>
Which is larger: the expectation of the square of a random variable, or the square of its expectation?<P>
<a name="074e_12e0">6.3-8<a name="074e_12e0"><P>
Show that for any random variable <I>X</I> that takes on only the values 0 and 1, we have Var[<I>X</I>] = E[<I>X</I>] E[1 - <I>X</I>].<P>
<a name="074e_12e1">6.3-9<a name="074e_12e1"><P>
Prove that Var[<I>aX</I>] = <I>a</I><SUP>2</SUP>Var[<I>X</I>] from the definition (6.29) of variance.<P>
<P>
<P>
<h1><a name="074f_12da">6.4 The geometric and binomial distributions<a name="074f_12da"></h1><P>
<a name="074f_12d7"><a name="074f_12d8"><a name="074f_12d9">A coin flip is an instance of a <I><B>Bernoulli trial</I></B>, which is defined as an experiment with only two possible outcomes: <I><B>success</I></B>, which occurs with probability <I>p,</I> and <I><B>failure</I></B>, which occurs with probability <I>q</I> = 1 - <I>p</I>. When we speak of <I><B>Bernoulli trials</I></B> collectively, we mean that the trials are mutually independent and, unless we specifically say otherwise, that each has the same probability <I>p</I> for success. Two important distributions arise from Bernoulli trials: the geometric distribution and the binomial distribution.<P>
<h2>The geometric distribution</h2><P>
<a name="0750_12da"><a name="0750_12db"><a name="0750_12dc">Suppose we have a sequence of Bernoulli trials, each with a probability <I>p</I> of success and a probability <I>q</I> = 1 - <I>p</I> of failure. How many trials occur before we obtain a success? Let the random variable <I>X</I> be the number of trials needed to obtain a success. Then <I>X</I> has values in the range {1, 2, . . .}, and for <I>k</I> <img src="../images/gteq.gif"> 1,<P>
<pre>Pr{<I>X</I> = <I>k</I>} = <I>q</I><SUP><I>k</I>-1</SUP><I>p</I> ,</pre><P>
<h4><a name="0750_12de">(6.33)<a name="0750_12de"></h4><P>
since we have <I>k</I> - 1 failures before the one success. A probability distribution satisfying equation (6.33) is said to be a <I><B>geometric distribution</I></B><I>. </I>Figure 6.1 illustrates such a distribution.<P>
<img src="116_a.gif"><P>
<h4><a name="0750_12df">Figure 6.1 A geometric distribution with probability p = 1/3 of success and a probability q = 1 - p of failure. The expectation of the distribution is 1/p = 3.<a name="0750_12df"></sub></sup></h4><P>
Assuming <I>p</I> < 1, the expectation of a geometric distribution can be calculated using identity (3.6):<P>
<img src="116_b.gif"><P>
<h4>(6.34)</h4><P>
<a name="0750_12dd">Thus, on average, it takes 1/<I>p</I> trials before we obtain a success, an intuitive result. The variance, which can be calculated similarly, is<P>
<pre>Var[<I>X</I>] = <I>q</I>/<I>p</I><SUP>2</SUP> .</pre><P>
<h4><a name="0750_12e0">(6.35)<a name="0750_12e0"></h4><P>
As an example, suppose we repeatedly roll two dice until we obtain either a seven or an eleven. Of the 36 possible outcomes, 6 yield a seven and 2 yield an eleven. Thus, the probability of success is <I>p</I> = 8/36 = 2/9, and we must roll 1/<I>p</I> = 9/2 = 4.5 times on average to obtain a seven or eleven.<P>
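A simulation can make this concrete. The Python sketch below (illustrative only; the sample size of 100,000 is arbitrary) repeatedly rolls two dice until a seven or an eleven appears and estimates the mean and variance of the number of rolls, which should be close to 1/<I>p</I> = 4.5 and <I>q</I>/<I>p</I><SUP>2</SUP> = 15.75:<P>
<pre>
import random

def rolls_until_seven_or_eleven():
    # Count the Bernoulli trials (rolls of two dice) up to and including the first success.
    count = 0
    while True:
        count += 1
        if random.randint(1, 6) + random.randint(1, 6) in (7, 11):
            return count

samples = [rolls_until_seven_or_eleven() for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)   # roughly 4.5 and 15.75
</pre><P>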
<img src="117_a.gif"><P>
<h4><a name="0750_12e1">Figure 6.2 The binomial distribution b(k; 15, 1/3) resulting from n = 15 Bernoulli trials, each with probability p = 1/3 of success. The expectation of the distribution is np = 5.<a name="0750_12e1"></sub></sup></h4><P>
<P>