<h2>Conditional probability and independence</h2><P>
<a name="0748_12c1"><a name="0748_12c2">Sometimes we have some prior partial knowledge about the outcome of an experiment. For example, suppose that a friend has flipped two fair coins and has told you that at least one of the coins showed a head. What is the probability that both coins are heads? The information given eliminates the possibility of two tails. The three remaining elementary events are equally likely, so we infer that each occurs with probability 1/3. Since only one of these elementary events shows two heads, the answer to our question is 1/3.<P>
Conditional probability formalizes the notion of having prior partial knowledge of the outcome of an experiment. The <I><B>conditional probability</B></I> of an event <I>A</I> given that another event <I>B</I> occurs is defined to be<P>
<pre>Pr{<I>A</I> | <I>B</I>} = Pr{<I>A</I> ∩ <I>B</I>} / Pr{<I>B</I>}</pre><P>
<h4><a name="0748_12c5">(6.19)<a name="0748_12c5"></h4><P>
whenever Pr{<I>B</I>} ≠ 0. (We read &quot;Pr{<I>A</I> | <I>B</I>}&quot; as &quot;the probability of <I>A</I> given <I>B</I>.&quot;) Intuitively, since we are given that event <I>B</I> occurs, the event that <I>A</I> also occurs is <I>A</I> ∩ <I>B</I>. That is, <I>A</I> ∩ <I>B</I> is the set of outcomes in which both <I>A</I> and <I>B</I> occur. Since the outcome is one of the elementary events in <I>B</I>, we normalize the probabilities of all the elementary events in <I>B</I> by dividing them by Pr{<I>B</I>}, so that they sum to 1. The conditional probability of <I>A</I> given <I>B</I> is, therefore, the ratio of the probability of event <I>A</I> ∩ <I>B</I> to the probability of event <I>B</I>. In the example above, <I>A</I> is the event that both coins are heads, and <I>B</I> is the event that at least one coin is a head. Thus, Pr{<I>A</I> | <I>B</I>} = (1/4)/(3/4) = 1/3.<P>
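As a concrete check of this definition, the short Python sketch below enumerates the four equally likely outcomes of the two-coin experiment and evaluates (6.19) directly; the helper names <I>prob</I> and <I>cond_prob</I> are illustrative choices of ours, not part of the text.<P>
<pre>
from itertools import product

# Sample space for two fair coin flips; each of the 4 outcomes is equally likely.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a subset of the sample space) under the uniform distribution."""
    return len(event) / len(sample_space)

def cond_prob(a, b):
    """Pr{A | B} = Pr{A ∩ B} / Pr{B}, defined only when Pr{B} is nonzero."""
    return prob(a.intersection(b)) / prob(b)

A = {s for s in sample_space if s == ("H", "H")}   # both coins are heads
B = {s for s in sample_space if "H" in s}          # at least one coin is a head

print(cond_prob(A, B))   # 0.333... = 1/3, matching the text
</pre><P>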
Two events are <I><B>independent</B></I> if<P>
<pre>Pr{<I>A</I> ∩ <I>B</I>} = Pr{<I>A</I>} Pr{<I>B</I>},</pre><P>
which is equivalent, if Pr{<I>B</I>} ≠ 0, to the condition<P>
<pre>Pr{<I>A</I> | <I>B</I>} = Pr{<I>A</I>}.</pre><P>
For example, suppose that two fair coins are flipped and that the outcomes are independent. Then the probability of two heads is (1/2)(1/2) = 1/4. Now suppose that one event is that the first coin comes up heads and the other event is that the coins come up differently. Each of these events occurs with probability 1/2, and the probability that both events occur is 1/4; thus, according to the definition of independence, the events are independent--even though one might think that both events depend on the first coin. Finally, suppose that the coins are welded together so that they both fall heads or both fall tails and that the two possibilities are equally likely. Then the probability that each coin comes up heads is 1/2, but the probability that they both come up heads is 1/2 ≠ (1/2)(1/2). Consequently, the event that one comes up heads and the event that the other comes up heads are not independent.<P>
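The product-rule test for independence can likewise be applied mechanically by enumeration. The illustrative Python sketch below checks that the event &quot;the first coin is heads&quot; and the event &quot;the coins come up differently&quot; satisfy the definition, while the welded-coin events do not; the variable names are ours.<P>
<pre>
from itertools import product

def prob(event, space):
    # Uniform distribution: each elementary event in `space` is equally likely.
    return len(event) / len(space)

# Two independent fair coins.
space = list(product("HT", repeat=2))
first_heads = {s for s in space if s[0] == "H"}
coins_differ = {s for s in space if s[0] != s[1]}

lhs = prob(first_heads.intersection(coins_differ), space)
rhs = prob(first_heads, space) * prob(coins_differ, space)
print(lhs, rhs, lhs == rhs)   # prints: 0.25 0.25 True (the two events are independent)

# Welded coins: only HH and TT are possible, each with probability 1/2.
welded = [("H", "H"), ("T", "T")]
first_h = {s for s in welded if s[0] == "H"}
second_h = {s for s in welded if s[1] == "H"}
lhs = prob(first_h.intersection(second_h), welded)
rhs = prob(first_h, welded) * prob(second_h, welded)
print(lhs, rhs, lhs == rhs)   # prints: 0.5 0.25 False (not independent)
</pre><P>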
<a name="0748_12c3">A collection <I>A</I><SUB>1,</SUB><I> A</I><SUB>2</SUB>,<I> </I>. . . ,<I> A<SUB>n</I></SUB> of events is said to be <I><B>pairwise independent</I></B> if<P>
<pre>Pr{<I>A<SUB>i</I></SUB> <img src="../images/dome.gif"> <I>A<SUB>j</I></SUB>} = Pr{<I>A<SUB>i</I></SUB>} Pr{<I>A<SUB>j</I></SUB>}</sub></sup></pre><P>
<a name="0748_12c4">for all 1 <img src="../images/lteq12.gif"> <I>i</I> &lt; <I>j</I> <img src="../images/lteq12.gif"> <I>n</I>. We say that they are <I><B>(mutually) independent</I></B><I> </I>if every <I>k</I>-subset <I>A<SUB>i</I>1</SUB>,<I>A<SUB>i</I>2</SUB>,...,<I>A<SUB>ik</I></SUB> of the collection, where 2 <img src="../images/lteq12.gif"> <I>k</I> <img src="../images/lteq12.gif"> <I>n</I> and 1 <img src="../images/lteq12.gif"> <I>i</I><SUB>1</SUB> &lt; <I>i</I><SUB>2</SUB> &lt; <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> &lt; <I>i<SUB>k</I></SUB> <img src="../images/lteq12.gif"><I>n</I>, satisfies<P>
<pre>Pr{<I>A<SUB>i</I>1</SUB> <img src="../images/dome.gif"> <I>A<SUB>i</I>2</SUB> <img src="../images/dome.gif"> <img src="../images/dot10.gif"><img src="../images/dot10.gif"><img src="../images/dot10.gif"> <img src="../images/dome.gif"> <I>A<SUB>ik</I></SUB>} =  Pr{<I>A<SUB>i</I>1</SUB>}Pr{<I>A<SUB>i</I>2</SUB>} <img src="../images/dot10.gif"><img src="../images/dot10.gif"><img src="../images/dot10.gif"> Pr{<I>A<SUB>i</I>k</SUB>}.</sub></sup></pre><P>
For example, suppose we flip two fair coins. Let <I>A</I><SUB>1</SUB> be the event that the first coin is heads, let <I>A</I><SUB>2</SUB> be the event that the second coin is heads, and let <I>A</I><SUB>3</SUB> be the event that the two coins are different. We have<P>
<pre>          Pr{<I>A</I><SUB>1</SUB>}  =  1/2 ,
          Pr{<I>A</I><SUB>2</SUB>}  =  1/2 ,
          Pr{<I>A</I><SUB>3</SUB>}  =  1/2 ,
     Pr{<I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB>}  =  1/4 ,
     Pr{<I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>3</SUB>}  =  1/4 ,
     Pr{<I>A</I><SUB>2</SUB> ∩ <I>A</I><SUB>3</SUB>}  =  1/4 ,
Pr{<I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB> ∩ <I>A</I><SUB>3</SUB>}  =  0.</pre><P>
Since for 1 ≤ <I>i</I> &lt; <I>j</I> ≤ 3, we have Pr{<I>A<SUB>i</SUB></I> ∩ <I>A<SUB>j</SUB></I>} = Pr{<I>A<SUB>i</SUB></I>} Pr{<I>A<SUB>j</SUB></I>} = 1/4, the events <I>A</I><SUB>1</SUB>, <I>A</I><SUB>2</SUB>, and <I>A</I><SUB>3</SUB> are pairwise independent. The events are not mutually independent, however, because Pr{<I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB> ∩ <I>A</I><SUB>3</SUB>} = 0 and Pr{<I>A</I><SUB>1</SUB>} Pr{<I>A</I><SUB>2</SUB>} Pr{<I>A</I><SUB>3</SUB>} = 1/8 ≠ 0.<P>
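These seven probabilities, and the two conclusions drawn from them, can be verified by exhaustive enumeration. The following Python sketch (illustrative only) checks every pair of events against the product rule and then shows that the three-way product rule fails.<P>
<pre>
from itertools import product
from fractions import Fraction

space = list(product("HT", repeat=2))        # two fair coins, uniform distribution

def prob(event):
    return Fraction(len(event), len(space))

A1 = {s for s in space if s[0] == "H"}       # first coin is heads
A2 = {s for s in space if s[1] == "H"}       # second coin is heads
A3 = {s for s in space if s[0] != s[1]}      # the two coins are different

# Pairwise independence: every pair satisfies the product rule.
for X, Y in [(A1, A2), (A1, A3), (A2, A3)]:
    assert prob(X.intersection(Y)) == prob(X) * prob(Y)   # each side equals 1/4

# Mutual independence fails: the triple intersection is empty.
triple = A1.intersection(A2).intersection(A3)
print(prob(triple), prob(A1) * prob(A2) * prob(A3))       # prints: 0 1/8
</pre><P>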
<P>







<h2>Bayes's theorem</h2><P>
<a name="0749_12c5"><a name="0749_12c6">From the definition of conditional probability (6.19), it follows that for two events <I>A</I> and <I>B</I>, each with nonzero probability,<P>
<pre>Pr{<I>A</I> ∩ <I>B</I>} = Pr{<I>B</I>} Pr{<I>A</I> | <I>B</I>}</pre><P>
<h4><a name="0749_12c7">(6.20)<a name="0749_12c7"></h4><P>
<pre>             = Pr{<I>A</I>} Pr{<I>B</I> | <I>A</I>}.</pre><P>
Solving for Pr{<I>A</I> | <I>B</I>}, we obtain<P>
<pre>Pr{<I>A</I> | <I>B</I>} = Pr{<I>A</I>} Pr{<I>B</I> | <I>A</I>} / Pr{<I>B</I>},</pre><P>
<h4><a name="0749_12c8">(6.21)<a name="0749_12c8"></h4><P>
which is known as <I><B>Bayes's theorem</B></I>. The denominator Pr{<I>B</I>} is a normalizing constant that we can reexpress as follows. Since <I>B</I> = (<I>B</I> ∩ <I>A</I>) ∪ (<I>B</I> ∩ <I>Ā</I>), where <I>Ā</I> is the complement of <I>A</I>, and since <I>B</I> ∩ <I>A</I> and <I>B</I> ∩ <I>Ā</I> are mutually exclusive events,<P>
<pre>Pr{<I>B</I>} = Pr{<I>B</I> ∩ <I>A</I>} + Pr{<I>B</I> ∩ <I>Ā</I>}
       = Pr{<I>A</I>} Pr{<I>B</I> | <I>A</I>} + Pr{<I>Ā</I>} Pr{<I>B</I> | <I>Ā</I>}.</pre><P>
Substituting into equation (6.21), we obtain an equivalent form of Bayes's theorem:<P>
<img src="109_e.gif"><P>
Bayes's theorem can simplify the computation of conditional probabilities. For example, suppose that we have a fair coin and a biased coin that always comes up heads. We run an experiment consisting of three independent events: one of the two coins is chosen at random, the coin is flipped once, and then it is flipped again. Suppose that the chosen coin comes up heads both times. What is the probability that it is biased?<P>
We solve this problem using Bayes's theorem. Let <I>A</I> be the event that the biased coin is chosen, and let <I>B</I> be the event that the coin comes up heads both times. We wish to determine Pr{<I>A</I> | <I>B</I>}. We have Pr{<I>A</I>} = 1/2, Pr{<I>B</I> | <I>A</I>} = 1, and Pr{<I>B</I> | <I>Ā</I>} = 1/4; hence,<P>
<img src="109_g.gif"><P>
<P>







<h2><a name="074a_12ca">Exercises<a name="074a_12ca"></h2><P>
<a name="074a_12cb">6.2-1<a name="074a_12cb"><P>
<a name="074a_12c7">Prove <I><B>Boole's inequality:</I></B> For any finite or countably infinite sequence of events <I>A</I><SUB>1</SUB>, <I>A</I><SUB>2</SUB>,<I> . . . </I>,<P>
<pre>Pr{<I>A</I><SUB>1 </SUB><img src="../images/wideu.gif"> <I>A</I><SUB>2 </SUB><img src="../images/wideu.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif">} <img src="../images/lteq12.gif"><I> </I>Pr{<I>A</I><SUB>1</SUB>} + Pr{<I>A</I><SUB>2</SUB>}+ <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> <img src="../images/dot10.gif"> .</sub></sup></pre><P>
<h4><a name="074a_12cc">(6.22)<a name="074a_12cc"></sub></sup></h4><P>
<a name="074a_12cd">6.2-2<a name="074a_12cd"><P>
Professor Rosencrantz flips one fair coin. Professor Guildenstern flips two fair coins. What is the probability that Professor Rosencrantz obtains more heads than Professor Guildenstern?<P>
<a name="074a_12ce">6.2-3<a name="074a_12ce"><P>
A deck of 10 cards, each bearing a distinct number from 1 to 10, is shuffled to mix the cards thoroughly. Three cards are removed one at a time from the deck. What is the probability that the three cards are selected in sorted (increasing) order?<P>
<a name="074a_12cf">6.2-4<a name="074a_12cf"><P>
You are given a biased coin that, when flipped, produces a head with (unknown) probability <I>p</I>, where 0 &lt; <I>p</I> &lt; 1. Show how a fair &quot;coin flip&quot; can be simulated by looking at multiple flips. (<I>Hint</I>: Flip the coin twice and then either output the result of the simulated fair flip or repeat the experiment.) Prove that your answer is correct.<P>
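One well-known procedure consistent with the hint is to flip the coin in pairs and keep only the pairs whose two results disagree, since the outcomes HT and TH are equally likely. The Python sketch below illustrates that idea; it is offered as one possible construction rather than the intended solution, and the function names are ours.<P>
<pre>
import random

def von_neumann_fair_flip(biased_flip):
    """Return 'H' or 'T' with probability 1/2 each, given a function biased_flip
    that returns 'H' with a fixed unknown probability p strictly between 0 and 1.
    Flip twice: HT and TH are equally likely (each has probability p(1-p)), so map
    them to the two fair outcomes; on HH or TT, discard both flips and try again."""
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a    # HT yields 'H', TH yields 'T'

# Example: a coin that shows heads 70% of the time (the bias is arbitrary here).
biased = lambda: random.choices("HT", weights=[7, 3])[0]
flips = [von_neumann_fair_flip(biased) for _ in range(100_000)]
print(flips.count("H") / len(flips))   # close to 0.5
</pre><P>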
<a name="074a_12d0">6.2-5<a name="074a_12d0"><P>
Describe a procedure that takes as input two integers <I>a</I> and <I>b</I> such that 0 &lt; <I>a</I> &lt; <I>b</I> and, using fair coin flips, produces as output heads with probability <I>a</I>/<I>b</I> and tails with probability (<I>b</I> - <I>a</I>)/<I>b</I>. Give a bound on the expected number of coin flips, which should be polynomial in lg <I>b</I>.<P>
<a name="074a_12d1">6.2-6<a name="074a_12d1"><P>
Prove that<P>
<img src="110_a.gif"><P>
<a name="074a_12d2">6.2-7<a name="074a_12d2"><P>
Prove that for any collection of events <I>A</I><SUB>1</SUB>, <I>A</I><SUB>2</SUB>, . . . , <I>A<SUB>n</I></SUB>,<P>
<pre>Pr{<I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB> ∩ ··· ∩ <I>A<SUB>n</SUB></I>} = Pr{<I>A</I><SUB>1</SUB>} · Pr{<I>A</I><SUB>2</SUB> | <I>A</I><SUB>1</SUB>} · Pr{<I>A</I><SUB>3</SUB> | <I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB>} ···
                             Pr{<I>A<SUB>n</SUB></I> | <I>A</I><SUB>1</SUB> ∩ <I>A</I><SUB>2</SUB> ∩ ··· ∩ <I>A</I><SUB><I>n</I>-1</SUB>}.</pre><P>
<a name="074a_12d3">6.2-8<a name="074a_12d3"><P>
Show how to construct a set of <I>n</I> events that are pairwise independent but such that no subset of <I>k</I> &gt; 2 of them is mutually independent.<P>
<a name="074a_12d4">6.2-9<a name="074a_12d4"><P>
<a name="074a_12c8"><a name="074a_12c9">Two events <I>A</I> and <I>B</I> are <I><B>conditionally independent</I></B>, given <I>C</I>, if<P>
<pre>Pr{<I>A</I> <img src="../images/dome.gif"> <I>B</I> | <I>C</I>} = Pr{<I>A</I> | <I>C</I>} <img src="../images/dot10.gif"> Pr{<I>B</I> | <I>C</I>} .</sub></sup></pre><P>
Give a simple but nontrivial example of two events that are not independent but are conditionally independent given a third event.<P>
<a name="074a_12d5">6.2-10<a name="074a_12d5"><P>
You are a contestant in a game show in which a prize is hidden behind one of three curtains. You will win the prize if you select the correct curtain. After you have picked one curtain but before the curtain is lifted, the emcee lifts one of the other curtains, revealing an empty stage, and asks if you would like to switch from your current selection to the remaining curtain. How will your chances change if you switch?<P>
<a name="074a_12d6">6.2-11<a name="074a_12d6"><P>
A prison warden has randomly picked one prisoner among three to go free. The other two will be executed. The warden knows which one will go free but is forbidden to give any prisoner information regarding his status. Let us call the prisoners <I>X</I>, <I>Y</I>, and <I>Z</I>. Prisoner <I>X</I> asks the warden privately which of <I>Y</I> or <I>Z</I> will be executed, arguing that since he already knows that at least one of them must die, the warden won't be revealing any information about his own status. The warden tells <I>X</I> that <I>Y</I> is to be executed. Prisoner <I>X</I> feels happier now, since he figures that either he or prisoner <I>Z</I> will go free, which means that his probability of going free is now 1/2. Is he right, or are his chances still 1/3? Explain.<P>
<P>


<P>







<h1><a name="074b_12d0">6.3 Discrete random variables<a name="074b_12d0"></h1><P>
<a name="074b_12ca"><a name="074b_12cb"><a name="074b_12cc">A<B> </B><I><B>(</I>discrete</B><I>)</I> <B>random variable</B> <I>X</I> is a function from a finite or countably infinite sample space <I>S</I> to the real numbers. It associates a real number with each possible outcome of an experiment, which allows us to work with the probability distribution induced on the resulting set of numbers. Random variables can also be defined for uncountably infinite sample spaces, but they raise technical issues that are unnecessary to address for our purposes. Henceforth, we shall assume that random variables are discrete.<P>
<a name="074b_12cd">For a random variable <I>X</I> and a real number <I>x</I>, we define the event <I>X</I> = <I>x </I>to be {<I>s</I> <img src="../images/memof12.gif"> <I>S</I> : <I>X</I> (<I>s</I>) = <I>x</I>}; thus,<P>
<img src="111_a.gif"><P>
The function<P>
<pre><I>f</I>(<I>x</I>) = Pr{<I>X</I> = <I>x</I>}</pre><P>
is the <I><B>probability density function</B></I> of the random variable <I>X</I>. From the probability axioms, Pr{<I>X</I> = <I>x</I>} ≥ 0 and Σ<SUB><I>x</I></SUB> Pr{<I>X</I> = <I>x</I>} = 1.<P>
As an example, consider the experiment of rolling a pair of ordinary, 6-sided dice. There are 36 possible elementary events in the sample space. We assume that the probability distribution is uniform, so that each elementary event <I>s</I> ∈ <I>S</I> is equally likely: Pr{<I>s</I>} = 1/36. Define the random variable <I>X</I> to be the <I>maximum</I> of the two values showing on the dice. We have Pr{<I>X</I> = 3} = 5/36, since <I>X</I> assigns a value of 3 to 5 of the 36 possible elementary events, namely, (1, 3), (2, 3), (3, 3), (3, 2), and (3, 1).<P>
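This density can be tabulated by brute force. The following Python sketch (illustrative only) enumerates the 36 outcomes and reports Pr{<I>X</I> = <I>x</I>} for each value <I>x</I>, confirming in particular that Pr{<I>X</I> = 3} = 5/36.<P>
<pre>
from itertools import product
from fractions import Fraction
from collections import Counter

# All 36 equally likely outcomes of rolling two ordinary 6-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# X maps each elementary event to the maximum of the two values showing.
density = Counter(max(a, b) for a, b in rolls)

for x in sorted(density):
    print(x, Fraction(density[x], len(rolls)))   # e.g. X = 3 has probability 5/36
</pre><P>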
<a name="074b_12ce">It is common for several random variables to be defined on the same sample space. If <I>X</I> and <I>Y</I> are random variables, the function<P>
<pre><I>f</I>(<I>x</I>,<I>y</I>) = Pr{<I>X</I> = <I>x</I> and <I>Y</I> = <I>y</I>}</pre><P>
is the <I><B>joint probability density function</B></I> of <I>X</I> and <I>Y</I>. For a fixed value <I>y</I>,<P>
<img src="111_b.gif"><P>
and similarly, for a fixed value <I>x</I>,<P>
<img src="111_c.gif"><P>
Using the definition (6.19) of conditional probability, we have<P>
<img src="112_a.gif"><P>
<a name="074b_12cf">We define two random variables <I>X</I> and <I>Y</I> to be <I><B>independent</I></B> if for all <I>x </I>and <I>y</I>, the events <I>X</I> = <I>x</I> and <I>Y</I> = <I>y</I> are independent or, equivalently, if for all <I>x</I> and <I>y</I>, we have Pr{<I>X</I> = <I>x</I> and <I>Y</I> = <I>y</I>} = Pr{<I>X</I> = <I>x</I>} Pr{<I>Y</I> = <I>y</I>}.<P>
Given a set of random variables defined over the same sample space, one can define new random variables as sums, products, or other functions of the original variables.<P>




