% chap8.tex
% Concrete_Mathematics_2nd_Ed_TeX_Source_Code
% Chapter 8 of Concrete Mathematics
% (c) Addison-Wesley, all rights reserved.
\input gkpmac
\refin bib
\refin chap2
\refin chap5
\refin chap6
\refin chap7
\refin chap9
\pageno=381
\beginchapter 8 Discrete Probability
THE ELEMENT OF CHANCE enters into many of our attempts to understand the world
we live in. A mathematical {\it"theory of probability"\/} allows us to
"!probability, theory of"
calculate the likelihood of complex events if we assume that the events
are governed by appropriate axioms. This theory has significant applications
in all branches of science, and it has strong connections with the
techniques we have studied in previous chapters.

Probabilities are called ``"discrete"'' if we can compute the probabilities
of all events by summation instead of by integration. We are getting pretty good
at sums, so it should come as no great surprise that we are ready to
apply our knowledge to some interesting calculations of probabilities
and averages.
\def\dieone{\die{\diedot22}}
\def\dietwo{\die{\diedot11\diedot33}}
\def\diethree{\die{\diedot11\diedot22\diedot33}}
\def\diefour{\die{\diedot11\diedot33\diedot13\diedot31}}
\def\diefive{\die{\diedot11\diedot22\diedot33\diedot13\diedot31}}
\def\diesix{\die{\diedot11\diedot12\diedot13\diedot31\diedot32\diedot33}}
\def\dieseven{\die{\diedot11\diedot12\diedot13\diedot31\diedot32\diedot33\diedot22}}
\def\dieeight{\die{\diedot11\diedot12\diedot13\diedot31\diedot32\diedot33%
 \diedot21\diedot23}}
\def\die#1{{\unitlength=2.5pt\beginpicture(5,4)(-.5,1)
 \put(0,0){\line(0,1)4} \put(0,0){\line(1,0)4}
 \put(0,4){\line(1,0)4} \put(4,0){\line(0,1)4}#1\endpicture}}
\def\diedot#1#2{\put(#1,#2){\disk1}}
\vskip-1.1pt % have to make this page slightly tight!
% (DEK May 88)
\beginsection 8.1 Definitions
Probability theory starts with the idea of a {\it"probability space"},
\g(Readers unfamiliar with probability theory will, with high
probability, benefit from a perusal of "Feller"'s classic
introduction to the subject [|feller|].)\g
which is a set~$\Omega$ of all things that can happen in a given
problem together with a rule that assigns a probability $\Pr(\omega)$
to each elementary event $\omega\in\Omega$. The probability $\Pr(\omega)$
must be a nonnegative real number, and the condition
\begindisplay \advance\belowdisplayskip-1.2pt
\sum_{\omega\in\Omega}\Pr(\omega)=1\eqno\eqref|all-probs|
\enddisplay
must hold in every discrete probability space. Thus, each value $\Pr(\omega)$
must lie in the interval $[@0\dts1]$.
We speak of Pr as a {\it"probability distribution"}, because it distributes
a total probability of~$1$ among the events~$\omega$.

Here's an example: If we're
rolling a pair of "dice", the set~$\Omega$
of elementary events is $D^2=\{\,\dieone\dieone,\,\dieone\dietwo,
\,\ldots,\,\diesix\diesix\,\}$, where
\begindisplay
D=\{\,\dieone,\,\dietwo,\,\diethree,\,\diefour,\,\diefive,\,\diesix\,\}
\enddisplay
is the set of all six ways that a given die can land. Two rolls such as
\g Never say die.\g
\smash{\dieone\dietwo} and \smash{\dietwo\dieone}
are considered to be distinct;
hence this probability space has a total of $6^2=36$ elements.

We usually assume that dice are ``fair''\dash---that each of the
six possibilities for a particular die has probability~$1\over6$, and that
"!fair dice" "!loaded dice"
each of the 36 possible rolls in~$\Omega$ has probability~$1\over36$.
\g Careful: They might go off.\g
But we can also consider ``loaded'' dice in which there is a different
distribution of probabilities.
For example, let
\begindisplay
\Pr_1(\dieone)&=\Pr_1(\diesix)=\textstyle{1\over4}\,;\cr
\Pr_1(\dietwo)&=\Pr_1(\diethree)=\Pr_1(\diefour)
 =\Pr_1(\diefive)=\textstyle{1\over8}\,.\cr
\enddisplay
Then $\sum_{d\in D}\Pr_1(d)=1$, so $\Pr_1$ is a probability distribution
on the set~$D$, and we can assign probabilities to the
elements of $\Omega=D^2$ by the rule
\begindisplay
\Pr_{11}(d\,d')=\Pr_1(d)\,\Pr_1(d')\,.\eqno\eqref|pr1-def|
\enddisplay
For example, $\Pr_{11}(\diesix\diethree)={1\over4}\cdt{1\over8}={1\over32}$.
This is a valid distribution because
\begindisplay \openup3pt
\sum_{\omega\in\Omega}\Pr_{11}(\omega)&=\sum_{dd'\in D^2}\Pr_{11}(d\,d')
 =\sum_{d,d'\in D}\Pr_1(d)\,\Pr_1(d')\cr
&=\sum_{d\in D}\Pr_1(d)\,\sum_{d'\in D}\Pr_1(d')=1\cdot1=1\,.\cr
\enddisplay
We can also consider the case of one fair die and one loaded die,
\begindisplay
\Pr_{01}(d\,d')=\Pr_0(d)\,\Pr_1(d')\,,\qquad
\hbox{where $\Pr_0(d)={1\over6}$},\eqno\eqref|pr2-def|
\enddisplay
in which case $\Pr_{01}(\diesix\diethree)={1\over6}\cdt{1\over8}={1\over48}$.
Dice in the ``real world'' can't really be expected to turn up equally often
\g If all sides of a cube were identical, how could we tell which
side is face up?\g
on each side, because they aren't perfectly symmetrical; but $1\over6$ is
usually pretty close to the truth.

An {\it"event"\/} is a subset of $\Omega$.
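The product rule \eq(|pr1-def|) is easy to check numerically. The following sketch (Python, not part of the book's source; dice faces written as the integers 1 through 6, all helper names my own) rebuilds $\Pr_{11}$ and $\Pr_{01}$ and confirms the values $1\over32$ and $1\over48$ computed above:

```python
from fractions import Fraction

# Loaded-die law Pr_1 and fair-die law Pr_0, faces as integers 1..6.
pr1 = {1: Fraction(1, 4), 6: Fraction(1, 4),
       2: Fraction(1, 8), 3: Fraction(1, 8),
       4: Fraction(1, 8), 5: Fraction(1, 8)}
pr0 = {d: Fraction(1, 6) for d in range(1, 7)}

def product_dist(pa, pb):
    """Law on ordered pairs dd', via Pr(dd') = pa(d) * pb(d')."""
    return {(d, e): pa[d] * pb[e] for d in pa for e in pb}

pr11 = product_dist(pr1, pr1)   # both dice loaded
pr01 = product_dist(pr0, pr1)   # one fair die, one loaded die

# Each product law distributes a total probability of exactly 1.
assert sum(pr11.values()) == 1 and sum(pr01.values()) == 1
assert pr11[(6, 3)] == Fraction(1, 32)
assert pr01[(6, 3)] == Fraction(1, 48)
```

Exact rational arithmetic via `fractions.Fraction` leaves no floating-point doubt that the totals sum to exactly $1$.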
In dice games, for example,
the set
\begindisplay
\{\,\dieone\dieone,\,\dietwo\dietwo,\,\diethree\diethree,
\,\diefour\diefour,\,\diefive\diefive,\,\diesix\diesix\,\}
\enddisplay
is the event that ``doubles are thrown.\qback''  The individual
elements~$\omega$ of~$\Omega$
are called {\it"elementary events"\/} because they cannot be decomposed
into smaller subsets; we can think of $\omega$ as a one-element
event~$\{\omega\}$.

The probability of an
event~$A$ is defined by the formula
\begindisplay
\Pr\prp(\omega\in A)=\sum_{\omega\in A}\Pr(\omega)\,;\eqno\eqref|prob-def|
\enddisplay
and in general if $R(\omega)$ is any statement about $\omega$, we write
`$\Pr\bigl(R(\omega)\bigr)$' for the sum of all $\Pr(\omega)$ such that
$R(\omega)$ is true. Thus, for example, the
probability of doubles with fair dice is ${1\over36}+{1\over36}+{1\over36}
+{1\over36}+{1\over36}+{1\over36}={1\over6}$; but
when both dice are loaded with probability distribution~$\Pr_1$
it is ${1\over16}+{1\over64}+{1\over64}+{1\over64}+{1\over64}+{1\over16}
={3\over16}>{1\over6}$. Loading the dice makes
the event ``doubles are thrown'' more probable.

(We have been using $\sum$-notation in a more general sense here than
defined in Chapter~2: The sums in
\eq(|all-probs|) and \eq(|prob-def|) occur over all elements~$\omega$ of an
arbitrary set, not over integers only. However, this new development
is not really alarming; we can agree to
use special notation under a $\sum$
whenever nonintegers are intended, so there will be no confusion with
our ordinary conventions. The other definitions in Chapter~2 are still valid;
in particular, the definition of infinite sums in that chapter
gives the appropriate
interpretation to our sums when
the set~$\Omega$ is infinite.
Each probability is nonnegative, and the
sum of all probabilities is bounded, so the probability of event~$A$ in
\eq(|prob-def|) is well defined for all subsets $A\subseteq\Omega$.)

A {\it"random variable"\/} is a function defined on the elementary
events~$\omega$ of a probability space. For example, if $\Omega=D^2$
we can define $S(\omega)$ to be the sum of the spots on the dice roll~$\omega$,
so that $S(\diesix\diethree)=6+3=9$. The probability that the spots total seven
is the probability of the event $S(\omega)=7$, namely
\begindisplay \openup3pt
&\Pr(\dieone\diesix)+\Pr(\dietwo\diefive)+\Pr(\diethree\diefour)\cr
&\qquad+\Pr(\diefour\diethree)+\Pr(\diefive\dietwo)+\Pr(\diesix\dieone)\,.
\enddisplay
With fair dice ($\Pr=\Pr_{00}$), this happens with probability~$1\over6$;
but with loaded dice ($\Pr=\Pr_{11}$), it happens with probability
${1\over16}+{1\over64}+{1\over64}+{1\over64}+{1\over64}+{1\over16}
={3\over16}$, the same as we observed
for doubles.

It's customary to drop the `$(\omega)$' when we talk about random variables,
because there's usually only one probability space involved when we're
working on any particular problem. Thus we say simply `$S=7$' for the
event that a $7$ was rolled, and `$S=4$' for the event
$\{\,\dieone\diethree,\,\dietwo\dietwo,\,\diethree\dieone\,\}$.

A random variable can be characterized by the probability distribution of
its values.
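Both the sevens probability just computed and the doubles probability computed earlier can be spot-checked by brute force over the 36 pairs (a Python sketch, illustrative only; faces written as integers, helper names my own):

```python
from fractions import Fraction

# Fair (Pr_0) and loaded (Pr_1) single-die laws, faces as integers 1..6.
pr0 = {d: Fraction(1, 6) for d in range(1, 7)}
pr1 = {1: Fraction(1, 4), 6: Fraction(1, 4),
       2: Fraction(1, 8), 3: Fraction(1, 8),
       4: Fraction(1, 8), 5: Fraction(1, 8)}

def prob(event, pa, pb):
    """Pr(event) = sum of pa(d) * pb(d') over pairs dd' in the event."""
    return sum(pa[d] * pb[e] for d in pa for e in pb if event(d, e))

# Fair dice: both events have probability 1/6; loaded dice: both 3/16.
for pr, answer in [(pr0, Fraction(1, 6)), (pr1, Fraction(3, 16))]:
    assert prob(lambda d, e: d + e == 7, pr, pr) == answer   # S = 7
    assert prob(lambda d, e: d == e, pr, pr) == answer       # doubles
```

That `S=7` and doubles come out equal under both distributions is exactly the coincidence the passage remarks on.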
Thus, for example, $S$~takes on eleven possible values
$\{2,3,\ldots,12\}$, and we can tabulate the probability that $S=s$
for each $s$ in this set:
\begindisplay \let\preamble=\tablepreamble \offinterlineskip
s&&2&3&4&5&6&7&8&9&10&11&12&\kern-1em\kern1pt\cr
\noalign{\hrule}
\omit&height 1pt\cr
\Pr_{00}\prp(S=s)&&{1\over36}&{2\over36}&{3\over36}&{4\over36}&
 {5\over36}&{6\over36}&{5\over36}&{4\over36}&
 {3\over36}&{2\over36}&{1\over36}\cr
\omit&height 2pt\cr
\noalign{\hrule}
\Pr_{11}\prp(S=s)&&{4\over64}&{4\over64}&{5\over64}&{6\over64}&
 {7\over64}&{12\over64}&{7\over64}&{6\over64}&
 {5\over64}&{4\over64}&{4\over64}\cr
\enddisplay
If we're working on a problem that involves only the random variable~$S$
and no other properties of dice, we can compute the answer from these
probabilities alone, without regard to the details of the set $\Omega=D^2$.
In fact, we could define the probability space to be the smaller
set $\Omega=\{2,3,\ldots,12\}$, with whatever probability distribution
$\Pr(s)$ is desired. Then `$S=4$' would be an elementary event.
Thus we can often ignore the underlying probability space~$\Omega$
and work directly with random variables and their distributions.

If two random variables $X$ and $Y$ are defined over the same probability
space~$\Omega$, we can characterize their behavior without knowing
everything about~$\Omega$ if we know the ``"joint distribution"''
\g Just Say No.\g
\begindisplay
\Pr\prp(X=x\ {\rm and}\ Y=y)
\enddisplay
for each $x$ in the range of\/~$X$ and each $y$ in the range of\/~$Y$.
"!independent random variables"
We say that $X$ and $Y$ are {\it independent\/} random variables if
\begindisplay
\Pr\prp(X=x\ {\rm and}\ Y=y)=\Pr\prp(X=x)\cdot\Pr\prp(Y=y)
\eqno\eqref|indep-def|
\enddisplay
for all $x$ and $y$. Intuitively, this means that the value of\/~$X$
has no effect on the value of\/~$Y$.

For example, if $\Omega$ is the set of dice rolls $D^2$, we can let
$S_1$ be the number of spots on the first die and $S_2$ the number
of spots on the second.
Then the random variables $S_1$ and~$S_2$
are independent with respect to each of the probability distributions
$\Pr_{00}$, $\Pr_{11}$, and $\Pr_{01}$ discussed earlier,
because we defined the dice probability for each elementary
event~$dd'$ as a product of a probability for~$S_1=d$ multiplied by
a probability for~$S_2=d'$. We could have defined probabilities differently
so that, say,
\g\vskip19pt A dicey inequality.\g
\begindisplay
\Pr(\dieone\diefive)\,\big/\,\Pr(\dieone\diesix)
 \ne\Pr(\dietwo\diefive)\,\big/\,\Pr(\dietwo\diesix)\,;
\enddisplay
but we didn't do that, because different dice aren't supposed to
influence each other. With our definitions, both of these ratios are
$\Pr\prp(S_2=5)/\Pr\prp(S_2=6)$.

We have defined $S$ to be the sum of the two spot values, $S_1+S_2$.
Let's consider another random variable $P$, the product $S_1S_2$.
Are $S$ and~$P$ independent? Informally, no; if we are told that
$S=2$, we know that $P$ must be~$1$. Formally, no again, because
the independence condition \eq(|indep-def|) fails spectacularly (at least
in the case of fair dice):
For all legal values of $s$ and~$p$, we have $0<\Pr_{00}\prp(S=s)\cdt
\Pr_{00}\prp(P=p)\le{1\over6}\cdt{1\over9}$; this can't equal
$\Pr_{00}\prp(S=s\,{\rm and}\,P=p)$,
which is a multiple of~$1\over36$.

If we want to understand the typical behavior of a given random variable,
we often ask about its ``average'' value.
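Since $\Omega$ has only 36 elements, the independence claims above can be verified exhaustively: condition \eq(|indep-def|) holds for every pair of values of $S_1$ and $S_2$, and fails for $S$ and $P$. A sketch (Python, illustrative; fair dice only, all names my own):

```python
from fractions import Fraction
from collections import defaultdict

# Joint law of a pair of fair dice, faces as integers 1..6.
pr0 = {d: Fraction(1, 6) for d in range(1, 7)}
joint = {(d, e): pr0[d] * pr0[e] for d in pr0 for e in pr0}

def law(f):
    """Distribution of the random variable f(d, e) under the joint law."""
    dist = defaultdict(Fraction)
    for (d, e), p in joint.items():
        dist[f(d, e)] += p
    return dict(dist)

def independent(f, g):
    """Check Pr(f=x and g=y) == Pr(f=x) Pr(g=y) for all values x, y."""
    f_law, g_law = law(f), law(g)
    pair_law = law(lambda d, e: (f(d, e), g(d, e)))
    return all(pair_law.get((x, y), 0) == f_law[x] * g_law[y]
               for x in f_law for y in g_law)

# S1 and S2 are independent; S = S1 + S2 and P = S1 * S2 are not.
assert independent(lambda d, e: d, lambda d, e: e)
assert not independent(lambda d, e: d + e, lambda d, e: d * e)
```

The failing pairs include impossible combinations such as $S=2$ and $P=2$, whose joint probability is $0$ while the product of the separate probabilities is positive.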
But the notion of ``average''
is ambiguous; people generally speak about three different kinds of
averages when a sequence of numbers is given:
\smallskip
\item{$\bullet$}the {\it"mean"\/} (which
is the sum of all values, divided by the number of values);
\item{$\bullet$}the {\it"median"\/} (which
is the middle value, numerically);
\item{$\bullet$}the {\it"mode"\/} (which
is the value that occurs most often).
\smallskip\noindent
For example, the mean of $(3,1,4,1,5)$ is ${3+1+4+1+5\over5}=2.8$; the
median is~$3$; the mode is~$1$.

But probability theorists usually work with random variables instead of with
sequences of numbers, so we want to define the notion of an ``average''
for random variables too.
Suppose we repeat an experiment over and over again, making independent
trials in such a way that each value of\/~$X$ occurs with a frequency
approximately proportional to its probability. (For example, we might
roll a pair of dice many times, observing the values of $S$ and/or $P$.)
We'd like to define the average value of a random variable so that such
experiments will usually produce a sequence of numbers whose mean, median,
or mode is approximately the same as the mean, median, or mode of\/~$X$,
according to our definitions.
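The three kinds of average for a sequence of numbers can be checked directly on the example $(3,1,4,1,5)$ (a Python sketch, not part of the book's source; the `median` helper handles only odd-length sequences, as in that example):

```python
from collections import Counter

def mean(seq):
    """Sum of all values, divided by the number of values."""
    return sum(seq) / len(seq)

def median(seq):
    """Middle value, numerically (odd-length sequences only)."""
    return sorted(seq)[len(seq) // 2]

def mode(seq):
    """Set of values that occur most often."""
    counts = Counter(seq)
    top = max(counts.values())
    return {x for x, c in counts.items() if c == top}

data = (3, 1, 4, 1, 5)
assert mean(data) == 2.8
assert median(data) == 3
assert mode(data) == {1}
```

Returning a set from `mode` anticipates the definitions that follow, where the median and mode of a random variable are also sets of values.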
Here's how it can be done: The {\it"mean"\/}
of a random real-valued variable\/~$X$ on a probability space~$\Omega$
is defined to be
\begindisplay
\sum_{x\in X(\Omega)}x\cdt\Pr\prp(X=x)\eqno\eqref|mean1|
\enddisplay
if this potentially infinite sum exists.
(Here $X(\Omega)$ stands for the set of all values that $X$ can assume.)
The {\it"median"\/} of\/~$X$ is defined to be the set of all~$x$ such that
\begindisplay
\Pr\prp(X\le x)\ge\half\And\Pr\prp(X\ge x)\ge\half\,.\eqno\eqref|median1|
\enddisplay
And the {\it"mode"\/} of\/~$X$ is defined to be the set of all~$x$ such that
\begindisplay
\Pr\prp(X=x)\ge\Pr\prp(X=x')\qquad\hbox{for all $x'\in X(\Omega)$}.\eqno
\enddisplay
In our dice-throwing example, the mean of $S$ turns out to be
$2\cdt{1\over36}+3\cdt{2\over36}+\cdots+12\cdt{1\over36}=7$ in distribution
$\Pr_{00}$, and it also turns out to be~$7$ in distribution~$\Pr_{11}$.
The median and mode both turn out to be $\{7\}$ as well, in both
distributions. So $S$ has the same average under all three definitions.
On the other hand the
$P$ in distribution~$\Pr_{00}$ turns out to have a mean value of
${49\over4}=12.25$; its median
is~$\{10\}$, and its mode is $\{6,12\}$. The mean of\/~$P$ is unchanged if
we load the dice with distribution $\Pr_{11}$, but the median drops to~$\{8\}$
and the mode becomes $\{6\}$ alone.

Probability theorists have a special name and notation for the mean
of a random variable: They call it the {\it"expected value"}, and write
\begindisplay
EX=\sum_{\omega\in\Omega}X(\omega)\Pr(\omega)\,.
