\begindisplay
H(z)=F(z)@G(z)\,,
\enddisplay
and our formulas \eq(|pgf1|) through \eq(|pgf-var|) for mean and variance
tell us that we must have
\begindisplay
\Mean(H)&=\Mean(F)+\Mean(G)\,;\eqno\eqref|ind-sum1|\cr
\Var(H)&=\Var(F)+\Var(G)\,.\eqno\eqref|ind-sum2|\cr
\enddisplay
These formulas, which are properties of the derivatives $\Mean(H)=H'(1)$
and $\Var(H)=H''(1)+H'(1)-H'(1)^2$, aren't valid for arbitrary function
products $H(z)=F(z)@G(z)$; we have
\begindisplay
H'(z)&=F'(z)@G(z)+F(z)@G'(z)\,,\cr
H''(z)&=F''(z)@G(z)+2F'(z)@G'(z)+F(z)@G''(z)\,.
\enddisplay
But if we set $z=1$, we can see that \eq(|ind-sum1|) and \eq(|ind-sum2|)
will be valid in general provided only that
\begindisplay
F(1)=G(1)=1
\eqno\enddisplay
and that the derivatives exist. The ``probabilities'' don't have to be
in $[@0\dts1]$ for these formulas to hold. We can normalize the functions
$F(z)$ and $G(z)$ by dividing through by $F(1)$ and $G(1)$ in order to
make this condition valid, whenever $F(1)$ and $G(1)$ are nonzero.

Mean and variance aren't the whole story. They are merely two of an
infinite series of so-called {\it"cumulant"\/} statistics introduced
\g I'll graduate magna cum ulant.\g
by the Danish astronomer Thorvald Nicolai "Thiele"~[|thiele|] in 1903.
The first two cumulants $\kappa_1$ and $\kappa_2$ of a random variable
are what we have called the mean and the variance; there also are
higher-order cumulants that express more subtle properties of a
distribution. The general formula
\begindisplay
\ln G(e^t)={\kappa_1\over1!}t+{\kappa_2\over2!}t^2+{\kappa_3\over3!}t^3
 +{\kappa_4\over4!}t^4+\cdots\eqno\eqref|cum-def|
\enddisplay
defines the cumulants of all orders, when $G(z)$ is the pgf of a random
variable.

Let's look at cumulants more closely. If $G(z)$ is the pgf for $X$, we have
\begindisplay\openup3pt
G(e^t)&=\sum_{k\ge0}\Pr\prp(X=k)e^{kt}
 &=\sum_{k,m\ge0}\Pr\prp(X=k){k^mt^m\over m!}\cr
&&=1+{\mu_1\over1!}t+{\mu_2\over2!}t^2+{\mu_3\over3!}t^3+\cdots\,,\eqno\cr
\enddisplay
where
\begindisplay
\mu_m&=\sum_{k\ge0}k^m\Pr\prp(X=k)=E(X^m)\,.\eqno\eqref|moment-def|\cr
\enddisplay
This quantity $\mu_m$ is called the ``$m$th "moment"'' of\/~$X$. We can
take exponentials on both sides of \eq(|cum-def|), obtaining another
formula for $G(e^t)$:
\begindisplay\openup3pt
G(e^t)&=1+{(\kappa_1t+\half\kappa_2t^2+\cdots\,)\over1!}
 +{(\kappa_1t+\half\kappa_2t^2+\cdots\,)^2\over2!}+\cdots\cr
&=1+\kappa_1t+\textstyle\half(\kappa_2+\kappa_1^2)t^2+\cdots\,.
\enddisplay
Equating coefficients of powers of $t$ leads to a series of formulas%
\g You forgot the coefficients of~$t^0$.\g
\begindisplay
\kappa_1&=\mu_1\,,\eqno\cr
\kappa_2&=\mu_2-\mu_1^2\,,\eqno\cr
\kappa_3&=\mu_3-3\mu_1\mu_2+2\mu_1^3\,,\eqno\cr
\kappa_4&=\mu_4-4\mu_1\mu_3+12\mu_1^2\mu_2-3\mu_2^2-6\mu_1^4\,,\eqno\cr
\kappa_5&=\mu_5-5\mu_1\mu_4+20\mu_1^2\mu_3-10\mu_2\mu_3\cr
&\hskip7em+30\mu_1\mu_2^2-60\mu_1^3\mu_2+24\mu_1^5\,,\eqno\cr
\noalign{\vskip-3pt}
&\quad\vdots
\enddisplay
defining the cumulants in terms of the moments. Notice that $\kappa_2$
is indeed the variance, $E(X^2)-(EX)^2$, as claimed.

Equation \eq(|cum-def|) makes it clear that the cumulants defined by
\g \noindent\llap{``}For these higher half-invariants we shall propose
no special names.''\par
\hfill\kern-4pt\dash---T.\thinspace N.\thinspace"Thiele"\thinspace[|thiele|]\g
the product $F(z)@G(z)$ of two pgf's will be the sums of the corresponding
cumulants of $F(z)$ and $G(z)$, because logarithms of products are sums.
Therefore all cumulants of the sum of independent random variables are
additive, just as the mean and variance are. This property makes cumulants
more important than moments.
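As a small check of the formulas above, suppose $X$ is equally likely to
be $0$ or~$1$, so that $\mu_m=\half$ for every $m\ge1$; then they give
\begindisplay
\kappa_1&=\half\,,\qquad\kappa_2=\half-{1\over4}={1\over4}\,,\cr
\kappa_3&=\half-{3\over4}+{1\over4}=0\,,\qquad
\kappa_4=\half-1+{3\over2}-{3\over4}-{3\over8}=-{1\over8}\,.\cr
\enddisplay
And by the additivity just noted, the sum of two independent copies
of~$X$ has cumulants exactly twice as large.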

If we take a slightly different tack, writing
\begindisplay
G(1+t)=1+{\alpha_1\over1!}t+{\alpha_2\over2!}t^2+{\alpha_3\over3!}t^3+\cdots\,,
\enddisplay
equation \eq(|derivs-at-1|) tells us that the $\alpha$'s are the
``factorial moments''
\begindisplay\openup3pt
\alpha_m&=G^{(m)}(1)\cr
&=\sum_{k\ge0}\Pr\prp(X=k)k\_m\,z^{k-m}\,\big\vert_{z=1}\cr
&=\sum_{k\ge0}k\_m\Pr\prp(X=k)\cr
&=E(X\_m)\,.\eqno
\enddisplay
It follows that
\begindisplay\openup3pt
G(e^t)&=1+{\alpha_1\over1!}(e^t-1)+{\alpha_2\over2!}(e^t-1)^2+\cdots\cr
&=1+{\alpha_1\over1!}(t+{\textstyle\half} t^2+\cdots\,)
 +{\alpha_2\over2!}(t^2+t^3+\cdots\,)+\cdots\cr
&=\textstyle1+\alpha_1t+\half(\alpha_2+\alpha_1)t^2+\cdots\,,
\enddisplay
and we can express the cumulants in terms of the derivatives $G^{(m)}(1)$:
\begindisplay
\kappa_1&=\alpha_1\,,\eqno\cr
\kappa_2&=\alpha_2+\alpha_1-\alpha_1^2\,,\eqno\cr
\kappa_3&=\alpha_3+3\alpha_2+\alpha_1-3\alpha_2\alpha_1-3\alpha_1^2+2\alpha_1^3\,,\eqno\cr
&\quad\vdots
\enddisplay
This sequence of formulas yields ``additive'' identities that extend
\eq(|ind-sum1|) and \eq(|ind-sum2|) to all the cumulants.

Let's get back down to earth and apply these ideas to simple examples.
The simplest case of a random variable is a ``random constant,\qback''
where $X$~has a certain fixed value~$x$ with probability~$1$. In this
case $G_X(z)=z^x$, and $\ln G_X(e^t)=xt$; hence the mean is~$x$ and all
other cumulants are zero. It follows that the operation of multiplying
any pgf by $z^x$ increases the mean by~$x$ but leaves the variance and
all other cumulants unchanged.
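As a quick check of these identities, the random constant just discussed
has factorial moments $\alpha_m=G_X^{(m)}(1)=x\_m$, so
\begindisplay
\kappa_1&=x\,,\cr
\kappa_2&=x\_2+x-x^2=x(x-1)+x-x^2=0\,,\cr
\enddisplay
and $\kappa_3$ vanishes in the same way, in agreement with what we found
directly from $\ln G_X(e^t)=xt$.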

How do probability generating functions apply to dice? The distribution
of spots on one fair die has the pgf
\begindisplay
G(z)={z+z^2+z^3+z^4+z^5+z^6\over6}=zU_6(z)\,,
\enddisplay
where $U_6$ is the pgf for the uniform distribution of order~$6$. The
factor~`$z$' adds $1$ to the mean, so the mean is $3.5$ instead of
${n-1\over2}=2.5$ as given in~\eq(|unif-mean|); but an extra `$z$' does
not affect the variance \eq(|unif-var|), which equals $35\over12$.

The pgf for total spots on two independent dice is the square of the pgf
for spots on one die,
\begindisplay \advance\medmuskip-.6mu
G_S(z)&={z^2+2z^3+3z^4+4z^5+5z^6+6z^7+5z^8+4z^9+3z^{10}+2z^{11}+z^{12}\over36}\cr
&=z^2U_6(z)^2\,.
\enddisplay
If we roll a pair of fair dice $n$ times, the probability that we get a
total of $k$~spots overall is, similarly,
\begindisplay\openup2pt
[z^k]\,G_S(z)^n&=[z^k]\,z^{2n}U_6(z)^{2n}\cr
&=[z^{k-2n}]\,U_6(z)^{2n}\,.
\enddisplay

In the hats-off-to-football-victory problem considered earlier,
\g Hat distribution is a different kind of uniform distribution.\g
otherwise known as the problem of enumerating the fixed points of a
random permutation, we know from \equ(5.|subfactorial-rec|) that the
pgf is
\begindisplay
F_n(z)=\sum_{0\le k\le n}{(n-k)\?\over(n-k)!}{z^k\over k!}\,,
\qquad\hbox{for $n\ge0$}.\eqno\eqref|football-pgf|
\enddisplay
Therefore
\begindisplay
F_n'(z)&=\sum_{1\le k\le n}{(n-k)\?\over(n-k)!}{z^{k-1}\over(k-1)!}\cr
&=\sum_{0\le k\le n-1}{(n-1-k)\?\over(n-1-k)!}{z^k\over k!}\cr
\noalign{\smallskip}
&=F_{n-1}(z)\,.\cr
\enddisplay
Without knowing the details of the coefficients, we can conclude from
this recurrence $F'_n(z)=F_{n-1}(z)$ that $F_n^{(m)}(z)=F_{n-m}(z)$; hence
\begindisplay
F_n^{(m)}(1)=F_{n-m}(1)=\[n\ge m]\,.\eqno
\enddisplay
This formula makes it easy to calculate the mean and variance; we find
as before (but more quickly) that they are both equal to~$1$ when $n\ge2$.

In fact, we can now show that the $m$th cumulant $\kappa_m$ of this
random variable is equal to~$1$ whenever $n\ge m$. For the $m$th cumulant
depends only on $F'_n(1)$, $F''_n(1)$, \dots,~$F^{(m)}_n(1)$, and these
are all equal to~$1$; hence we obtain the same answer for the $m$th
cumulant as we do when we replace $F_n(z)$ by the limiting pgf
\begindisplay
F_\infty(z)=e^{z-1}\,,\eqno\eqref|football-infty|
\enddisplay
which has $F_\infty^{(m)}(1)=1$ for derivatives of all orders. The
cumulants of $F_\infty$ are identically equal to~$1$, because
\begindisplay
\ln F_\infty(e^t)=\ln e^{e^t-1}=e^t-1
={t\over1!}+{t^2\over2!}+{t^3\over3!}+\cdots\,.
\enddisplay

\beginsection 8.4 Flipping coins

Now let's turn to processes that have just two outcomes. If we flip a
coin, there's probability $p$ that it comes up
\g Con artists know that\/ $p\approx0.1$ when you spin a
newly minted U.S. penny on a~smooth table. "!cheating"
(The weight distribution makes "Lincoln"'s head fall downward.)\g
heads and probability $q$ that it comes up tails, where
\begindisplay
p+q=1\,.
\enddisplay
(We assume that the coin doesn't come to rest on its edge, or fall into
a hole, etc.) Throughout this section, the numbers $p$ and~$q$ will
always sum to~$1$. If the coin is {\it fair},
"!fair coin" "!biased coin"
we have $p=q=\half$; otherwise the coin is said to be {\it biased}.

The probability generating function for the number of heads after one
toss of a coin is
\begindisplay
H(z)=q+pz\,.\eqno
\enddisplay
If we toss the coin $n$ times, always assuming that different coin
tosses are independent, the number of heads is generated by
\begindisplay
H(z)^n=(q+pz)^n=\sum_{k\ge0}{n\choose k}p^kq^{n-k}z^k\,,\eqno\eqref|binom-pgf|
\enddisplay
according to the binomial theorem. Thus, the chance that we obtain
exactly~$k$ heads in $n$ tosses is ${n\choose k}p^kq^{n-k}$. This
sequence of probabilities is called the {\it"binomial distribution"}.
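For example, when $n=2$ the pgf is
\begindisplay
H(z)^2=(q+pz)^2=q^2+2pqz+p^2z^2\,,
\enddisplay
so two tosses yield $0$, $1$, or $2$ heads with respective probabilities
$q^2$, $2pq$, and~$p^2$; a fair coin gives ${1\over4}$, $\half$,
and~${1\over4}$.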

Suppose we toss a coin repeatedly until heads first turns up. What is
the probability that exactly $k$~tosses will be required? We have $k=1$
with probability~$p$ (since this is the probability of heads on the
first flip); we have $k=2$ with probability $qp$ (since this is the
probability of tails first, then heads); and for general~$k$ the
probability is $q^{k-1}p$. So the generating function is
\begindisplay
pz+qpz^2+q^2pz^3+\cdots\,={pz\over1-qz}\,.\eqno\eqref|geom-pgf|
\enddisplay
Repeating the process until $n$ heads are obtained gives the pgf
"!Bernoulli trials, see coin flipping"
\begindisplay
\left( pz\over1-qz\right)^{\!n}&=p^nz^n\sum_k{n+k-1\choose k}(qz)^k\cr
&=\sum_k{k-1\choose k-n}p^nq^{k-n}z^k\,.\eqno\eqref|bernoulli-trials|
\enddisplay
This, incidentally, is $z^n$ times
\begindisplay
\left(p\over1-qz\right)^{\!n}=\sum_k{n+k-1\choose k}p^nq^kz^k\,,\eqno\eqref|negbinom-pgf|
\enddisplay
the generating function for the {\it"negative binomial distribution"}.
%(The special case $n=1$ of a negative binomial distribution is called
%a {\it"geometric distribution"}, because the corresponding generating
%function is a geometric series.)

The probability space in example \eq(|bernoulli-trials|), where we flip
a coin until $n$~heads have appeared, is different from the probability
spaces we've seen earlier in this chapter, because it contains
infinitely many elements. Each element is a finite sequence of heads
and/or tails, containing precisely $n$ heads in all, and
\g Heads I win,\par tails you lose.\smallskip
No? OK; tails you lose, heads I~win.\smallskip
No? Well, then,\par heads you lose,\par tails I win.\g
ending with heads; the probability of such a sequence is $p^nq^{k-n}$,
where $k-n$ is the number of tails. Thus, for example, if $n=3$ and if
we write $\tt H$ for heads and $\tt T$ for tails, the sequence
$\tt THTTTHH$ is an element of the probability space, and its
probability is $qpqqqpp=p^3q^4$.

Let $X$ be a random variable with the binomial distribution
\eq(|binom-pgf|), and let $Y$ be a random variable with the negative
binomial distribution \eq(|negbinom-pgf|). These distributions depend
on $n$ and~$p$. The mean of\/~$X$ is $nH'(1)=np$, since its pgf is
$H(z)^n$; the variance is
\begindisplay
n\bigl(H''(1)+H'(1)-H'(1)^2\bigr)=n(0+p-p^2)=npq\,.\eqno\eqref|v-binom|
\enddisplay
Thus the standard deviation is $\sqrt{@npq\vphantom1}\,$: If we toss a
coin $n$~times, we expect to get heads about $np\pm\sqrt{@npq\vphantom1}$
times. The mean and variance of $Y$ can be found in a similar way: If
we let
\begindisplay
G(z)={p\over1-qz}\,,
\enddisplay
we have
\begindisplay\openup3pt
G'(z)&={pq\over(1-qz)^2}\,,\cr
G''(z)&={2pq^2\over(1-qz)^3}\,;
\enddisplay
hence $G'(1)=pq/p^2=q/p$ and $G''(1)=2pq^2\!/p^3=2q^2\!/p^2$. It follows
that the mean of\/~$Y$ is $nq/p$ and the variance is $nq/p^2$.

A simpler way to derive the mean and variance of\/~$Y$ is to use the
reciprocal generating function
\begindisplay
F(z)={1-qz\over p}={1\over p}-{q\over p}z\,,\eqno
\enddisplay
and to write
\begindisplay
G(z)^n=F(z)^{-n}\,.\eqno
\enddisplay
This polynomial $F(z)$ is not a probability generating function, because
it has a negative coefficient. But it does satisfy the crucial condition
$F(1)=1$.
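Since $G(z)^n=F(z)^{-n}$, we have $\ln G(e^t)^n=-n\ln F(e^t)$, so each
cumulant of~$Y$ is $-n$ times the corresponding quantity computed
from~$F$; the condition $F(1)=1$ is what makes the derivative formulas
for mean and variance applicable. As a quick check of this shortcut,
$F'(1)=-q/p$ and $F''(1)=0$, so
\begindisplay
-nF'(1)&={nq\over p}\,;\cr
-n\bigl(F''(1)+F'(1)-F'(1)^2\bigr)&={nq\over p}+{nq^2\over p^2}
 ={nq(p+q)\over p^2}={nq\over p^2}\,,\cr
\enddisplay
in agreement with the mean and variance of~$Y$ found above.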