<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><!--Converted with LaTeX2HTML 2002-2 (1.70)original version by: Nikos Drakos, CBLU, University of Leeds* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan* with significant contributions from: Jens Lippmann, Marek Rouchal, Martin Wilck and others --><HTML><HEAD><TITLE>The Smoothed Bootstrap</TITLE><META NAME="description" CONTENT="The Smoothed Bootstrap"><META NAME="keywords" CONTENT="web1"><META NAME="resource-type" CONTENT="document"><META NAME="distribution" CONTENT="global"><META NAME="Generator" CONTENT="LaTeX2HTML v2002-2"><META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css"><LINK REL="STYLESHEET" HREF="web1.css"><LINK REL="previous" HREF="node19.html"><LINK REL="up" HREF="node6.html"></HEAD><BODY ><!--Navigation Panel--><IMG WIDTH="81" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next_inactive" SRC="file:/home/depot/swtree/depot/latex2html-2002-2/latex2html-2002-2/icons/nx_grp_g.png"> <A NAME="tex2html422" HREF="node6.html"><IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" SRC="file:/home/depot/swtree/depot/latex2html-2002-2/latex2html-2002-2/icons/up.png"></A> <A NAME="tex2html420" HREF="node19.html"><IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" SRC="file:/home/depot/swtree/depot/latex2html-2002-2/latex2html-2002-2/icons/prev.png"></A> <BR><B> Up:</B> <A NAME="tex2html423" HREF="node6.html">Lectures</A><B> Previous:</B> <A NAME="tex2html421" HREF="node19.html">Confidence Intervals</A><BR><BR><!--End of Navigation Panel--><!--Table of Child-Links--><A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></A><UL><LI><A NAME="tex2html424" HREF="node20.html#SECTION002141000000000000000">Smoothing- a crash course</A><LI><A NAME="tex2html425" HREF="node20.html#SECTION002142000000000000000">Smoothing for variance stabilization</A></UL><!--End of Table of Child-Links--><HR><H1><A NAME="SECTION002140000000000000000">The Smoothed Bootstrap</A></H1>We have seen how the 
parametric bootstrap and the nonparametric bootstrap differ in what is plugged
into the statistical functional.
<P>
We want to estimate <IMG WIDTH="54" HEIGHT="37" ALIGN="MIDDLE" BORDER="0" SRC="img278.png" ALT="$\lambda_n(F)$">, and we can use
as an estimate either <!-- MATH $\lambda_n(F_{\hat{\theta}})$ --><IMG WIDTH="59" HEIGHT="37" ALIGN="MIDDLE" BORDER="0" SRC="img279.png" ALT="$\lambda_n(F_{\hat{\theta}})$">
or
<!-- MATH $\lambda_n(\hat{F}_n)$ --><IMG WIDTH="60" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img280.png" ALT="$\lambda_n(\hat{F}_n)$">.
In fact there is an intermediate choice: take the empirical cdf <IMG WIDTH="26" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img16.png" ALT="$\hat{F}_n$">,
smooth it a little, and plug in the smoothed empirical cdf, denoted by <IMG WIDTH="25" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img281.png" ALT="$\hat{F}_h$">.
<P>
This is especially useful when the bootstrap distribution is too
discrete, which happens mostly when the statistic <IMG WIDTH="14" HEIGHT="22" ALIGN="BOTTOM" BORDER="0" SRC="img212.png" ALT="$\hat{\theta}$"> is a quantile.
The median, as we saw in the mouse data analysis, had that problem.
<P>
<H2><A NAME="SECTION002141000000000000000">Smoothing- a crash course</A></H2>
<BODY bgcolor="#FFFFFF">
Suppose we have a two-dimensional scatterplot that we want to smooth.
This could arise in a histogram or a regression-type context; both have the same form.
The simplest case to start with is when the <IMG WIDTH="15" HEIGHT="16" ALIGN="BOTTOM" BORDER="0" SRC="img282.png" ALT="$x$"> abscissae, although ordinal, are discrete,
such as ages rounded to decades.
Then the y data appear along lines at the possible <IMG WIDTH="15" HEIGHT="16" ALIGN="BOTTOM" BORDER="0" SRC="img282.png" ALT="$x$">'s.
<P>
<IMG SRC="smooth1.gif" width=400>
<P>
The crosses, which are the conditional averages, are a smooth of the scatterplot in some way.
<P>
Now suppose that the x's could be all over the place; we window them and take local averages.
<P>
The extreme case is when 
you take the whole x axis as the window:
then there is only one average, and if you want you can draw a line through it.
<P>
<IMG SRC="smooth2.gif" width=400>
<P>
When the window is as small as possible, there is no smoothing.
<P>
<IMG SRC="smooth3.gif" width=400>
<P>
Again, we want something gentler, so we reduce the window width
and take only local averages.
If, within a window, we want to distinguish the points that are
close to the abscissa at which we are estimating the <IMG WIDTH="14" HEIGHT="33" ALIGN="MIDDLE" BORDER="0" SRC="img283.png" ALT="$y$"> value
by averaging, we can use a kernel weighting function.
<P>
Points that are close are given high weights,
points farther away are given lighter weights,
and points on the boundary of the window do not count.
<P>
The weighting function is such that the weights sum to 1.
If all the weights are equal, the weighting is uniform.
In fact the weighting function can be a probability
density, and often we take a Normal one.
<P>
Here is a nice <A NAME="tex2html38" HREF="http://www.datatool.com/prod01.htm">webpage</A>
on smoothing, with Matlab software available.
<P>
<DIV ALIGN="CENTER"><B>Curve Fitting Example, Efron & Tibshirani, 7.3</B></DIV>
<P>
<TABLE CELLPADDING=3>
<TR><TD ALIGN="CENTER"><IMG SRC="loesscompli2.jpg" width=300></TD><TD ALIGN="CENTER"><IMG SRC="bootcompli.jpg" width=300></TD></TR>
<TR><TD ALIGN="CENTER"><IMG SRC="bootcompli2.jpg" width=300></TD><TD ALIGN="CENTER"><IMG SRC="loesscomplici.jpg" width=300></TD></TR>
</TABLE>
<P>
loess.m is available in the course directory, and
loess is a built-in function in Splus.<BR>
<P>
Matlab procedure for bootstrapping the loess curve.
<PRE>
% N is the number of bootstrap replications.
N=500;
predmat=zeros(N, 101);
datasize=size(cholo,1);
clf;
plot(cholo(:,1), cholo(:,2), '.');
hold on;
for i=1:N
  % Resample the rows of the data with replacement.
  xind=unidrnd(datasize, datasize, 1);
  x=cholo(xind,:);
  % Fit loess to the resample and evaluate it on the grid 0:100.
  predmat(i,:)=loess(x(:,1), x(:,2), (0:100), .3, 1);
  plot((0:100), predmat(i,:), '-.');  % Plot a sample bootstrap curve.
end;
% Plot the 95% pointwise confidence lines.
plot((0:100), prctile(predmat, 2.5), 'r-');
plot((0:100), prctile(predmat, 97.5), 'r-');
xlabel('Compliance');
ylabel('Improvement');
axis([-5, 105, -40, 120]);
</PRE>
<P>
<H2><A NAME="SECTION002142000000000000000">Smoothing for variance stabilization</A></H2>
Pages 164-166.
Algorithm:
<OL>
<LI>Generate <IMG WIDTH="26" HEIGHT="35" ALIGN="MIDDLE" BORDER="0" SRC="img284.png" ALT="$B_1$"> bootstrap samples <!-- MATH $\mbox{${\cal X}$}_b^*$ --><IMG WIDTH="29" HEIGHT="37" ALIGN="MIDDLE" BORDER="0" SRC="img285.png" ALT="$\mbox{${\cal X}$}_b^*$">
and compute the bootstrap estimates <!-- MATH $\{\hat{\theta}_b^*,b=1:B_1\}$ --><IMG WIDTH="129" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img286.png" ALT="$\{\hat{\theta}_b^*,b=1:B_1\}$">.
<UL>
<LI>For each b, take <IMG WIDTH="26" HEIGHT="35" ALIGN="MIDDLE" BORDER="0" SRC="img287.png" ALT="$B_2$"> bootstrap resamples and estimate the standard error
<!-- MATH $\hat{se}(\hat{\theta}_b^*)$ --><IMG WIDTH="53" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img288.png" ALT="$\hat{se}(\hat{\theta}_b^*)$">.</LI>
</UL>
</LI>
<LI>Fit a smooth curve to
the pairs <!-- MATH $(\hat{\theta}_b^*,\hat{se}(\hat{\theta}_b^*))$ --><IMG WIDTH="86" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img289.png" ALT="$(\hat{\theta}_b^*,\hat{se}(\hat{\theta}_b^*))$">
to produce a smooth estimate of the function
<!-- MATH $s(u)=se(\hat{\theta}|\theta=u)$ --><IMG WIDTH="155" HEIGHT="45" ALIGN="MIDDLE" BORDER="0" SRC="img290.png" ALT="$s(u)=se(\hat{\theta}\vert\theta=u)$">.</LI>
<LI>Use <!-- MATH $g(x)=\int^x \frac{1}{s(u)}du$ --><IMG WIDTH="136" HEIGHT="40" ALIGN="MIDDLE" BORDER="0" SRC="img291.png" ALT="$g(x)=\int^x \frac{1}{s(u)}du$"> as the variance
stabilizing transformation. 
Find <IMG WIDTH="14" HEIGHT="33" ALIGN="MIDDLE" BORDER="0" SRC="img180.png" ALT="$g$">, usually through numerical
integration.</LI>
<LI>With <IMG WIDTH="26" HEIGHT="35" ALIGN="MIDDLE" BORDER="0" SRC="img292.png" ALT="$B_3$"> bootstrap resamples,
compute a bootstrap-t interval for <!-- MATH $\phi=g(\theta)$ --><IMG WIDTH="74" HEIGHT="37" ALIGN="MIDDLE" BORDER="0" SRC="img293.png" ALT="$\phi=g(\theta)$">.
(The standard error is approximately one, so no denominator is needed.)</LI>
<LI>Map the endpoints of the interval back
through the <IMG WIDTH="43" HEIGHT="42" ALIGN="MIDDLE" BORDER="0" SRC="img294.png" ALT="$g^{(-1)}$"> transformation.</LI>
</OL>
<P>
<PRE>
boott                package:bootstrap                R Documentation

Bootstrap-t Confidence Limits

Description:

     See Efron and Tibshirani (1993) for details on this function.

Usage:

     boott(x,theta, ..., sdfun=MISSING, nbootsd=25, nboott=200,
           VS=FALSE, v.nbootg=100, v.nbootsd=25, v.nboott=200,
           perc=c(.001,.01,.025,.05,.10,.50,.90,.95,.975,.99,.999))

Arguments:

       x: a vector containing the data. Nonparametric bootstrap
          sampling is used. To bootstrap from more complex data
          structures (e.g. bivariate data) see the last example below.

   theta: function to be bootstrapped. Takes 'x' as an argument, and
          may take additional arguments (see below and last example).

     ...: any additional arguments to be passed to 'theta'

   sdfun: optional name of function for computing standard deviation
          of 'theta' based on data 'x'. Should be of the form:
          'sdmean <- function(x,nbootsd,theta,...)' where 'nbootsd' is
          a dummy argument that is not used. If 'theta' is the mean,
          for example,
          'sdmean <- function(x,nbootsd,theta,...) {sqrt(var(x)/length(x))}'.
          If 'sdfun' is missing, then 'boott' uses an inner bootstrap
          loop to estimate the standard deviation of 'theta(x)'

 nbootsd: The number of bootstrap samples used to estimate the
          standard deviation of 'theta(x)'

  nboott: The number of bootstrap samples used to estimate the
          distribution of the bootstrap T statistic. 200 is a bare
          minimum and 1000 or more is needed for reliable alpha %
          confidence points, alpha > .95 say. Total number of
          bootstrap samples is 'nboott*nbootsd'.

      VS: If 'TRUE', a variance stabilizing transformation is
          estimated, and the interval is constructed on the
          transformed scale, and then is mapped back to the original
          theta scale. This can improve both the statistical
          properties of the intervals and speed up the computation.
          See the reference Tibshirani (1988) given below. If 'FALSE',
          variance stabilization is not performed.

v.nbootg: The number of bootstrap samples used to estimate the
          variance stabilizing transformation g. Only used if
          'VS=TRUE'.

v.nbootsd: The number of bootstrap samples used to estimate the
          standard deviation of 'theta(x)'. Only used if 'VS=TRUE'.

v.nboott: The number of bootstrap samples used to estimate the
          distribution of the bootstrap T statistic. Only used if
          'VS=TRUE'. Total number of bootstrap samples is
          'v.nbootg*v.nbootsd + v.nboott'.

    perc: Confidence points desired.

Value:

     list with the following components:

confpoints: Estimated confidence points

theta, g: 'theta' and 'g' are only returned if 'VS=TRUE' was
          specified. '(theta[i],g[i]), i=1,length(theta)' represents
          the estimate of the variance stabilizing transformation 'g'
          at the points 'theta[i]'.

References:

     Tibshirani, R. (1988) "Variance stabilization and the bootstrap".
     Biometrika (1988) vol 75 no 3 pages 433-44.

     Hall, P. (1988) Theoretical comparison of bootstrap confidence
     intervals. Ann. Statist. 16, 1-50.

     Efron, B. and Tibshirani, R. (1993) An Introduction to the
     Bootstrap. Chapman and Hall, New York, London.

Examples:

     # estimated confidence points for the mean
     x <- rchisq(20,1)
     theta <- function(x){mean(x)}
     results <- boott(x,theta)

     # estimated confidence points for the mean,
     # using variance-stabilization bootstrap-T method
     results <- boott(x,theta,VS=TRUE)
     results$confpoints          # gives confidence points

     # plot the estimated var stabilizing transformation
     plot(results$theta,results$g)

     # use standard formula for stand dev of mean
     # rather than an inner bootstrap loop
     sdmean <- function(x, ...) {sqrt(var(x)/length(x))}
     results <- boott(x,theta,sdfun=sdmean)

     # To bootstrap functions of more complex data structures,
     # write theta so that its argument x
     # is the set of observation numbers
     # and simply pass as data to boott the vector 1,2,..n.
     # For example, to bootstrap
     # the correlation coefficient from a set of 15 data pairs:
     xdata <- matrix(rnorm(30),ncol=2)
     n <- 15
     theta <- function(x, xdata){ cor(xdata[x,1],xdata[x,2]) }
     results <- boott(1:n,theta, xdata)
</PRE>
<P>
<HR>
<ADDRESS>
Susan Holmes
2004-05-19
</ADDRESS>
</BODY>
</HTML>