[section:st_eg Student's t Distribution Examples]

[section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

Suppose you have a sample mean: you may wish to know what confidence intervals
you can place on that mean.  Colloquially: "I want an interval that I can be
P% sure contains the true mean".  (On a technical point, note that
the interval either contains the true mean or it does not: the
meaning of the confidence level is subtly
different from this colloquialism.  More background information can be found on the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).

The formula for the interval can be expressed as:

[equation dist_tutorial4]

Where, ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
/N/ is the sample size, /[alpha]/ is the desired significance level and
['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
distribution with /N-1/ degrees of freedom.

[note
The quantity [alpha][space] is the maximum acceptable risk of falsely rejecting
the null-hypothesis.  The smaller the value of [alpha] the greater the
strength of the test.

The confidence level of the test is defined as 1 - [alpha], and often expressed
as a percentage.  So, for example, a significance level of 0.05 is equivalent
to a 95% confidence level.  Refer to
[@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
"What are confidence intervals?"] in __handbook for more information.
]
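As a rough illustration of the formula above, the following sketch computes the
two-sided limits from a sample mean, standard deviation, sample size and an
already-known critical value.  The names `Interval` and `mean_interval` are
hypothetical and are not part of Boost.  The critical value is assumed to have
been obtained elsewhere, so the sketch needs only the standard library.

```cpp
#include <cmath>

// Hypothetical helper type for illustration only, not part of Boost.Math.
struct Interval
{
   double half_width; // t * s / sqrt(N)
   double lower;      // mean - half_width
   double upper;      // mean + half_width
};

// Two-sided limits: Ys +/- t * s / sqrt(N).
// t_crit is assumed to be the upper critical value t(alpha/2, N-1),
// obtained elsewhere (for example from a printed table).
Interval mean_interval(double mean, double sd, unsigned n, double t_crit)
{
   double w = t_crit * sd / std::sqrt(static_cast<double>(n));
   return Interval{ w, mean - w, mean + w };
}
```

For example, `mean_interval(9.26146, 0.02278881, 195, 1.972)` uses the figures
from the 95% row of the heat flow output shown later in this section (with the
T value rounded to three decimals) and gives a half-width of about 3.22e-3.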
[/ Note]

[note
The usual assumptions of
[@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
variables and [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
of course apply here, as they do in other examples.
]

From the formula, it should be clear that:

* The width of the confidence interval decreases as the sample size increases.
* The width increases as the standard deviation increases.
* The width increases as the ['confidence level increases] (0.5 towards 0.99999 - stronger).
* The width increases as the ['significance level decreases] (0.5 towards 0.00000...01 - stronger).

The following example code is taken from the example program
[@../../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to calculate intervals for
various confidence levels; the procedure will print these out
as a table:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void confidence_limits_on_mean(
      double Sm,           // Sm = Sample Mean.
      double Sd,           // Sd = Sample Standard Deviation.
      unsigned Sn)         // Sn = Sample Size.
   {
      using namespace std;
      using namespace boost::math;

      // Print out general info:
      cout <<
         "__________________________________\n"
         "2-Sided Confidence Limits For Mean\n"
         "__________________________________\n\n";
      cout << setprecision(7);
      cout << setw(40) << left << "Number of Observations" << "=  " << Sn << "\n";
      cout << setw(40) << left << "Mean" << "=  " << Sm << "\n";
      cout << setw(40) << left << "Standard Deviation" << "=  " << Sd << "\n";

We'll define a table of significance/risk levels for which we'll compute intervals:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Note that these are the complements of the confidence/probability levels
(0.5, 0.75, 0.9 .. 0.99999).

Next we'll declare the distribution object we'll need; note that
the /degrees of freedom/ parameter is the sample size less one:

      students_t dist(Sn - 1);

Most of what follows in the program is pretty printing, so let's focus
on the calculation of the interval. First we need the t-statistic,
computed using the /quantile/ function and our significance level.  Note
that since the significance levels are the complement of the probability,
we have to wrap the arguments in a call to /complement(...)/:

   double T = quantile(complement(dist, alpha[i] / 2));

Note that alpha was divided by two, since we'll be calculating
both the upper and lower bounds: had we been interested in a single
sided interval then we would have omitted this step.
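To make the effect of dividing alpha by two concrete, here is a minimal sketch
that avoids Boost altogether.  It is not how the example program works (that
uses `quantile(complement(dist, alpha[i] / 2))` as shown above).  Instead it
swaps in a closed form that is valid only for the special case of 2 degrees of
freedom, that is, a sample of three observations.  The function names are
hypothetical.

```cpp
#include <cmath>

// Hypothetical helper, not part of Boost.Math: upper critical value of
// Student's t with exactly 2 degrees of freedom, given the upper-tail
// probability q. Uses the closed form t = (1 - 2q) / sqrt(2q(1 - q)),
// which holds only for 2 degrees of freedom.
double t2_upper_critical(double q)
{
   return (1.0 - 2.0 * q) / std::sqrt(2.0 * q * (1.0 - q));
}

// Two-sided interval: the risk alpha is split equally between both tails.
double two_sided_t2(double alpha)
{
   return t2_upper_critical(alpha / 2);
}

// One-sided interval: the whole risk alpha sits in a single tail.
double one_sided_t2(double alpha)
{
   return t2_upper_critical(alpha);
}
```

With alpha = 0.05, `two_sided_t2` gives about 4.303, which agrees with the 95%
row of the three-observation output later in this section.  `one_sided_t2`
gives about 2.920, the same value as the two-sided case at alpha = 0.1.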
Now to complete the picture, we'll get the (one-sided) width of the
interval from the t-statistic
by multiplying by the standard deviation, and dividing by the square
root of the sample size:

   double w = T * Sd / sqrt(double(Sn));

The two-sided interval is then the sample mean plus and minus this width.

And apart from some more pretty-printing that completes the procedure.

Let's take a look at some sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site.  The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.
The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm section 3.5.2]
of the __handbook.

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  =  195
   Mean                                    =  9.26146
   Standard Deviation                      =  0.02278881

   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000     0.676       1.103e-003        9.26036        9.26256
       75.000     1.154       1.883e-003        9.25958        9.26334
       90.000     1.653       2.697e-003        9.25876        9.26416
       95.000     1.972       3.219e-003        9.25824        9.26468
       99.000     2.601       4.245e-003        9.25721        9.26571
       99.900     3.341       5.453e-003        9.25601        9.26691
       99.990     3.973       6.484e-003        9.25498        9.26794
       99.999     4.537       7.404e-003        9.25406        9.26886
''']

As you can see the large sample size (195) and small standard
deviation (0.023) have combined to give very small intervals, indeed we can be
very confident that the true mean is 9.2.

For comparison the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  =  3
   Mean                                    =  37.8000000
   Standard Deviation                      =  0.9643650

   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000     0.816            0.455       37.34539       38.25461
       75.000     1.604            0.893       36.90717       38.69283
       90.000     2.920            1.626       36.17422       39.42578
       95.000     4.303            2.396       35.40438       40.19562
       99.000     9.925            5.526       32.27408       43.32592
       99.900    31.599           17.594       20.20639       55.39361
       99.990    99.992           55.673      -17.87346       93.47346
       99.999   316.225          176.067     -138.26683      213.86683
''']

This time the fact that there are only three measurements leads to
much wider intervals, indeed such large intervals that it's hard
to be very confident in the location of the mean.

[endsect]

[section:tut_mean_test Testing a sample mean for difference from a "true" mean]

When calibrating or comparing a scientific instrument or measurement method of some kind,
we want to be able to answer the question "Does an observed sample mean
differ from the
"true" mean in any significant way?".  If it does, then we have evidence of
a systematic difference.  This question can be answered with a Students-t test:
more information can be found
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
on the NIST site].

Of course, the assignment of "true" to one mean may be quite arbitrary,
often this is simply a "traditional" method of measurement.

The following example code is taken from the example program
[@../../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to determine which of the
possible hypotheses are rejected or not-rejected
at a given significance level:

[note
Non-statisticians might say 'not-rejected' means 'accepted',
(often of the null-hypothesis) implying, wrongly, that there really *IS* no difference,
but statisticians eschew this to avoid implying that there is positive evidence of 'no difference'.
'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
[@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
] [/ note]

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
   {
      //
      // M = true mean.
      // Sm = Sample Mean.
      // Sd = Sample Standard Deviation.
      // Sn = Sample Size.
      // alpha = Significance Level.
Most of the procedure is pretty-printing, so let's just focus on the
calculation, we begin by calculating the t-statistic:

   // Difference in means:
   double diff = Sm - M;
   // Degrees of freedom:
   unsigned v = Sn - 1;
   // t-statistic:
   double t_stat = diff * sqrt(double(Sn)) / Sd;

Finally calculate the probability from the t-statistic. If we're interested in simply
whether there is a difference (either less or greater) or not,
we don't care about the sign of the t-statistic,
and we take the complement of the probability for comparison
to the significance level:

   students_t dist(v);
   double q = cdf(complement(dist, fabs(t_stat)));

The procedure then prints out the results of the various tests
that can be done, these can be summarised in the following table:

[table
