[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there
*is a difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: the sample mean *is less* than
the true mean.]
[Reject if CDF of t > significance level:

`cdf(dist, t) > alpha`]]

[[The Alternative-hypothesis: the sample mean *is greater* than
the true mean.]
[Reject if complement of CDF of t > significance level:

`cdf(complement(dist, t)) > alpha`]]
]

[note
Notice that the comparisons are against `alpha / 2` for a two-sided test
and against `alpha` for a one-sided test.]

Now that we have all the parts in place, let's take a look at some sample output,
first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm Heat flow data]
from the NIST site.  The data set was collected by Bob Zarr of NIST in January 1990
from a heat flow meter calibration and stability analysis.
The corresponding Dataplot output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm section 3.5.2]
of the __handbook.

[pre'''
   __________________________________
   Student t test for a single sample
   __________________________________
   Number of Observations                                 =  195
   Sample Mean                                            =  9.26146
   Sample Standard Deviation                              =  0.02279
   Expected True Mean                                     =  5.00000
   Sample Mean - Expected Test Mean                       =  4.26146
   Degrees of Freedom                                     =  194
   T Statistic                                            =  2611.28380
   Probability that difference is due to chance           =  0.000e+000
   Results for Alternative Hypothesis and alpha           =  0.0500'''
   Alternative Hypothesis     Conclusion
   Mean != 5.000              NOT REJECTED
   Mean  < 5.000              REJECTED
   Mean  > 5.000              NOT REJECTED
]

You will note the line that says the probability that the difference is
due to chance is zero.  From a philosophical point of view, of course,
the probability can never reach zero.  However, in this case the calculated
probability is smaller than the smallest positive number representable in
double precision, hence the appearance of a zero here.  Whatever its "true"
value is, we know it must be extraordinarily small, so the alternative
hypothesis - that there is a difference in means - is not rejected.

For comparison the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N.
Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour atomic absorption.

[pre'''
   __________________________________
   Student t test for a single sample
   __________________________________
   Number of Observations                                 =  3
   Sample Mean                                            =  37.80000
   Sample Standard Deviation                              =  0.96437
   Expected True Mean                                     =  38.90000
   Sample Mean - Expected Test Mean                       =  -1.10000
   Degrees of Freedom                                     =  2
   T Statistic                                            =  -1.97566
   Probability that difference is due to chance           =  1.869e-001
   Results for Alternative Hypothesis and alpha           =  0.0500'''
   Alternative Hypothesis     Conclusion
   Mean != 38.900             REJECTED
   Mean  < 38.900             REJECTED
   Mean  > 38.900             REJECTED
]

As you can see, the small number of measurements (3) has led to a large uncertainty
in the location of the true mean.  So even though there appears to be a difference
between the sample mean and the expected true mean, we conclude that there
is no significant difference, and are unable to reject the null hypothesis.
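As a rough cross-check of the mercury figures, the sketch below recomputes the
t statistic and the tail probabilities without using the library.  It assumes
exactly 2 degrees of freedom, for which the Students-t CDF has the closed form
`0.5 + t / (2 * sqrt(2 + t * t))`, so it does not generalise to other sample sizes.
The names `t_statistic`, `cdf_df2` and `check_mercury_data` are illustrative only
and are not part of Boost.Math.  The three comparisons follow the rejection rules
given in the hypothesis table for this example.

```cpp
#include <cmath>
#include <iomanip>
#include <iostream>

// t statistic for a single sample of size n:
// (sample mean - true mean) / (sample standard deviation / sqrt(n)).
double t_statistic(double M, double Sm, double Sd, unsigned n)
{
   return (Sm - M) * std::sqrt(static_cast<double>(n)) / Sd;
}

// Closed-form CDF of the Students-t distribution for exactly
// 2 degrees of freedom (assumption: n = 3, as in the mercury data).
double cdf_df2(double t)
{
   return 0.5 + t / (2 * std::sqrt(2 + t * t));
}

// Illustrative helper (not part of Boost.Math): print the statistic and
// apply the three rejection rules for the alternative hypotheses.
void check_mercury_data(double alpha)
{
   const double M = 38.9, Sm = 37.8, Sd = 0.96437;
   const double t = t_statistic(M, Sm, Sd, 3);
   const double q = 1 - cdf_df2(std::fabs(t));   // complement of CDF for |t|

   std::cout << std::fixed << std::setprecision(5)
             << "T Statistic                = " << t << "\n"
             << "Two-sided probability (2q) = " << 2 * q << "\n"
             << std::setprecision(3)
             << "Mean != " << M << "   "
             << (q > alpha / 2 ? "REJECTED" : "NOT REJECTED") << "\n"
             << "Mean  < " << M << "   "
             << (cdf_df2(t) > alpha ? "REJECTED" : "NOT REJECTED") << "\n"
             << "Mean  > " << M << "   "
             << (1 - cdf_df2(t) > alpha ? "REJECTED" : "NOT REJECTED") << "\n";
}
```

Calling `check_mercury_data(0.05)` from `main` should give a t statistic and a
two-sided probability close to the values in the table above; small differences
in the last digit are expected because the standard deviation is entered here
in its rounded form.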
However, if we were to lower the bar for acceptance down to alpha = 0.1
(a 90% confidence level) we see a different output:

[pre'''
__________________________________
Student t test for a single sample
__________________________________
Number of Observations                                 =  3
Sample Mean                                            =  37.80000
Sample Standard Deviation                              =  0.96437
Expected True Mean                                     =  38.90000
Sample Mean - Expected Test Mean                       =  -1.10000
Degrees of Freedom                                     =  2
T Statistic                                            =  -1.97566
Probability that difference is due to chance           =  1.869e-001
Results for Alternative Hypothesis and alpha           =  0.1000'''
Alternative Hypothesis     Conclusion
Mean != 38.900            REJECTED
Mean  < 38.900            NOT REJECTED
Mean  > 38.900            REJECTED
]

In this case, we really have a borderline result,
and more data (and/or more accurate data) are needed for a more convincing conclusion.

[endsect]

[section:tut_mean_size Estimating how large a sample size would have to become
in order to give a significant Students-t test result with a single sample test]

Imagine you have conducted a Students-t test on a single sample in order
to check for systematic errors in your measurements.  Imagine that the
result is borderline.
At this point one might go off and collect more data,
but it might be prudent to first ask the question "How much more?".
The parameter estimators of the `students_t_distribution` class
can provide this information.

This section is based on the example code in
[@../../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
and we begin by defining a procedure that will print out a table of
estimated sample sizes for various confidence levels:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <cmath>      // fabs, ceil
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_find_df(
      double M,          // M = true mean.
      double Sm,         // Sm = Sample Mean.
      double Sd)         // Sd = Sample Standard Deviation.
   {

Next we define a table of significance levels:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Printing out the table of sample sizes required for various confidence levels
begins with the table header:

      cout << "\n\n"
              "_______________________________________________________________\n"
              "Confidence       Estimated          Estimated\n"
              " Value (%)      Sample Size        Sample Size\n"
              "              (one sided test)    (two sided test)\n"
              "_______________________________________________________________\n";

And now the important part: the sample sizes required.  Class
`students_t_distribution` has a static member function
`find_degrees_of_freedom` that will calculate how large
a sample size needs to be in order to give a definitive result.

The first argument is the difference between the means that you
wish to be able to detect; here it is the absolute value of the
difference between the sample mean and the true mean.

Then come two probability values: alpha and beta.
Alpha is the
maximum acceptable risk of rejecting the null-hypothesis when it is
in fact true.  Beta is the maximum acceptable risk of failing to reject
the null-hypothesis when in fact it is false.
Also note that for a two-sided test, alpha must be divided by 2.

The final parameter of the function is the standard deviation of the sample.

In this example, we assume that alpha and beta are the same, and call
`find_degrees_of_freedom` twice: once with alpha for a one-sided test,
and once with alpha/2 for a two-sided test.

      for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
      {
         // Confidence value:
         cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
         // calculate df for single sided test:
         double df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i], alpha[i], Sd);
         // convert to sample size:
         double size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size;
         // calculate df for two sided test:
         df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
         // convert to sample size:
         size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size << endl;
      }
      cout << endl;
   }

Let's now look at some sample output using data taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour atomic absorption.
Only three measurements were made, and the Students-t test above
gave a borderline result, so this example
will show us how many samples would need to be collected:

[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________
True Mean                               =  38.90000
Sample Mean                             =  37.80000
Sample Standard Deviation               =  0.96437
_______________________________________________________________
Confidence       Estimated          Estimated
 Value (%)      Sample Size        Sample Size
              (one sided test)    (two sided test)
_______________________________________________________________
    50.000               2               3
    75.000               4               5
    90.000               8              10
    95.000              12              14
    99.000              21              23
    99.900              36              38
    99.990              51              54
    99.999              67              69
''']

So in this case, many more measurements would have had to be made,