📄 tripleexponentialsmoothingmodel.java
//
//  OpenForecast - open source, general-purpose forecasting package.
//  Copyright (C) 2004  Steven R. Gould
//
//  This library is free software; you can redistribute it and/or
//  modify it under the terms of the GNU Lesser General Public
//  License as published by the Free Software Foundation; either
//  version 2.1 of the License, or (at your option) any later version.
//
//  This library is distributed in the hope that it will be useful,
//  but WITHOUT ANY WARRANTY; without even the implied warranty of
//  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
//  Lesser General Public License for more details.
//
//  You should have received a copy of the GNU Lesser General Public
//  License along with this library; if not, write to the Free Software
//  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
//

package net.sourceforge.openforecast.models;

import java.util.ArrayList;
import java.util.Iterator;

import net.sourceforge.openforecast.DataPoint;
import net.sourceforge.openforecast.DataSet;
import net.sourceforge.openforecast.ForecastingModel;
import net.sourceforge.openforecast.Observation;

/**
 * Triple exponential smoothing - also known as the Winters method - is a
 * refinement of the popular double exponential smoothing model that adds
 * another component to take into account any seasonality - or periodicity
 * - in the data.
 *
 * <p>Simple exponential smoothing models work best with data that has no
 * trend or seasonality components. When the data exhibits either an
 * increasing or decreasing trend over time, simple exponential smoothing
 * forecasts tend to lag behind observations. Double exponential smoothing
 * is designed to address this type of data series by taking into account
 * any trend in the data. However, neither of these exponential smoothing
 * models addresses any seasonality in the data.
 *
 * <p>For better exponentially smoothed forecasts of data where there is
 * expected or known to be seasonal variation in the data, use triple
 * exponential smoothing.
 *
 * <p>As with simple exponential smoothing, in triple exponential smoothing
 * models past observations are given exponentially smaller weights as the
 * observations get older. In other words, recent observations are given
 * relatively more weight in forecasting than the older observations. This is
 * true for all terms involved - namely, the base level
 * <code>L<sub>t</sub></code>, the trend <code>T<sub>t</sub></code> as well as
 * the seasonality index <code>s<sub>t</sub></code>.
 *
 * <p>There are four equations associated with Triple Exponential Smoothing.
 *
 * <ul>
 *  <li><code>L<sub>t</sub> = a.(x<sub>t</sub>/s<sub>t-c</sub>)+(1-a).(L<sub>t-1</sub>+T<sub>t-1</sub>)</code></li>
 *  <li><code>T<sub>t</sub> = b.(L<sub>t</sub>-L<sub>t-1</sub>)+(1-b).T<sub>t-1</sub></code></li>
 *  <li><code>s<sub>t</sub> = g.(x<sub>t</sub>/L<sub>t</sub>)+(1-g).s<sub>t-c</sub></code></li>
 *  <li><code>f<sub>t,k</sub> = (L<sub>t</sub>+k.T<sub>t</sub>).s<sub>t+k-c</sub></code></li>
 * </ul>
 *
 * <p>where:
 * <ul>
 *  <li><code>L<sub>t</sub></code> is the estimate of the base value at time
 *      <code>t</code>. That is, the estimate for time <code>t</code> after
 *      eliminating the effects of seasonality and trend.</li>
 *  <li><code>a</code> - representing alpha - is the first smoothing
 *      constant, used to smooth <code>L<sub>t</sub></code>.</li>
 *  <li><code>x<sub>t</sub></code> is the observed value at time t.</li>
 *  <li><code>s<sub>t</sub></code> is the seasonal index at time t.</li>
 *  <li><code>c</code> is the number of periods in the seasonal pattern.
 *      For example, c=4 for quarterly data, or c=12 for monthly data.</li>
 *  <li><code>T<sub>t</sub></code> is the estimated trend at time t.</li>
 *  <li><code>b</code> - representing beta - is the second smoothing
 *      constant, used to smooth the trend estimates.</li>
 *  <li><code>g</code> - representing gamma - is the third smoothing constant,
 *      used to smooth the seasonality estimates.</li>
 *  <li><code>f<sub>t,k</sub></code> is the forecast made at the end of
 *      period <code>t</code> for the period <code>t+k</code>.</li>
 * </ul>
 *
 * <p>There are a variety of different ways to come up with initial values for
 * the triple exponential smoothing model. The approach implemented here uses
 * the first two "years" (or complete cycles) of data to come up with initial
 * values for <code>L<sub>t</sub></code>, <code>T<sub>t</sub></code> and
 * <code>s<sub>t</sub></code>. Therefore, at least two complete cycles of data
 * are required to initialize the model. For best results, more data is
 * recommended - ideally a minimum of 4 or 5 complete cycles. This gives the
 * model a chance to better adapt to the data, instead of relying on good
 * guesses for the initial conditions.
 *
 * <h2>Choosing values for the smoothing constants</h2>
 * <p>The smoothing constants <code>a</code>, <code>b</code>, and
 * <code>g</code> each must be a value in the range 0.0-1.0. But, what are the
 * "best" values to use for the smoothing constants? This depends on the data
 * series being modeled.
 *
 * <p>In general, the speed at which the older responses are dampened
 * (smoothed) is a function of the value of the smoothing constant. When this
 * smoothing constant is close to 1.0, dampening is quick - more weight is
 * given to recent observations - and when it is close to 0.0, dampening is
 * slow - and relatively less weight is given to recent observations.
 *
 * <p>The best value for the smoothing constant is the one that results in the
 * smallest mean of the squared errors (or other similar accuracy indicator).
 * The {@link #getBestFitModel} static methods can help with the selection of
 * the best values for the smoothing constants, though the results obtained
 * from these methods should always be validated. If any of the "best fit"
 * smoothing constants turns out to be 1.0, you may want to be a little
 * suspicious. This may be an indication that you really need to use more data
 * to initialize the model.
 * @author Steven R. Gould
 * @since 0.4
 * @see <a href="http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc435.htm">Engineering Statistics Handbook, 6.4.3.5 Triple Exponential Smoothing</a>
 */
public class TripleExponentialSmoothingModel extends AbstractTimeBasedModel
{
    /**
     * The default value of the tolerance permitted in the estimates of the
     * smoothing constants in the {@link #getBestFitModel} methods.
     */
    private static final double DEFAULT_SMOOTHING_CONSTANT_TOLERANCE = 0.001;

    /**
     * Minimum number of years of data required.
     */
    private static final int NUMBER_OF_YEARS = 2;

    /**
     * The overall smoothing constant, alpha, used in this exponential
     * smoothing model.
     */
    private double alpha;

    /**
     * The second smoothing constant, beta, used in this exponential
     * smoothing model for trend smoothing.
     */
    private double beta;

    /**
     * The third smoothing constant (gamma) used in this exponential smoothing
     * model for the seasonal smoothing.
     */
    private double gamma;

    /**
     * Stores the number of periods per year in this exponential smoothing
     * model. Note that, in spite of the name, this does not limit the
     * functionality of this model to seasonality within a year. It is quite
     * possible that the "seasonality" - or periodicity - of interest is the
     * variability by day within a week, or any other period.
     */
    private int periodsPerYear = 0;

    /**
     * Stores the maximum observed time. Initialized in {@link #init}.
     */
    private double maxObservedTime;

    /**
     * Provides a cache of calculated baseValues. The "base" represents an
     * estimate of the underlying value of the data after accounting for
     * - i.e. removing - the effects of trend and seasonality. Since these
     * values are used very frequently when calculating forecast values, it
     * is more efficient to cache the previously calculated base values for
     * future use.
     */
    private DataSet baseValues;

    /**
     * Provides a cache of calculated trendValues. Since these values are
     * used very frequently when calculating forecast values, it is more
     * efficient to cache the previously calculated trend values for future
     * use.
     */
    private DataSet trendValues;

    /**
     * Provides a cache of calculated seasonal indexes. Since these values
     * are used very frequently when calculating forecast values, it is more
     * efficient to cache the previously calculated seasonal index values
     * for future use than have to recalculate them each time.
     */
    private DataSet seasonalIndex;

    /**
     * Factory method that returns a "best fit" triple exponential smoothing
     * model for the given data set. This, like the overloaded
     * {@link #getBestFitModel(DataSet,double,double)}, attempts to derive
     * "good" - hopefully near optimal - values for the alpha and beta
     * smoothing constants.
     * @param dataSet the observations for which a "best fit" triple
     * exponential smoothing model is required.
     * @return a best fit triple exponential smoothing model for the given
     * data set.
     * @see #getBestFitModel(DataSet,double,double)
     */
    public static TripleExponentialSmoothingModel
        getBestFitModel( DataSet dataSet )
    {
        return getBestFitModel( dataSet,
                                DEFAULT_SMOOTHING_CONSTANT_TOLERANCE,
                                DEFAULT_SMOOTHING_CONSTANT_TOLERANCE );
    }

    /**
     * Factory method that returns a best fit triple exponential smoothing
     * model for the given data set. This, like the overloaded
     * {@link #getBestFitModel(DataSet)}, attempts to derive "good" -
     * hopefully near optimal - values for the alpha and beta smoothing
     * constants.
     *
     * <p>To determine which model is "best", this method currently uses only
     * the Mean Squared Error (MSE). Future versions may use other measures in
     * addition to the MSE. However, the resulting "best fit" model - and the
     * associated values of alpha and beta - is expected to be very similar
     * either way.
     *
     * <p>Note that the approach used to calculate the best smoothing
     * constants - alpha and beta - <em>may</em> end up choosing values near
     * a local optimum. In other words, there <em>may</em> be other values for
     * alpha and beta that result in an even better model.
     * @param dataSet the observations for which a "best fit" triple
     * exponential smoothing model is required.
     * @param alphaTolerance the required precision/accuracy - or tolerance
     * of error - required in the estimate of the alpha smoothing constant.
     * @param betaTolerance the required precision/accuracy - or tolerance
     * of error - required in the estimate of the beta smoothing constant.
     * @return a best fit triple exponential smoothing model for the given
     * data set.
     */
    public static TripleExponentialSmoothingModel
        getBestFitModel( DataSet dataSet,
                         double alphaTolerance, double betaTolerance )
    {
        // Best-beta models for a small, medium and large value of alpha
        TripleExponentialSmoothingModel model1
            = findBestBeta( dataSet, 0.0, 0.0, 1.0, betaTolerance );
        TripleExponentialSmoothingModel model2
            = findBestBeta( dataSet, 0.5, 0.0, 1.0, betaTolerance );
        TripleExponentialSmoothingModel model3
            = findBestBeta( dataSet, 1.0, 0.0, 1.0, betaTolerance );

        // Starting from these three models (alpha = 0.0, 0.5 and 1.0),
        // narrow the search until alpha and beta are within tolerance
        TripleExponentialSmoothingModel bestModel
            = findBest( dataSet, model1, model2, model3,
                        alphaTolerance, betaTolerance );

        return bestModel;
    }

    /**
     * Performs a non-linear - yet somewhat intelligent - search for the best
     * values for the smoothing coefficients alpha and beta for the given
     * data set.
     *
     * <p>For the given data set, and models with a small, medium and large
     * value of the alpha smoothing constant, returns the best fit model where
     * the value of the alpha and beta (trend) smoothing constants are within
     * the given tolerances.
     *
     * <p>Note that the descriptions of the parameters below include a
     * discussion of valid values. However, since this is a private method and
     * to help improve performance, we don't provide any validation of these
     * parameters. Using invalid values may lead to unexpected results.
     * @param dataSet the data set for which a best fit model is required.
     * @param modelMin the pre-initialized best fit model with the smallest
     * value of the alpha smoothing constant found so far.
     * @param modelMid the pre-initialized best fit model with the value of
     * the alpha smoothing constant between that of modelMin and modelMax.
     * @param modelMax the pre-initialized best fit model with the largest
     * value of the alpha smoothing constant found so far.
     * @param alphaTolerance the tolerance within which the alpha value is
     * required. Must be considerably less than 1.0. However, note that the
     * smaller this value the longer it will take to converge on a best fit
     * model.
     * @param betaTolerance the tolerance within which the beta value is
     * required. Must be considerably less than 1.0. However, note that the
     * smaller this value the longer it will take to converge on a best fit
     * model. This value can be the same as, greater than or less than the
     * value of the alphaTolerance parameter. It makes no difference - at
     * least to this code.
     */
    private static TripleExponentialSmoothingModel findBest(
                        DataSet dataSet,
                        TripleExponentialSmoothingModel modelMin,
                        TripleExponentialSmoothingModel modelMid,
                        TripleExponentialSmoothingModel modelMax,
                        double alphaTolerance,
                        double betaTolerance)
    {
        double alphaMin = modelMin.getAlpha();
        double alphaMid = modelMid.getAlpha();
        double alphaMax = modelMax.getAlpha();

        // If we're not making much ground, then we're done
        if (Math.abs(alphaMid-alphaMin)<alphaTolerance
            && Math.abs(alphaMax-alphaMid)<alphaTolerance )
            return modelMid;

        TripleExponentialSmoothingModel[] model
            = new TripleExponentialSmoothingModel[5];
        model[0] = modelMin;
        model[1] = findBestBeta( dataSet, (alphaMin+alphaMid)/2.0,
                                 0.0, 1.0, betaTolerance );
        model[2] = modelMid;
        model[3] = findBestBeta( dataSet, (alphaMid+alphaMax)/2.0,
                                 0.0, 1.0, betaTolerance );
        model[4] = modelMax;

        for ( int m=0; m<5; m++ )
            model[m].init(dataSet);
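
The listing above is cut off at this point. As a separate illustration, the four equations in the class documentation can be sketched as a single update step. This is a minimal sketch, not OpenForecast code: the class `WintersStep` and its methods `update` and `forecast` are hypothetical names introduced only for this example, and the caller is assumed to supply the seasonal index from one cycle earlier.

```java
/**
 * Hypothetical helper (not part of OpenForecast) that applies the
 * documented recurrences for one time step.
 */
final class WintersStep {
    private WintersStep() {}

    /**
     * Returns {L_t, T_t, s_t} given the observation x_t, the previous
     * level L_{t-1}, the previous trend T_{t-1} and the seasonal index
     * from one cycle earlier, s_{t-c}.
     */
    static double[] update(double x, double prevLevel, double prevTrend,
                           double seasonalLag,
                           double alpha, double beta, double gamma) {
        // L_t = a.(x_t/s_{t-c}) + (1-a).(L_{t-1}+T_{t-1})
        double level = alpha * (x / seasonalLag)
                     + (1.0 - alpha) * (prevLevel + prevTrend);
        // T_t = b.(L_t-L_{t-1}) + (1-b).T_{t-1}
        double trend = beta * (level - prevLevel)
                     + (1.0 - beta) * prevTrend;
        // s_t = g.(x_t/L_t) + (1-g).s_{t-c}
        double seasonal = gamma * (x / level)
                        + (1.0 - gamma) * seasonalLag;
        return new double[] { level, trend, seasonal };
    }

    /** f_{t,k} = (L_t + k.T_t).s_{t+k-c}; seasonalIndex is s_{t+k-c}. */
    static double forecast(double level, double trend, int k,
                           double seasonalIndex) {
        return (level + k * trend) * seasonalIndex;
    }
}
```

For example, with `x = 120`, `prevLevel = 100`, `prevTrend = 2`, `seasonalLag = 1.2` and all three smoothing constants set to 0.5, `update` yields a level of about 101 and a trend of about 1.5.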