<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>Approximation Methods for Gaussian Process Regression</title>
    <link type="text/css" rel="stylesheet" href="style.css">
  </head>
  <body>

<h2>Approximation Methods for Gaussian Process Regression</h2>

The theory for approximation methods for GPR when faced with large
datasets is given in section 8.3. We provide implementations of three
methods:
<ul>
<li>subset of datapoints (SD),</li>
<li>subset of regressors (SR),</li>
<li>projected process (PP).</li>
</ul>

The SD method is implemented by simply calling the function
<a href="regression.html#desc">gpr.m</a> with a subset of the training
data. The SR and PP methods are implemented by the function
<a href="../gpml/gprSRPP.m">gprSRPP.m</a>. In the example the subsets
are selected randomly, although they can also be selected by greedy
algorithms, as discussed in section 8.3.</p>

This page contains two topics:
<ol>
<li><a href="#desc">Description</a> of the function gprSRPP.m</li>
<li><a href="#boston">A demonstration</a> using the gprSRPP.m function</li>
</ol>

<h3 id="desc">1. Description of the function gprSRPP.m</h3>

<p>The function gprSRPP.m is based on the function
<a href="regression.html#desc">gpr.m</a>. The specification of the
covariance function is the same as in gpr.m, except that the covariance
function is assumed to be specified as a cell array with the first
entry being <tt>covSum</tt> and the last entry being
<tt>covNoise</tt>.</p>

<pre>
  [mu, S2SR, S2PP] = gprSRPP(logtheta, covfunc, x, INDEX, y, xstar)
</pre>

<table border=0 cols=2 width="100%">
<tr><td><b>inputs</b></td><td></td></tr>
<tr><td width="10%"><tt>logtheta</tt></td><td>a (column) vector containing
the logarithm of the hyperparameters of the covariance function</td></tr>
<tr><td><tt>covfunc</tt></td><td>the covariance function, which is assumed
to be a covSum whose last entry is covNoise</td></tr>
<tr><td><tt>x</tt></td><td>an <tt>n</tt> by <tt>D</tt> matrix of training
inputs</td></tr>
<tr><td><tt>INDEX</tt></td><td>a row vector of length <tt>m</tt> (with
m &lt;= n) used to specify which inputs are in the active set</td></tr>
<tr><td><tt>y</tt></td><td>a (column) vector of training set targets (of
length <tt>n</tt>)</td></tr>
<tr><td><tt>xstar</tt></td><td>an <tt>nstar</tt> by <tt>D</tt> matrix of
test inputs</td></tr>
<tr><td><b>outputs</b></td><td></td></tr>
<tr><td><tt>mu</tt></td><td>a (column) vector (of length <tt>nstar</tt>)
of predictive means</td></tr>
<tr><td><tt>S2SR</tt></td><td>a (column) vector (of length <tt>nstar</tt>)
of predictive variances from the SR algorithm (incl. noise variance)</td></tr>
<tr><td><tt>S2PP</tt></td><td>a (column) vector (of length <tt>nstar</tt>)
of predictive variances from the PP algorithm (incl. noise variance)</td></tr>
</table>

<p>The SR method is implemented using equation 8.14 for mu and equation
8.15 for S2SR (but also adding on the noise variance). The PP method
has the same predictive mean, but a different predictive variance
(S2PP), given by equation 8.27 (again with the noise variance added
on).</p>
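<p>For concreteness, the following MATLAB sketch shows how these
equations map onto code. It is illustrative only: it assumes the
covariance matrices <tt>Kmm</tt> (m by m, active set), <tt>Kmn</tt>
(m by n, active set against training inputs), <tt>Kms</tt> (m by
nstar, active set against test inputs), the vector <tt>kss</tt>
(nstar by 1, test self-variances) and the noise variance <tt>sn2</tt>
have already been evaluated; none of these names come from the toolbox
itself.</p>

<pre>
  % Illustrative sketch of eqs. 8.14, 8.15 and 8.27 (hypothetical names;
  % assumes Kmm, Kmn, Kms, kss and sn2 are precomputed as described above)
  A    = Kmn*Kmn' + sn2*Kmm;         % common m by m system matrix
  mu   = Kms' * (A \ (Kmn*y));       % eq. 8.14: predictive mean (SR = PP)
  qAq  = sum(Kms .* (A \ Kms), 1)';  % k_m' inv(A) k_m for each test case
  S2SR = sn2*qAq + sn2;              % eq. 8.15, plus the noise variance
  S2PP = kss - sum(Kms .* (Kmm \ Kms), 1)' + sn2*qAq + sn2;  % eq. 8.27, plus noise
</pre>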
<h3 id="boston">2. A demonstration using the gprSRPP.m function</h3>

We use the Boston housing data of D. Harrison and D. L. Rubinfeld,
"Hedonic housing prices and the demand for clean air", Journal of
Environmental Economics and Management 5, 81-102 (1978). This dataset
has 13 input variables and one output target. A split of 455 training
points and 51 test points is used.</p>

We use Gaussian process regression with a squared exponential
covariance function, and allow a separate lengthscale for each input
dimension, as in eqs. 5.1 and 5.2.</p>

Run the script
<a href="../gpml-demo/demo_gprsparse.m">demo_gprsparse.m</a> to produce
the results shown below.</p>

The training and test data is contained in the file
<a href="../gpml-demo/data_boston.mat">data_boston.mat</a>. The data
has been scaled so that each variable has approximately zero mean and
unit variance. Assuming that the current directory is gpml-demo, we
need to add the path of the code, and load the data:

<pre>
  addpath('../gpml');    % add path for the code
  load data_boston
</pre>

As usual the training inputs are stored in <tt>X</tt>, the training
targets in <tt>y</tt>, the test inputs in <tt>Xstar</tt> and the test
targets in <tt>ystar</tt>.</p>

A random subset of the training data points is selected using the
randperm function. This active set has size <em>m</em> = 200.

<pre>
  m = 200;              % size of the active set
  perm = randperm(n);   % random permutation of the n = 455 training points
  INDEX = perm(1:m);    % indices of the active set
  Xm = X(INDEX,:);
  ym = y(INDEX);
</pre>

We use a covariance function made up of the sum of a squared
exponential (SE) covariance term with ARD, and independent noise.
Thus, the covariance function is specified as:

<pre>
  covfunc = {'covSum', {'covSEard','covNoise'}};
</pre>

The hyperparameters are initialized to
<tt>logtheta0 = [0; 0; ... 0; 0; -1.15]</tt>:

<pre>
  logtheta0 = zeros(D+2,1);    % D lengthscales, signal and noise magnitudes
  logtheta0(D+2) = -1.15;      % starting value for log(noise std dev)
</pre>

Note that the noise standard deviation is set to exp(-1.15),
corresponding to a noise variance of 0.1.</p>

The hyperparameters are trained by maximizing the approximate marginal
likelihood of the SD method given in eq. 8.31. This simply computes the
marginal likelihood of the subset of size <em>m</em>:

<pre>
  [logtheta, fvals, iter] = minimize(logtheta0, 'gpr', -100, covfunc, Xm, ym);
</pre>

Predictions can now be made using the three methods. SD is implemented
simply by calling gpr on the reduced training set:

<pre>
  [fstarSD S2SD] = gpr(logtheta, covfunc, Xm, ym, Xstar);
</pre>

The outputs are the mean predictions <tt>fstarSD</tt> and the
predictive (noise-free) variances <tt>S2SD</tt>. The SR and PP methods
are called using the function gprSRPP; note that the INDEX vector is
passed:

<pre>
  [fstarSRPP S2SR S2PP] = gprSRPP(logtheta, covfunc, X, INDEX, y, Xstar);
</pre>

gprSRPP returns the predictive mean <tt>fstarSRPP</tt> (which is
identical for both methods), and the predictive variances <tt>S2SR</tt>
and <tt>S2PP</tt>, which include the noise variance (see the table
above). For comparison we also make predictions using gpr.m on the full
training dataset, and a dumb predictor that just predicts the mean and
variance of the training data.</p>

We compute the residuals, the mean squared error (mse) and the
predictive log likelihood (pll) for all methods. Note that the
predictive variance for ystar includes the noise variance, as explained
on p. 18. Thus for example we have:

<pre>
  resSR = fstarSRPP - ystar;                          % residuals on the test set
  mseSR = sum(resSR.^2)/nstar;                        % nstar = 51 test cases
  pllSR = -0.5*mean(log(2*pi*S2SR) + resSR.^2./S2SR); % Gaussian predictive log lik
</pre>
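<p>As a point of reference, the dumb baseline can be scored in the same
way. The following sketch is illustrative (it is not taken from
demo_gprsparse.m, and the variable names are hypothetical); it simply
predicts the training mean and variance for every test case:</p>

<pre>
  % Hypothetical sketch of the dumb baseline: constant mean and variance
  resdumb = mean(y) - ystar;        % residuals of the constant predictor
  s2dumb  = var(y);                 % constant predictive variance
  msedumb = sum(resdumb.^2)/nstar;
  plldumb = -0.5*mean(log(2*pi*s2dumb) + resdumb.^2/s2dumb);
</pre>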
The test results are:

<pre>
  mse_full 0.118924     pll_full -0.414019
  mse_SD   0.179692     pll_SD   -0.54915
  mse_SR   0.138302     pll_SR   -0.658645
  mse_PP   0.138302     pll_PP   -0.395815
  mse_dumb 1.11464      pll_dumb -1.4769
</pre>

where mse denotes mean squared error and pll denotes predictive log
likelihood. A higher (less negative) pll is more desirable. Note that
the mse for the SR and PP methods is identical, as expected. The SR and
PP methods outperform SD on mse, and are close to the full mse. On pll,
the PP method does slightly better than the full predictor, followed by
the SD and SR methods.</p>

You can experiment further by varying the value of the subset size
<em>m</em> in the script
<a href="../gpml-demo/demo_gprsparse.m">demo_gprsparse.m</a>.</p>

Go back to the <a href="http://www.gaussianprocess.org/gpml">web
page</a> for Gaussian Processes for Machine Learning.

<hr>
Last modified: Wed Mar 29 12:19:25 CEST 2006
  </body>
</html>
