<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>R: Partial Dependence Plots for BART</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="../../R.css">
</head><body>

<table width="100%" summary="page for pdbart {BayesTree}"><tr><td>pdbart {BayesTree}</td><td align="right">R Documentation</td></tr></table>
<h2>Partial Dependence Plots for BART</h2>


<h3>Description</h3>

<p>
Runs <code>bart</code> at test observations constructed so that
a partial dependence plot can be created,
displaying the effect of
a single variable (<code>pdbart</code>) or a pair of variables (<code>pd2bart</code>).
</p>


<h3>Usage</h3>

<pre>
   pdbart(
      x.train, y.train,
      xind=1:ncol(x.train), levs=NULL, levquants=c(.05,(1:9)/10,.95),
      pl=TRUE,  plquants=c(.05,.95), ...)
   ## S3 method for class 'pdbart':
   plot(
      x,
      xind = 1:length(x$fd),
      plquants =c(.05,.95),cols=c('black','blue'), ...)
   pd2bart(
      x.train, y.train,
      xind=1:2, levs=NULL, levquants=c(.05,(1:9)/10,.95),
      pl=TRUE, plquants=c(.05,.95), ...)
   ## S3 method for class 'pd2bart':
   plot(
      x,
      plquants =c(.05,.95), contour.color='white',
      justmedian=TRUE, ...)
</pre>


<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>x.train</code></td>
<td>
Explanatory variables for training (in sample) data.<br>
Must be a matrix (typeof double)
with (as usual) rows corresponding to observations and columns to variables.<br>
Note that a categorical variable must be coded as dummy variables, and if there
are more than two categories, you need to include a dummy for every category
(unlike linear regression, where one category is dropped).
</td></tr>
<tr valign="top"><td><code>y.train</code></td>
<td>
Dependent variable for training (in sample) data.<br>
Must be a vector (typeof double) with length equal to the number of observations
(equal to the number of rows of x.train).
</td></tr>
<tr valign="top"><td><code>xind</code></td>
<td>
Integer vector indicating which variables are to be plotted.<br>
In <code>pdbart</code>, the variables (columns of x.train) for which a plot is to be constructed.<br>
In <code>plot.pdbart</code>, the indices in the list returned by <code>pdbart</code> for which a plot is to be constructed.<br>
In <code>pd2bart</code>, integer vector of length 2,
indicating the pair of variables (columns of x.train) to plot.
</td></tr>
<tr valign="top"><td><code>levs</code></td>
<td>
Gives the values of a variable at which the plot is to be constructed.<br>
A list whose
<i>i^th</i> component gives the values for the <i>i^th</i> variable.<br>
In <code>pdbart</code>, should have same length as xind.<br>
In <code>pd2bart</code>, should have length 2.<br>
See also argument levquants.
</td></tr>
<tr valign="top"><td><code>levquants</code></td>
<td>
If levs is NULL, the values of each variable used in the plot are
set to the quantiles (in x.train) indicated by levquants.<br>
Double vector.
</td></tr>
<tr valign="top"><td><code>pl</code></td>
<td>
For <code>pdbart</code> and <code>pd2bart</code>, if TRUE, a plot is made (by calling plot.*).
</td></tr>
<tr valign="top"><td><code>plquants</code></td>
<td>
In the plots, beliefs about <i>f(x)</i> are indicated by plotting the
posterior median and a lower and upper quantile.
plquants is a double vector of length two giving the lower and upper quantiles.
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
Additional arguments.<br>
In <code>pdbart</code> and <code>pd2bart</code>, passed on to <code><a href="bart.html">bart</a></code>.<br>
In <code>plot.pdbart</code>, passed on to <code><a href="../../Zelig/html/plot.zelig.html">plot</a></code>.<br>
In <code>plot.pd2bart</code>, passed on to <code><a href="../../xps/html/image-methods.html">image</a></code>.
</td></tr>
<tr valign="top"><td><code>x</code></td>
<td>
For plot.*, object returned from pdbart or pd2bart.
</td></tr>
<tr valign="top"><td><code>cols</code></td>
<td>
Vector of two colors.<br>
First color is for median of <i>f</i>, second color is for the upper and lower quantiles.
</td></tr>
<tr valign="top"><td><code>contour.color</code></td>
<td>
Color for contours plotted on top of the image.
</td></tr>
<tr valign="top"><td><code>justmedian</code></td>
<td>
Logical. If TRUE, just one plot is created, for
the median of the <i>f(x)</i> draws.  If FALSE, three plots are created:
one for the median and two additional ones for the lower and upper quantiles.
In this case, mfrow is set to c(1,3).
</td></tr>
</table>
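<p>
As a minimal sketch of preparing these arguments, the code below builds an
<code>x.train</code> matrix with one dummy column per category and derives a
<code>levs</code> component from <code>levquants</code>. The data and the names
<code>age</code> and <code>group</code> are hypothetical, and the use of
<code>unique(quantile(...))</code> is an assumption about how the levels are
formed, not a copy of the package code.
</p>
<pre>
## Hypothetical data: one numeric predictor and one three-level factor.
set.seed(1)
n = 50
age = runif(n, 20, 60)
group = factor(rep(c('a','b','c'), length.out=n))

## One dummy column per category (no reference level is dropped).
dummies = model.matrix(~ group - 1)
x.train = cbind(age=age, dummies)
storage.mode(x.train) = 'double'   # matrix of typeof double

## Levels that levquants would select for the first column,
## assuming plain empirical quantiles of that column of x.train.
levquants = c(.05, (1:9)/10, .95)
levs = list(unique(quantile(x.train[,1], probs=levquants)))
</pre>
<p>
Each row of <code>dummies</code> contains exactly one 1, so all three categories
are represented explicitly rather than through an omitted baseline.
</p>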

<h3>Details</h3>

<p>
We divide the predictor vector <i>x</i> into a subgroup of interest,
<i>x_s</i> and the complement <i>x_c = x - x_s</i>.
A prediction <i>f(x)</i> can
then be written as <i>f(x_s,x_c)</i>. To estimate the effect of <i>x_s</i>
on the prediction, Friedman suggests the partial dependence
function
</p><p align="center"><i>f_s(x_s) = (1/n) sum_{i=1}^n f(x_s,x_{ic})
</i></p><p>
where <i>x_{ic}</i> is the <i>i^th</i> observation of <i>x_c</i> in the data. Note
that <i>(x_s,x_{ic})</i> will generally not be one of the observed data
points. Using BART it is straightforward to then estimate and even
obtain uncertainty bounds for <i>f_s(x_s)</i>.  A draw of <i>f*_s(x_s)</i>
from the induced BART posterior on <i>f_s(x_s)</i> is obtained by
simply computing <i>f*_s(x_s)</i> as a byproduct of each MCMC draw
<i>f*</i>. The median (or average)
of these MCMC draws <i>f*_s(x_s)</i> then yields an
estimate of <i>f_s(x_s)</i>, and lower and upper quantiles can be used
to obtain intervals for <i>f_s(x_s)</i>.
</p>
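<p>
The partial dependence function above can be sketched for any prediction
function. In the code below, <code>partial_dependence</code> is a hypothetical
helper (not part of BayesTree) and <code>f</code> is a known function standing in
for a single draw <i>f*</i>; applying the same averaging to every MCMC draw would
give the draws <i>f*_s(x_s)</i>.
</p>
<pre>
## Partial dependence of a prediction function f on column s of x,
## evaluated at each value in levs.
partial_dependence = function(f, x, s, levs) {
  sapply(levs, function(v) {
    xs = x          # copy of the data ...
    xs[, s] = v     # ... with column s fixed at v in every row
    mean(f(xs))     # average over the observed x_c
  })
}

set.seed(27)
x = matrix(2*runif(100*3)-1, ncol=3)
f = function(x) .5*x[,1] + 2*x[,2]*x[,3]   # stand-in for one draw of f
levs = seq(-1, 1, .5)
pd1 = partial_dependence(f, x, 1, levs)
</pre>
<p>
For this <code>f</code>, the result is linear in the level, with slope .5 and an
offset equal to the sample mean of <code>2*x[,2]*x[,3]</code>.
</p>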
<p>
In <code>pdbart</code>, <i>x_s</i> consists of a single variable in <i>x</i>, and in
<code>pd2bart</code> it is a pair of variables.
</p>
<p>
This is a computationally intensive procedure.
For example, in <code>pdbart</code>, to compute the partial dependence plot
for 5 <i>x_s</i> values, we need
to compute <i>f(x_s,x_{ic})</i> for every combination of an <i>x_s</i> value and an observation <i>i</i>, and there
are <i>5n</i> of these, where <i>n</i> is the sample size.
All of that computation is repeated for each kept BART draw.
For this reason, running BART with keepevery larger than 1 (e.g. 10)
makes the procedure much faster.
</p>
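<p>
The cost described above can be illustrated with a rough count. The sketch below
stacks one modified copy of the data per <i>x_s</i> value and counts function
evaluations, assuming that <code>ndpost</code> draws are taken after burn-in and
that <code>ndpost/keepevery</code> of them are kept; <code>evals</code> and
<code>x.test</code> are hypothetical names used only for illustration.
</p>
<pre>
n = 100          # sample size
nlev = 5         # number of x_s values
ndpost = 1000    # posterior draws after burn-in (assumed)

## Stack one copy of the data per level, with column 1 fixed at that level.
set.seed(5)
x = matrix(runif(n*3), ncol=3)
levs = seq(0, 1, length.out=nlev)
x.test = do.call(rbind, lapply(levs, function(v) { xs = x; xs[,1] = v; xs }))
nrow(x.test)     # nlev * n = 500 rows

## Evaluations of f needed over all kept draws.
evals = function(keepevery) (ndpost %/% keepevery) * nlev * n
evals(1)         # 500000
evals(10)        # 50000
</pre>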


<h3>Value</h3>

<p>
The plot methods produce the plots and don't return anything.
<br>
<code>pdbart</code> and <code>pd2bart</code> return lists with components
given below.  The list returned by <code>pdbart</code> is assigned class
&lsquo;pdbart&rsquo; and the list returned by <code>pd2bart</code> is assigned
class &lsquo;pd2bart&rsquo;.
</p>
<table summary="R argblock">
<tr valign="top"><td><code>fd</code></td>
<td>
A matrix whose <i>(i,j)</i> value is the <i>i^th</i>
draw of <i>f_s(x_s)</i> for the <i>j^th</i> value of <i>x_s</i>.
&ldquo;fd&rdquo; is for &ldquo;function draws&rdquo;.
<br>
For <code>pdbart</code> fd is actually a list whose
<i>k^th</i> component is the matrix described above
corresponding to the <i>k^th</i> variable chosen by argument xind.<br>
The number of columns in each matrix will equal the number of values
given in the corresponding component of argument levs (or number of values in levquants).
<br>
For <code>pd2bart</code>, fd is a single matrix.
The columns correspond to all possible pairs of values for the pair
of variables indicated by xind.
That is, all possible <i>(x_i,x_j)</i> where <i>x_i</i> is a value in
the levs component corresponding to the first <i>x</i> and
<i>x_j</i> is a value in the levs component corresponding to the second one.<br>
The first <i>x</i> varies fastest across the columns.
</td></tr>
<tr valign="top"><td><code>levs</code></td>
<td>
The list of levels used, each component corresponding to a variable.<br>
If argument levs was supplied it is unchanged.<br>
Otherwise, the levels in levs are as constructed using argument levquants.
</td></tr>
<tr valign="top"><td><code>xlbs</code></td>
<td>
Vector of character strings which are the plotting labels used for the variables.
</td></tr>
</table>
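<p>
A brief sketch of how these components might be summarised by hand follows. The
matrix <code>fd</code> here is simulated as a stand-in for one component of the
<code>fd</code> list from <code>pdbart</code>, and the use of
<code>expand.grid</code> to reproduce the <code>pd2bart</code> column order is an
illustration of "first variable varies fastest", not a statement about the
package internals.
</p>
<pre>
## Stand-in for fd: rows are posterior draws, columns are levels of x_s.
set.seed(3)
levs = seq(-1, 1, .5)
ndraw = 200
fd = sapply(levs, function(v) .5*v + rnorm(ndraw, sd=.1))

## Posterior median and lower/upper quantiles at each level.
plquants = c(.05, .95)
med = apply(fd, 2, median)
lower = apply(fd, 2, quantile, probs=plquants[1])
upper = apply(fd, 2, quantile, probs=plquants[2])

## Column order for a pair of variables, first variable varying fastest.
levs2 = list(c(-1, 0, 1), c(10, 20))
grid = expand.grid(x1=levs2[[1]], x2=levs2[[2]])

## A vector with one value per column (here just the column index)
## reshapes into a matrix with rows for x1 and columns for x2.
z = matrix(seq_len(nrow(grid)), nrow=length(levs2[[1]]), ncol=length(levs2[[2]]))
</pre>
<p>
With this ordering, a vector of per-column summaries can be reshaped directly
into the matrix form that <code>image</code> and <code>contour</code> expect.
</p>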
<p>

<br>
The remaining components returned in the list are the same as in the value of <code><a href="bart.html">bart</a></code>.
They are simply passed on from the BART run used to create the partial dependence plot.
The function <code><a href="bart.html">plot.bart</a></code> can be applied to the object returned by <code>pdbart</code> or
<code>pd2bart</code> to examine the BART run.</p>

<h3>Author(s)</h3>

<p>
Hugh Chipman: <a href="mailto:hugh.chipman@acadiau.ca">hugh.chipman@acadiau.ca</a>.<br>
Robert McCulloch: <a href="mailto:robert.mcculloch@chicagogsb.edu">robert.mcculloch@chicagogsb.edu</a>.
</p>


<h3>References</h3>

<p>
Chipman, H., George, E., and McCulloch, R. (2006)
BART: Bayesian Additive Regression Trees.
</p>
<p>
Chipman, H., George, E., and McCulloch, R. (2006)
Bayesian Ensemble Learning.
</p>
<p>
Both of the above are available at:
<a href="http://faculty.chicagogsb.edu/robert.mcculloch/research/rob-mcculloch-cv.html">http://faculty.chicagogsb.edu/robert.mcculloch/research/rob-mcculloch-cv.html</a>
</p>
<p>
Friedman, J.H. (2001)
Greedy function approximation: A gradient boosting machine.
<EM>The Annals of Statistics</EM>, <B>29</B>, 1189&ndash;1232.
</p>


<h3>Examples</h3>

<pre>
library(BayesTree)
## simulate data: y = f(x) + sigma*z
f = function(x) { return(.5*x[,1] + 2*x[,2]*x[,3]) }
sigma = .2  # noise standard deviation
n = 100     # number of observations
set.seed(27)
x = matrix(2*runif(n*3)-1, ncol=3)  # predictors uniform on (-1,1)
colnames(x) = c('rob','hugh','ed')
Ey = f(x)
y = Ey + sigma*rnorm(n)
lmFit = lm(y~., data.frame(x,y))  # linear model, compared to BART below
par(mfrow=c(1,3))  # first two panels for pdbart, third for pd2bart
## pdbart: one-dimensional partial dependence plots for variables 1 and 2
set.seed(99)
pdb1 = pdbart(x, y, xind=c(1,2),
   levs=list(seq(-1,1,.2), seq(-1,1,.2)), pl=FALSE,
   keepevery=10, ntree=100)
plot(pdb1, ylim=c(-.6,.6))
## pd2bart: two-dimensional partial dependence plot for variables 2 and 3
set.seed(99)
pdb2 = pd2bart(x, y, xind=c(2,3),
   levquants=c(.05,.1,.25,.5,.75,.9,.95), pl=FALSE,
   ntree=100, keepevery=10, verbose=FALSE)
plot(pdb2)
## compare the BART fit to the linear model and the truth (Ey)
fitmat = cbind(y, Ey, fitted(lmFit), pdb1$yhat.train.mean)
colnames(fitmat) = c('y','Ey','lm','bart')
print(cor(fitmat))
## plot.bart(pdb1) displays the BART run used to get the plot.
</pre>



<hr><div align="center">[Package <em>BayesTree</em> version 0.2-0 <a href="00Index.html">Index]</a></div>

</body></html>