pdbart package:BayesTree R Documentation
Partial Dependence Plots for BART
Description:
Run 'bart' at test observations constructed so that a plot can be
created displaying the effect of a single variable ('pdbart') or
pair of variables ('pd2bart').
Usage:
pdbart(
   x.train, y.train,
   xind=1:ncol(x.train), levs=NULL, levquants=c(.05,(1:9)/10,.95),
   pl=TRUE, plquants=c(.05,.95), ...)
## S3 method for class 'pdbart':
plot(
   x,
   xind=1:length(x$fd),
   plquants=c(.05,.95), cols=c('black','blue'), ...)
pd2bart(
   x.train, y.train,
   xind=1:2, levs=NULL, levquants=c(.05,(1:9)/10,.95),
   pl=TRUE, plquants=c(.05,.95), ...)
## S3 method for class 'pd2bart':
plot(
   x,
   plquants=c(.05,.95), contour.color='white',
   justmedian=TRUE, ...)
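A minimal usage sketch of the calls above, assuming the BayesTree package is installed. The simulated data and the object names (x, y, pdb1) are illustrative only, and 'ntree' is an argument of 'bart' that is not described on this page. The call is guarded so the snippet still runs when BayesTree is unavailable.

```r
# Simulated data (illustrative only): 100 observations, 3 predictors.
set.seed(27)
n <- 100
x <- matrix(2 * runif(n * 3) - 1, ncol = 3)
colnames(x) <- c("x1", "x2", "x3")
y <- 0.5 * x[, 1] + 2 * x[, 2] * x[, 3] + 0.2 * rnorm(n)

if (requireNamespace("BayesTree", quietly = TRUE)) {
  library(BayesTree)
  # One-variable partial dependence for columns 1 and 2.
  # pl = FALSE skips the automatic plot; keepevery and ntree go to bart() via '...'.
  pdb1 <- pdbart(x, y, xind = c(1, 2),
                 levs = list(seq(-1, 1, 0.2), seq(-1, 1, 0.2)),
                 pl = FALSE, keepevery = 10, ntree = 100)
  # Plot the posterior median with 5% and 95% quantile bands.
  plot(pdb1, plquants = c(0.05, 0.95))
}
```

Setting pl = FALSE and calling plot separately makes it possible to adjust plquants or cols without refitting.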
Arguments:
x.train: Explanatory variables for training (in sample) data.
Must be a matrix (typeof double) with (as usual) rows
corresponding to observations and columns to variables.
Note that for a categorical variable you need to use dummies
and if there are more than two categories, you need to put
all the dummies in (unlike linear regression).
y.train: Dependent variable for training (in sample) data.
Must be a vector (typeof double) with length equal to the
number of observations (equal to the number of rows of
x.train).
xind: Integer vector indicating which variables are to be plotted.
In 'pdbart', the variables (columns of x.train) for which a plot
is to be constructed.
In 'plot.pdbart', the indices in the list returned by 'pdbart' for
which a plot is to be constructed.
In 'pd2bart', integer vector of length 2, indicating the
pair of variables (columns of x.train) to plot.
levs: Gives the values of a variable at which the plot is to be
constructed.
List, where i^th component gives the values for i^th
variable.
In 'pdbart', should have same length as xind.
In 'pd2bart', should have length 2.
See also argument levquants.
levquants: If levs is NULL, the values of each variable used in the
plot are set to the quantiles (in x.train) indicated by
levquants.
Double vector.
pl: For 'pdbart' and 'pd2bart', if true, plot is made (by calling
plot.*).
plquants: In the plots, beliefs about f(x) are indicated by plotting
the posterior median and a lower and upper quantile. plquants
is a double vector of length two giving the lower and upper
quantiles.
...: Additional arguments.
In 'pdbart' and 'pd2bart', passed on to 'bart'.
In 'plot.pdbart', passed on to 'plot'.
In 'plot.pd2bart', passed on to 'image'.
x: For plot.*, object returned from pdbart or pd2bart.
cols: Vector of two colors.
First color is for median of f, second color is for the
upper and lower quantiles.
contour.color: Color for contours plotted on top of the image.
justmedian: Logical. If TRUE, only one plot is created, for the median
of the f(x) draws. If FALSE, three plots are created: one for the
median and one each for the lower and upper quantiles. In this
case, mfrow is set to c(1,3).
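Two of the argument descriptions above can be sketched in base R: building x.train with a full set of dummies for a categorical variable, and deriving plotting levels from levquants when levs is NULL. The data and the names num, g and dummies are hypothetical. The quantile step illustrates the rule as described here and is not necessarily the package's internal code.

```r
# Hypothetical data: two numeric predictors and one three-category factor.
set.seed(1)
n <- 200
num <- cbind(x1 = rnorm(n), x2 = runif(n))
g <- factor(rep(c("a", "b", "c"), length.out = n))

# One 0/1 column per category; no reference level is dropped.
dummies <- model.matrix(~ g - 1)
x.train <- cbind(num, dummies)
storage.mode(x.train) <- "double"   # x.train must be a double matrix

# Default levquants from the usage; levels for the numeric columns only.
levquants <- c(.05, (1:9)/10, .95)
xind <- 1:2
levs <- lapply(xind, function(j) quantile(x.train[, j], probs = levquants))
```

Here levs is a list with one component per entry of xind, which is the shape the levs argument expects.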
Details:
We divide the predictor vector x into a subgroup of interest, x_s
and the complement x_c = x - x_s. A prediction f(x) can then be
written as f(x_s,x_c). To estimate the effect of x_s on the
prediction, Friedman suggests the partial dependence function
f_s(x_s) = (1/n) sum_{i=1}^n f(x_s,x_{ic})
where x_{ic} is the i^th observation of x_c in the data. Note that
(x_s,x_{ic}) will generally not be one of the observed data
points. Using BART it is straightforward to then estimate and even
obtain uncertainty bounds for f_s(x_s). A draw of f*_s(x_s) from
the induced BART posterior on f_s(x_s) is obtained by simply
computing f*_s(x_s) as a byproduct of each MCMC draw f*. The
median (or average) of these MCMC draws f*_s(x_s) then yields an
estimate of f_s(x_s), and lower and upper quantiles can be used to
obtain intervals for f_s(x_s).
In 'pdbart' x_s consists of a single variable in x and in
'pd2bart' it is a pair of variables.
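The averaging described above can be sketched in base R. The "posterior draws" here are a stand-in (a known function with a random coefficient per draw), not BART output, and the names f_draw, beta and fd are illustrative. Only the averaging over the observed x_c and the median/quantile summary follow the text.

```r
set.seed(42)
n <- 50
x <- matrix(runif(n * 3), ncol = 3)
levs <- c(0.1, 0.5, 0.9)   # values of x_s (here: column 1)
ndraw <- 200

# Stand-in for posterior draws f*: NOT BART, purely illustrative.
beta <- rnorm(ndraw, mean = 2, sd = 0.1)
f_draw <- function(b, xm) b * xm[, 1] + xm[, 2] * xm[, 3]

# fd[d, l] = (1/n) * sum_i f*_d(levs[l], x_{ic})
fd <- matrix(NA_real_, ndraw, length(levs))
for (l in seq_along(levs)) {
  xs <- x
  xs[, 1] <- levs[l]       # fix x_s, keep the observed x_c
  for (d in seq_len(ndraw)) fd[d, l] <- mean(f_draw(beta[d], xs))
}

# Point estimate and interval for f_s at each level.
est <- apply(fd, 2, median)
bounds <- apply(fd, 2, quantile, probs = c(.05, .95))
```

Each row of fd is one draw of f*_s evaluated at every level, so column-wise summaries give the estimate and the bounds.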
This is a computationally intensive procedure. For example, in
'pdbart', to compute the partial dependence plot for 5 x_s values,
we need to compute f(x_s,x_c) for all possible (x_s,x_{ic}) and
there would be 5n of these where n is the sample size. All of that
computation would be done for each kept BART draw. For this reason,
running BART with keepevery larger than 1 (e.g. 10) makes the
procedure much faster.
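The cost argument can be made concrete with a small base R sketch that builds the test observations for 5 values of x_s and counts function evaluations. The helper evals and the chain length of 1000 post-burn-in iterations are hypothetical; the sketch assumes one draw is kept out of every keepevery iterations.

```r
n <- 100; p <- 3
set.seed(3)
x.train <- matrix(runif(n * p), ncol = p)
levs <- quantile(x.train[, 1], probs = c(.05, .25, .5, .75, .95))  # 5 x_s values

# Stack one copy of x.train per level, overwriting column 1 with that level.
x.test <- do.call(rbind, lapply(levs, function(v) {
  xt <- x.train
  xt[, 1] <- v
  xt
}))
nrow(x.test)   # 5 * n rows

# Evaluations of f = test rows * kept draws (hypothetical helper).
evals <- function(niter, keepevery) nrow(x.test) * (niter %/% keepevery)
evals(1000, 1)
evals(1000, 10)
```

Under these assumptions, keepevery = 10 needs a tenth of the evaluations that keepevery = 1 does for the same chain length.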
Value:
The plot methods produce the plots and don't return anything.
'pdbart' and 'pd2bart' return lists with components given below.
The list returned by 'pdbart' is assigned class 'pdbart'