pdbart               package:BayesTree               R Documentation

Partial Dependence Plots for BART

Description:

     Run 'bart' at test observations constructed so that a plot can be
     created displaying the effect of a single variable ('pdbart') or
     a pair of variables ('pd2bart').

Usage:

        pdbart(
           x.train, y.train,
           xind=1:ncol(x.train), levs=NULL, levquants=c(.05,(1:9)/10,.95),
           pl=TRUE,  plquants=c(.05,.95), ...)
        ## S3 method for class 'pdbart':
        plot(
           x,
           xind = 1:length(x$fd),
           plquants =c(.05,.95),cols=c('black','blue'), ...)
        pd2bart(
           x.train, y.train,
           xind=1:2, levs=NULL, levquants=c(.05,(1:9)/10,.95),
           pl=TRUE, plquants=c(.05,.95), ...)
        ## S3 method for class 'pd2bart':
        plot(
           x,
           plquants =c(.05,.95), contour.color='white',
           justmedian=TRUE, ...)

Arguments:

 x.train: Explanatory variables for training (in sample) data.
           Must be a matrix (typeof double) with (as usual) rows
          corresponding to observations and columns to variables.
           Note that for a categorical variable you need to use dummies
          and if there are more than two categories, you need to put
          all the dummies in (unlike linear regression). 

 y.train: Dependent variable for training (in sample) data.
           Must be a vector (typeof double) with length equal to the
          number of observations (equal to the number of rows of
          x.train). 

    xind: Integer vector indicating which variables are to be plotted.
           In 'pdbart', variables (columns of x.train) for which plot
          is to be constructed.
           In 'plot.pdbart', indices in list returned by 'pdbart' for
          which plot is to be constructed.
           In 'pd2bart', integer vector of length 2, indicating the
          pair of variables (columns of x.train) to plot. 

    levs: Gives the values of a variable at which the plot is to be
          constructed.
           List, where i^th component gives the values for i^th
          variable.
           In 'pdbart', should have same length as xind.
           In 'pd2bart', should have length 2.
           See also argument levquants. 

levquants: If levs is NULL, the values of each variable used in the
          plot are set to the quantiles (in x.train) indicated by
          levquants.
           Double vector. 

      pl: For 'pdbart' and 'pd2bart', if true, plot is made (by calling
          plot.*). 

plquants: In the plots, beliefs about f(x) are indicated by plotting
          the posterior median and a lower and upper quantile. plquants
          is a double vector of length two giving the lower and upper
          quantiles. 

     ...: Additional arguments.
           In 'pdbart','pd2bart', passed on to 'bart'.
           In 'plot.pdbart', passed on to 'plot'.
           In 'plot.pd2bart', passed on to 'image'. 

       x: For plot.*, object returned from pdbart or pd2bart. 

    cols: Vector of two colors.
           First color is for median of f, second color is for the
          upper and lower quantiles. 

contour.color: Color for contours plotted on top of the image. 

justmedian: Boolean. If true, just one plot is created for the median
          of the f(x) draws. If false, three plots are created: one for
          the median and two additional ones for the lower and upper
          quantiles. In this case, mfrow is set to c(1,3). 
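
     As a minimal sketch of preparing these arguments in base R (the
     data and the names 'num', 'grp' and 'dummies' are made up for
     illustration): the block below builds a double matrix with one
     dummy column per category level and shows how the default
     levquants translate into plotting values when levs is NULL. The
     'quantile' call is only an assumption about how the quantiles are
     taken; this page does not spell out the internal computation.

```r
# Hypothetical data: one numeric predictor and one 3-level factor.
set.seed(1)
n   <- 20
num <- runif(n)
grp <- factor(sample(c("a", "b", "c"), n, replace = TRUE),
              levels = c("a", "b", "c"))

# One dummy column per level (no reference level dropped), as the
# x.train entry requires for categorical variables.
dummies <- model.matrix(~ grp - 1)

x.train <- cbind(num = num, dummies)
storage.mode(x.train) <- "double"      # matrix of typeof double
y.train <- as.double(num + (grp == "b") + rnorm(n, sd = 0.1))

# Default levquants from the Usage section. With levs = NULL, the
# plotting values are quantiles of the corresponding x.train column.
levquants <- c(.05, (1:9)/10, .95)
levs.num  <- quantile(x.train[, "num"], probs = levquants)

# An explicit 'levs' list for xind = 1: one component per variable.
levs <- list(unname(levs.num))
```

     Passing 'levs' explicitly is mainly useful when the quantiles of
     a variable are a poor grid, for example for a 0/1 dummy column.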

Details:

     We divide the predictor vector x into a subgroup of interest, x_s
     and the complement x_c = x - x_s. A prediction f(x) can then be
     written as f(x_s,x_c). To estimate the effect of x_s on the
     prediction, Friedman suggests the partial dependence function

              f_s(x_s) = (1/n) sum_{i=1}^n f(x_s,x_{ic})

     where x_{ic} is the i^th observation of x_c in the data. Note that
     (x_s,x_{ic}) will generally not be one of the observed data
     points. Using BART it is straightforward to then estimate and even
     obtain uncertainty bounds for f_s(x_s).  A draw of f*_s(x_s) from
     the induced BART posterior on f_s(x_s) is obtained by simply
     computing f*_s(x_s) as a byproduct of each MCMC draw f*. The
     median (or average) of these MCMC draws f*_s(x_s) then yields an
     estimate of f_s(x_s), and lower and upper quantiles can be used to
     obtain intervals for f_s(x_s).
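
     The averaging above can be sketched in base R without fitting
     BART at all. In this illustration the "draws" are a hypothetical
     stand-in for MCMC draws f* (a simple parametric function with a
     varying coefficient, not a sum of trees), and the names 'f.draw',
     'pd.one' and 'fd' are chosen here for the example only.

```r
# Toy data: x_s is column 1, x_c is column 2.
set.seed(2)
n <- 50
x <- cbind(runif(n, -1, 1), runif(n, -1, 1))

# Stand-ins for MCMC draws f*: each "draw" is f(x) = a * x1^2 + x2
# with a different coefficient a.
a.draws <- c(0.8, 0.9, 1.0, 1.1, 1.2)
f.draw  <- function(a, x) a * x[, 1]^2 + x[, 2]

# Partial dependence for one draw at one x_s value:
#   f_s(x_s) = (1/n) sum_{i=1}^n f(x_s, x_{ic})
pd.one <- function(a, xs, x, s = 1) {
  x.new <- x
  x.new[, s] <- xs            # fix x_s, keep each observed x_{ic}
  mean(f.draw(a, x.new))
}

levs <- seq(-1, 1, by = 0.25)  # x_s values at which to evaluate

# Rows = draws, columns = x_s values: the draws f*_s(x_s).
fd <- sapply(levs, function(xs)
        sapply(a.draws, function(a) pd.one(a, xs, x)))

# Point estimate and interval for f_s(x_s) at each x_s value.
pd.median <- apply(fd, 2, median)
pd.lower  <- apply(fd, 2, quantile, probs = .05)
pd.upper  <- apply(fd, 2, quantile, probs = .95)
```

     Because x_c enters this toy function additively, the median curve
     is levs^2 shifted by the mean of the second column.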

     In 'pdbart' x_s consists of a single variable in x and in
     'pd2bart' it is a pair of variables.

     This is a computationally intensive procedure. For example, in
     'pdbart', to compute the partial dependence plot for 5 x_s values,
     we need to compute f(x_s,x_c) for all possible (x_s,x_{ic}) and
     there would be 5n of these where n is the sample size. All of that
     computation would be done for each kept BART draw. For this
     reason, running BART with keepevery larger than 1 (e.g. 10) makes
     the procedure much faster.
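
     To make the 5n count concrete, the sketch below builds the test
     observations by hand for one variable and 5 values, then calls
     'pdbart' with keepevery=10 if the BayesTree package is installed.
     The data are simulated for illustration, 'x.test' is a name used
     only here, and the hand-built matrix is an assumption about the
     construction rather than the package's internal code.

```r
# Toy data (hypothetical): n observations, 3 predictors.
set.seed(3)
n <- 100
x.train <- matrix(runif(n * 3, -1, 1), ncol = 3)
y.train <- 0.5 * x.train[, 1] + 2 * x.train[, 2] * x.train[, 3] +
           rnorm(n, sd = 0.2)

# Test observations for x_s = column 1 at 5 values: every training
# row is repeated once per x_s value, giving 5n rows in total.
xs.vals <- seq(-1, 1, length.out = 5)
x.test  <- do.call(rbind, lapply(xs.vals, function(v) {
  xx <- x.train
  xx[, 1] <- v
  xx
}))
n.eval.per.draw <- nrow(x.test)  # 5 * n evaluations per kept draw

# Optional: run pdbart itself if BayesTree is available. keepevery,
# ntree and ndpost are 'bart' arguments passed through '...'.
if (requireNamespace("BayesTree", quietly = TRUE)) {
  pdb <- BayesTree::pdbart(x.train, y.train, xind = 1,
                           levs = list(xs.vals), pl = FALSE,
                           keepevery = 10, ntree = 100, ndpost = 200)
}
```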

Value:

     The plot methods produce the plots and don't return anything.

     'pdbart' and 'pd2bart' return lists with components given below. 
     The list returned by 'pdbart' is assigned class 'pdbart'