partialPlot package:randomForest R Documentation
Partial dependence plot
Description:
A partial dependence plot gives a graphical depiction of the
marginal effect of a variable on the class probability
(classification) or response (regression).
Usage:
## S3 method for class 'randomForest':
partialPlot(x, pred.data, x.var, which.class,
            w, plot = TRUE, add = FALSE,
            n.pt = min(length(unique(pred.data[, xname])), 51),
            rug = TRUE, xlab = deparse(substitute(x.var)), ylab = "",
            main = paste("Partial Dependence on", deparse(substitute(x.var))),
            ...)
Arguments:
x: an object of class 'randomForest', which contains a 'forest'
component.
pred.data: a data frame used for constructing the plot, usually the
training data used to construct the random forest.
x.var: name of the variable for which partial dependence is to be
examined.
which.class: For classification data, the class to focus on (default
the first class).
w: weights to be used in averaging; if not supplied, the mean is
unweighted.
plot: whether the plot should be shown on the graphic device.
add: whether to add to existing plot ('TRUE').
n.pt: if 'x.var' is continuous, the number of points on the grid
for evaluating partial dependence.
rug: whether to draw hash marks at the bottom of the plot
indicating the deciles of 'x.var'.
xlab: label for the x-axis.
ylab: label for the y-axis.
main: main title for the plot.
...: other graphical parameters to be passed on to 'plot' or
'lines'.
Details:
The function being plotted is defined as:
\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^n f(x, x_{iC}),
where x is the variable for which partial dependence is sought,
and x_{iC} denotes the values of the other variables for the i-th
observation in the data. The summand is the predicted regression
function for regression, and the logit (i.e., the log of the fraction
of votes, centered across classes) for 'which.class' for
classification:
f(x) = \log p_k(x) - \frac{1}{K} \sum_{j=1}^K \log p_j(x),
where K is the number of classes, k is 'which.class', and p_j is
the proportion of votes for class j.
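The two formulas above can be sketched in base R. This is an illustration only, not the package's implementation: 'lm' stands in for the random forest so that no package is required, and the helper names 'partial_dependence' and 'centered_logit', as well as the vote proportions, are hypothetical.

```r
# Sketch only: lm() is a stand-in for the forest; any model with a
# predict() method could be substituted.
data(airquality)
aq <- na.omit(airquality)
fit <- lm(Ozone ~ ., data = aq)

# Hypothetical helper: for each grid value v, set x = v in every row,
# keep the other variables x_{iC} as observed, and average the predictions.
partial_dependence <- function(model, data, xname, grid, w = NULL) {
  sapply(grid, function(v) {
    d <- data
    d[[xname]] <- v
    p <- predict(model, newdata = d)   # f(x, x_{iC}) for i = 1, ..., n
    if (is.null(w)) mean(p) else weighted.mean(p, w)
  })
}

grid <- seq(min(aq$Temp), max(aq$Temp), length.out = 5)
pd <- partial_dependence(fit, aq, "Temp", grid)

# Hypothetical helper: centered logit for class k, given a vector p of
# vote proportions. A proportion of zero would give -Inf here; handling
# that case is left out of this sketch.
centered_logit <- function(p, k) log(p[k]) - mean(log(p))

votes <- c(setosa = 0.1, versicolor = 0.7, virginica = 0.2)  # made-up values
cl <- centered_logit(votes, "versicolor")
```

Because the stand-in model is linear, 'pd' is a straight line in 'Temp'; with a forest the curve would generally be non-linear. The centered logits sum to zero across classes by construction.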
Value:
A list with two components: 'x' and 'y', which are the values used
in the plot.
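A minimal sketch of working with the returned list, assuming the 'randomForest' package is installed. It uses only the arguments documented above; the object name 'pp' is arbitrary.

```r
library(randomForest)   # assumed to be installed

data(airquality)
aq <- na.omit(airquality)
set.seed(131)
ozone.rf <- randomForest(Ozone ~ ., aq)

# plot = FALSE suppresses drawing; the grid and the averaged
# predictions are still returned as a list.
pp <- partialPlot(ozone.rf, aq, Temp, plot = FALSE)
str(pp)

# Redraw the curve from the returned values.
plot(pp$x, pp$y, type = "l", xlab = "Temp", ylab = "Partial dependence")
```

Keeping the values this way makes it possible to overlay curves from several fits or to restyle the plot without recomputing the averages.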
Note:
The 'randomForest' object must contain the 'forest' component;
i.e., created with 'randomForest(..., keep.forest=TRUE)'.
This function runs quite slowly for large data sets.
Author(s):
Andy Liaw <andy_liaw@merck.com>
References:
Friedman, J. (2001). Greedy function approximation: a gradient
boosting machine, _Ann. of Stat._
See Also:
'randomForest'
Examples:
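library(randomForest)

## Regression: partial dependence of Ozone on Temp
data(airquality)
airquality <- na.omit(airquality)
set.seed(131)
ozone.rf <- randomForest(Ozone ~ ., airquality)
partialPlot(ozone.rf, airquality, Temp)

## Classification: partial dependence for the class "versicolor"
data(iris)
set.seed(543)
iris.rf <- randomForest(Species ~ ., iris)
partialPlot(iris.rf, iris, Petal.Width, "versicolor")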