PA 765: Discriminant Function Analysis

…variables' contributions are significant. Wilks's lambda is sometimes called *the U statistic*.

**Wilks's lambda** is also used in a second context of discriminant analysis, to test the significance of the discriminant function as a whole. (A minimal computational sketch of this overall test follows below.)

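As a rough illustration of that overall test, the following Python sketch, which is not from the original page, computes Wilks's lambda as the ratio of the determinants of the within-group and total sums-of-squares-and-cross-products matrices, then applies the usual Bartlett chi-square approximation. The data and all variable names are invented placeholders.

```python
# Illustrative sketch (synthetic data): Wilks's lambda = det(W) / det(T),
# where W is the pooled within-group SSCP matrix and T is the total SSCP matrix.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, size=(30, 3)) for m in (0.0, 0.5, 1.0)])  # 3 groups x 30 cases
y = np.repeat([0, 1, 2], 30)

grand_mean = X.mean(axis=0)
T = (X - grand_mean).T @ (X - grand_mean)      # total SSCP

W = np.zeros_like(T)                           # pooled within-group SSCP
for g in np.unique(y):
    d = X[y == g] - X[y == g].mean(axis=0)
    W += d.T @ d

wilks_lambda = np.linalg.det(W) / np.linalg.det(T)
print(f"Wilks's lambda = {wilks_lambda:.3f}")  # closer to 0 => stronger discrimination

# Bartlett chi-square approximation for testing the discriminant functions as a whole.
N, p, k = X.shape[0], X.shape[1], len(np.unique(y))
stat = -(N - 1 - (p + k) / 2) * np.log(wilks_lambda)
df = p * (k - 1)
print(f"chi-square = {stat:.2f}, df = {df}, p = {chi2.sf(stat, df):.4f}")
```
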
**Measuring strength of relationships**

- The **classification table**, also called a confusion, assignment, or prediction matrix or table, is used to assess the performance of DA. This is simply a table in which the rows are the observed categories of the dependent and the columns are the predicted categories of the dependent. When prediction is perfect, all cases will lie on the diagonal. The percentage of cases on the diagonal is the percentage of correct classifications. This percentage is called the **hit ratio**. (A computational sketch follows this list.)
- **Mahalanobis D-square** and **Rao's V** are two other indexes of the extent to which the discriminant functions discriminate between criterion groups.
- **Canonical correlation, Rc**: the squared canonical correlation, Rc², is the percent of variation in the dependent discriminated by the set of independents in DA.

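The sketch below is illustrative rather than part of the original page: it builds a classification table and hit ratio for a discriminant analysis fitted with scikit-learn on synthetic data, and recovers the first canonical correlation as the square root of eta-squared (between-group over total sum of squares) of the first discriminant function's scores. The group labels, data, and use of scikit-learn in place of SPSS output are all assumptions.

```python
# Illustrative sketch (synthetic data): classification table, hit ratio,
# and first canonical correlation for a fitted discriminant analysis.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, size=(40, 3)) for m in (0.0, 0.7, 1.4)])
y = np.repeat(["A", "B", "C"], 40)

lda = LinearDiscriminantAnalysis().fit(X, y)

# Classification (confusion) table: rows = observed groups, columns = predicted groups.
table = confusion_matrix(y, lda.predict(X), labels=["A", "B", "C"])
print(table)

# Hit ratio: percentage of cases on the diagonal (correct classifications).
hit_ratio = np.trace(table) / table.sum()
print(f"hit ratio = {hit_ratio:.1%}")

# Canonical correlation of the first discriminant function, taken as the square
# root of eta-squared (between-group / total sum of squares) of its scores.
scores = lda.transform(X)[:, 0]
grand = scores.mean()
ss_total = ((scores - grand) ** 2).sum()
ss_between = sum((y == g).sum() * (scores[y == g].mean() - grand) ** 2
                 for g in np.unique(y))
print(f"R_c = {np.sqrt(ss_between / ss_total):.3f}")
```
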
**Interpreting the discriminant functions**

- The **structure matrix table** shows the correlations of each variable with each discriminant function. These simple Pearson correlations are called **structure coefficients or correlations** or **discriminant loadings**. When the dependent has more than two categories there will be more than one discriminant function; in that case, there will be multiple columns in the table, one for each function. The correlations then serve like factor loadings in factor analysis: by identifying the largest absolute correlations associated with each discriminant function, the researcher gains insight into how to name each function.

  *Structure coefficients vs. standardized discriminant function coefficients.* The standardized discriminant function coefficients indicate the partial contribution of each variable to the discriminant function(s), controlling for the other independents entered in the equation. The structure coefficients indicate the simple correlations between the variables and the discriminant function or functions. The structure coefficients should be used to assign meaningful labels to the discriminant functions; the standardized discriminant function coefficients should be used to assess each independent variable's unique contribution to the discriminant function.

- **Mahalanobis distances** are used in analyzing cases in discriminant analysis. For instance, one might wish to analyze a new, unknown set of cases in comparison to an existing set of known cases. Mahalanobis distance is the distance between a case and the centroid for each group (of the dependent) in attribute space (n-dimensional space defined by n variables). A case will have one Mahalanobis distance for each group, and it will be classified as belonging to the group for which its Mahalanobis distance is smallest. Thus, the smaller the Mahalanobis distance, the closer the case is to the group centroid and the more likely it is to be classed as belonging to that group. Since Mahalanobis distance is measured in standard deviations from the centroid, a case that is more than 1.96 Mahalanobis distance units from the centroid has less than a .05 chance of belonging to the group represented by the centroid; 3 units would likewise correspond to less than a .01 chance. SPSS reports squared Mahalanobis distance. (A sketch computing structure coefficients and Mahalanobis distances follows this list.)

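The following sketch is illustrative, not the page's own procedure: it obtains structure coefficients as simple correlations between each predictor and the discriminant scores, and assigns a new case to the group whose centroid is nearest in Mahalanobis distance, using the pooled within-group covariance matrix. The data, the new case, and the use of scikit-learn and SciPy are assumptions made for the example.

```python
# Illustrative sketch (synthetic data): structure coefficients and
# Mahalanobis-distance classification of a new case.
import numpy as np
from scipy.spatial.distance import mahalanobis
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=m, size=(50, 3)) for m in (0.0, 0.8, 1.6)])
y = np.repeat([0, 1, 2], 50)

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.transform(X)                     # discriminant scores, one column per function

# Structure coefficients: simple correlations of each predictor with each function.
structure = np.array([[np.corrcoef(X[:, j], scores[:, f])[0, 1]
                       for f in range(scores.shape[1])]
                      for j in range(X.shape[1])])
print("structure matrix:\n", np.round(structure, 3))

# Pooled within-group covariance matrix, used as the metric for Mahalanobis distance.
W = sum(np.cov(X[y == g], rowvar=False) * (np.sum(y == g) - 1) for g in np.unique(y))
S_w = W / (len(y) - len(np.unique(y)))
VI = np.linalg.inv(S_w)

# Classify a new case to the group whose centroid is nearest in Mahalanobis distance.
new_case = np.array([0.9, 0.7, 0.8])
centroids = {g: X[y == g].mean(axis=0) for g in np.unique(y)}
dists = {g: mahalanobis(new_case, c, VI) for g, c in centroids.items()}
print("distances:", {g: round(d, 2) for g, d in dists.items()})
print("assigned group:", min(dists, key=dists.get))
```
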
**Tests of Assumptions**

- **Box's M** tests the null hypothesis that the covariance matrices do not differ between groups formed by the dependent. Homogeneity of these covariance matrices is an assumption of discriminant analysis. The researcher wants this test *not* to be significant, so as to accept the null hypothesis that the groups do not differ. The test is also very sensitive to departures from multivariate normality. Note, though, that DA can be robust even when this assumption is violated. (A sketch of the Box's M computation follows below.)

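As a hedged illustration, the sketch below computes Box's M and one common chi-square approximation to it on synthetic data. Statistical packages such as SPSS may report a different (for example, F-based) approximation, so the exact p-value can differ; the data and names here are placeholders.

```python
# Illustrative sketch (synthetic data): Box's M test of equality of
# group covariance matrices, with a standard chi-square approximation.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(loc=m, size=(40, 3)) for m in (0.0, 0.5, 1.0)])
y = np.repeat([0, 1, 2], 40)

groups = np.unique(y)
N, p, k = X.shape[0], X.shape[1], len(groups)
n = np.array([np.sum(y == g) for g in groups])
covs = [np.cov(X[y == g], rowvar=False) for g in groups]            # per-group covariance
S_pooled = sum((ni - 1) * Si for ni, Si in zip(n, covs)) / (N - k)

# Box's M: compares the log determinant of the pooled covariance matrix
# with the log determinants of the individual group covariance matrices.
M = (N - k) * np.log(np.linalg.det(S_pooled)) \
    - sum((ni - 1) * np.log(np.linalg.det(Si)) for ni, Si in zip(n, covs))

# Chi-square approximation: scale M and refer it to a chi-square distribution
# with p(p+1)(k-1)/2 degrees of freedom.
u = (np.sum(1.0 / (n - 1)) - 1.0 / (N - k)) * (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))
stat = (1 - u) * M
df = p * (p + 1) * (k - 1) / 2
print(f"Box's M = {M:.2f}, chi-square = {stat:.2f}, df = {df:.0f}, p = {chi2.sf(stat, df):.4f}")
```
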
**Validation**

- A **hold-out sample** is often used for validation of the discriminant function. This is a split-halves test, where a portion of the cases is assigned to the *analysis sample* for purposes of training the discriminant function, which is then validated by assessing its performance on the remaining cases in the hold-out sample. (A sketch follows below.)

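A minimal sketch of hold-out validation follows, assuming a simple 50/50 split on synthetic data; the split fraction, the data, and the use of scikit-learn are illustrative choices, not prescribed by the page.

```python
# Illustrative sketch (synthetic data): hold-out validation of a discriminant function.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(loc=m, size=(60, 3)) for m in (0.0, 0.8)])
y = np.repeat([0, 1], 60)

# Analysis (training) sample vs. hold-out sample, split 50/50 as in a split-halves test.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)
print(f"hit ratio on analysis sample: {lda.score(X_train, y_train):.1%}")
print(f"hit ratio on hold-out sample: {lda.score(X_hold, y_hold):.1%}")
```
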
## SPSS Output Examples

- [Discriminant Function Analysis (two groups)](http://www2.chass.ncsu.edu/garson/pa765/discrim2.htm)
- [Multiple Discriminant Function Analysis (three groups)](http://www2.chass.ncsu.edu/garson/pa765/discrim3.htm)

## Assumptions

- The dependent variable is a true dichotomy. When the range of a true underlying continuous variable is constrained to form a dichotomy, correlation is attenuated (biased toward underestimation). One should never dichotomize a continuous variable simply for the purpose of applying discriminant function analysis.
- All cases must be independent and must belong to a group formed by the dependent variable. The groups must be mutually exclusive, with every case belonging to only one group.
- Group sizes of the dependent are not grossly different.
- There must be at least two cases for each category of the dependent.
- The independent variable or variables are interval-level. As with other members of the regression family, dichotomies, dummy variables, and ordinal variables with at least 5 categories are commonly used as well.
- The maximum number of independent variables is n - 2, where n is the sample size.
- No independent has a zero standard deviation in one or more of the groups formed by the dependent.
- Errors (residuals) are randomly distributed.
- Homogeneity of variances (homoscedasticity): within each group formed by the dependent, the variance of each interval independent should be similar between groups. That is, the independents may (and will) have different variances from one another, but for the same independent, the groups formed by the dependent should have similar variances and means on that independent. Discriminant analysis is highly sensitive to outliers; lack of homogeneity of variances may indicate the presence of outliers in one or more groups.
- Homogeneity of covariances/correlations: within each group formed by the dependent, the covariance/correlation between any two predictor variables should be similar to the corresponding covariance/correlation in other groups. That is, each group has a similar covariance/correlation matrix.
- Absence of perfect multicollinearity. If one independent is very highly correlated with another, or one is a function (e.g., the sum) of other independents, then the tolerance value for that variable will approach 0 and the matrix will not have a unique discriminant solution. Such a matrix is said to be *ill-conditioned*. Tolerance is discussed in the section on [regression](http://www2.chass.ncsu.edu/garson/pa765/regress.htm#toleranc). (A sketch computing tolerance follows this list.)
- Low multicollinearity of the independents. To the extent that independents are correlated, the standardized discriminant function coefficients will not reliably assess the relative importance of the predictor variables.
- Linearity is assumed (exponential terms are not taken into account unless such transformed variables are added as additional independents).
- Additivity is assumed (interaction terms are not taken into account unless new cross-product variables are added as additional independents).
- For purposes of significance testing, the predictor variables follow a multivariate normal distribution. That is, each predictor variable has a normal distribution about fixed values of all the other independents.

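As an illustration of the tolerance-based multicollinearity check mentioned in the list above, the sketch below regresses each predictor on the others and reports tolerance = 1 − R². The data and variable names are invented, with one predictor deliberately constructed as a near-sum of the others so that its tolerance approaches zero.

```python
# Illustrative sketch (synthetic data): tolerance of each predictor as a
# multicollinearity check before discriminant analysis.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.01, size=200)   # nearly a function of the other two
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
    print(f"x{j + 1}: tolerance = {1 - r2:.4f}")   # values near 0 signal an ill-conditioned matrix
```
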
## Frequently Asked Questions

- [Isn't discriminant analysis the same as cluster analysis?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#cluster)
- [When does the discriminant function have no constant term?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#constant)
- [How important is it that the assumptions of homogeneity of variances and of multivariate normal distribution be met?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#hov)
- [In DA, how can you assess the relative importance of the discriminating variables?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#betas)
- [What is the maximum likelihood estimation method in discriminant analysis (logistic discriminant function analysis)?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#mle)
- [What are Fisher's linear discriminant functions?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#fisher)
- [What is stepwise DA?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#step)
- [I have heard DA is related to MANCOVA. How so?](http://www2.chass.ncsu.edu/garson/pa765/discrim.htm#mancova)

**Isn't discriminant analysis the same as cluster analysis?**
