PA 765: Logistic Regression
(source: http://www2.chass.ncsu.edu/garson/pa765/logistic.htm)
* ...multiple linear regression?
* What is the logistic equivalent to the VIF test for multicollinearity in OLS regression? Can odds ratios be used?
* How does one test to see if the assumption of linearity in the logit is met for each of the independents?
* How can one use estimated variance of residuals to test for model misspecification?
* How are interaction effects handled in logistic regression?
* Does stepwise logistic regression exist, as it does for OLS regression?
* Does analysis of residuals work in logistic regression as it does in OLS?
* How many independents can I have?
* How do I express the logistic regression equation if one or more of my independents is categorical?
* How do I compare logit coefficients across groups formed by a categorical independent variable?
* How do I compute the confidence interval for the unstandardized logit (effect) coefficients?
* SAS's PROC CATMOD for multinomial logistic regression is not user friendly. Where can I get some help?
* Why not just use regression with dichotomous dependents?

  When the dependent is binary, the distribution of residual error is
  heteroscedastic, which violates one of the assumptions of regression
  analysis. Likewise, a binary dependent is not normally distributed, so OLS
  estimates of the sums of squares are misleading, and therefore the
  significance tests and the standard error of regression are wrong. Also,
  for a dependent which assumes values of 0 and 1, the regression model will
  allow estimates below 0 and above 1. Finally, multiple linear regression
  does not handle nonlinear relationships, whereas log-linear methods do.
  These objections to the use of regression with dichotomous dependents
  apply to polytomous dependents as well.
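  The out-of-range prediction problem is easy to see on simulated data.
  Below is a minimal sketch in Python with statsmodels (data and names
  invented for illustration): the OLS fit can produce predicted values
  outside [0, 1], while the logistic fit cannot.

      # Minimal sketch (invented data): OLS vs. logistic on a binary dependent.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      x = rng.normal(size=200)
      p = 1 / (1 + np.exp(-(0.5 + 2 * x)))   # true probabilities
      y = rng.binomial(1, p)                 # binary dependent

      X = sm.add_constant(x)
      ols = sm.OLS(y, X).fit()               # linear probability model
      logit = sm.Logit(y, X).fit(disp=0)     # logistic regression

      print(ols.fittedvalues.min(), ols.fittedvalues.max())  # can fall outside [0, 1]
      print(logit.predict(X).min(), logit.predict(X).max())  # always within (0, 1)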
* What is the SPSS syntax for logistic regression?

  With SPSS 10, logistic regression is found under Analyze - Regression -
  Binary Logistic or Multinomial Logistic.

      LOGISTIC REGRESSION /VARIABLES income WITH age SES gender opinion1
       opinion2 region
      /CATEGORICAL=gender, opinion1, opinion2, region
      /CONTRAST(region)=INDICATOR(4)
      /METHOD FSTEP(LR)
      /CLASSPLOT

  Above is the SPSS syntax in simplified form. The dependent variable is the
  variable immediately after the VARIABLES term. The independent variables
  are those immediately after the WITH term. The CATEGORICAL command
  specifies any categorical variables; note these must also be listed in the
  VARIABLES statement. The CONTRAST command tells SPSS which category of a
  categorical variable is to be dropped when it automatically constructs
  dummy variables (here it is the 4th value of "region"; this value is the
  fourth one and is not necessarily coded "4"). The METHOD subcommand sets
  the method of computation, here specified as FSTEP to indicate forward
  stepwise logistic regression. Alternatives are BSTEP (backward stepwise
  logistic regression) and ENTER (enter terms as listed, usually because
  their order is set by theories which the researcher is testing). ENTER is
  the default method. The (LR) term following FSTEP specifies that
  likelihood ratio criteria are to be used in the stepwise addition of
  variables to the model. The /CLASSPLOT option specifies that a histogram
  of predicted probabilities is to be output (see above).
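  For readers working outside SPSS, here is a rough, hypothetical equivalent
  of the model above sketched with Python's statsmodels formula API; it
  assumes a pandas DataFrame df with the same column names and a binary
  income variable, and it reproduces only the ENTER-style fit (forward
  stepwise selection is not built in).

      # Hypothetical statsmodels analogue of the SPSS syntax above (ENTER
      # method only). C(...) declares categoricals; Treatment(reference=...)
      # drops a reference level, but note it names the category *coded* 4,
      # whereas SPSS INDICATOR(4) drops the fourth listed value.
      import numpy as np
      import statsmodels.formula.api as smf

      model = smf.logit(
          "income ~ age + C(gender) + C(opinion1) + C(opinion2)"
          " + C(region, Treatment(reference=4))",
          data=df,                    # assumed DataFrame with these columns
      ).fit()
      print(model.summary())          # coefficients are logits
      print(np.exp(model.params))     # exponentiate for odds ratios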
* Will SPSS's logistic regression procedure handle my categorical variables
  automatically?

  No, at least through Version 8. You must declare your categorical
  variables categorical if they have more than two values.
* Can I handle missing cases the same in logistic regression as in OLS
  regression?

  No. In the linear model assumed by OLS regression, one may choose to
  estimate missing values of a variable by OLS regression on the cases with
  non-missing data. However, the nonlinear model assumed by logistic
  regression requires a full set of data. Therefore SPSS provides only for
  LISTWISE deletion of cases with missing data, using the remaining full
  dataset to calculate logistic parameters.
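  A sketch of what listwise deletion amounts to in practice, assuming a
  pandas DataFrame df whose column names (placeholders here) match the
  model's variables:

      # Listwise deletion: keep only cases with no missing values on any
      # variable in the model (column names are placeholders).
      cols = ["income", "age", "SES", "gender", "opinion1", "opinion2", "region"]
      complete = df[cols].dropna()
      print(len(df) - len(complete), "cases dropped listwise")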
* Is it true for logistic regression, as it is for OLS regression, that the
  beta weight (standardized logit coefficient) for a given independent
  reflects its explanatory power controlling for other variables in the
  equation, and that the betas will change if variables are added to or
  dropped from the equation?

  Yes, the same basic logic applies. This is why it is best in either form
  of regression to compare two or more models for their relative fit to the
  data rather than simply to show the data are not inconsistent with a
  single model. The model, of course, dictates which variables are entered,
  and one uses the ENTER method in SPSS, which is the default method.
* What is the coefficient in logistic regression which corresponds to
  R-square in multiple regression?

  There is no exactly analogous coefficient. See the discussion of
  R_L-squared, above. Cox and Snell's R-square is an attempt to imitate the
  interpretation of multiple R-square, and Nagelkerke's R-square is a
  further modification of the Cox and Snell coefficient to assure that it
  can vary from 0 to 1.
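  Both coefficients can be computed directly from the fitted and null
  log-likelihoods using their standard formulas. A sketch, assuming a fitted
  statsmodels Logit result res:

      # Cox and Snell and Nagelkerke R-square from log-likelihoods.
      import numpy as np

      ll1, ll0, n = res.llf, res.llnull, res.nobs  # fitted LL, null LL, sample size
      cox_snell = 1 - np.exp(2 * (ll0 - ll1) / n)
      nagelkerke = cox_snell / (1 - np.exp(2 * ll0 / n))  # rescaled to a 0-1 range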
* Is there a logistic regression analogy to adjusted R-square in OLS
  regression?

  Yes. R_LA-squared is adjusted R_L-squared, and is similar to adjusted
  R-square in OLS regression. R_LA-squared penalizes R_L-squared for the
  number of independents, on the assumption that R-square will become
  artificially high simply because some independents' chance variations
  "explain" small parts of the variance of the dependent.
  R_LA-squared = (G_M - 2k) / D_O, where k = the number of independents.
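  A sketch of the computation, assuming (the definitions sit outside this
  excerpt, so these readings are assumptions) that G_M is the model
  chi-square and D_O the initial -2 log likelihood, and again assuming a
  fitted statsmodels Logit result res:

      # R_LA-squared = (G_M - 2k) / D_O; symbol meanings assumed as noted above.
      g_m = 2 * (res.llf - res.llnull)   # model chi-square G_M (assumed meaning)
      d_o = -2 * res.llnull              # initial -2LL, D_O (assumed meaning)
      k = res.df_model                   # number of independents
      r_la_squared = (g_m - 2 * k) / d_o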
* Is multicollinearity a problem for logistic regression the way it is for
  multiple linear regression?

  Absolutely. The discussion in "Statnotes" under the "Regression" topic is
  relevant to logistic regression.
* What is the logistic equivalent to the VIF test for multicollinearity in
  OLS regression? Can odds ratios be used?

  The variance inflation factor (VIF) is indeed a problem when high in OLS
  regression. VIF is the reciprocal of tolerance, which is 1 - R-squared,
  where R-squared comes from regressing a given independent on all the
  others. When there is high multicollinearity, that R-squared will be high
  also, so tolerance will be low, and thus VIF will be high. When VIF is
  high, the b and beta weights are unreliable and subject to
  misinterpretation. For typical social science research, where R-squared is
  often not higher than .75, inflation of the standard error of b (or beta)
  will be no higher than about 50%.

  As there is no direct counterpart to R-squared in logistic regression, VIF
  is not computed (that I have seen, though obviously one could apply the
  same logic to various pseudo-R-squared measures).

  The odds ratio is a measure of association, consisting of one odds divided
  by another odds. Odds ratios below 1.0 are associated with decreases in
  the dependent variable, while odds ratios above 1.0 are associated with
  increases. Note the asymmetry: 0 to 1 for decreases, 1 to infinity for
  increases. To eliminate this asymmetry, we compute the logit of the
  dependent variable, the natural logarithm of the odds. On the log scale
  the asymmetry disappears: the logarithm of the odds ratio is negative and
  increasingly large in magnitude as the odds ratio decreases from 1 to 0,
  and increasingly large and positive as the odds ratio increases from 1 to
  infinity (for example, odds ratios of 2 and 0.5 map to +0.69 and -0.69).

  The larger the magnitude of the logit in either direction, the stronger
  the association of the independent variable with the dependent. To compare
  effects across multiple independents, one would use the standardized
  logit, much like betas in OLS regression. Interpretation of this could be
  unreliable if multicollinearity is high.

  To the extent that one independent is linearly related to another
  independent, multicollinearity could be a problem in logistic regression.
  However, unlike OLS regression, logistic regression does not assume
  linearity of relationship among independents. The Box-Tidwell
  transformation and orthogonal polynomial contrasts are ways of testing
  linearity among the independents.

  A high odds ratio would not be evidence of multicollinearity in itself.
  Unfortunately, I am not aware of a VIF-type test for logistic regression,
  and I would think that the same obstacles would exist as for creating a
  true equivalent to OLS R-squared. A VIF check on the independents
  themselves can still be run, as sketched below.
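  Since VIF depends only on relations among the independents, the usual
  OLS-style check can be run regardless of the dependent. A sketch assuming
  a pandas DataFrame X containing just the independents:

      # OLS-style VIF check on the independents (VIF = 1 / (1 - R-squared)
      # from regressing each independent on the others).
      import statsmodels.api as sm
      from statsmodels.stats.outliers_influence import variance_inflation_factor

      Xc = sm.add_constant(X)          # X: DataFrame of independents (assumed)
      for i, name in enumerate(Xc.columns):
          if name != "const":
              print(name, variance_inflation_factor(Xc.values, i))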
* How does one test to see if the assumption of linearity in the logit is
  met for each of the independents?

  - Box-Tidwell transformation: Add to the logistic model interaction terms
    which are the crossproduct of each independent times its natural
    logarithm [(X)ln(X)]. If these transformed terms are significant, then
    there is nonlinearity in the logit. This method is not sensitive to
    small nonlinearities (see the sketch after this list).

  - Orthogonal polynomial contrasts, an option in SPSS, may be used. This
    option treats each independent as a categorical variable and computes
    logit (effect) coefficients for each category, testing for linear,
    quadratic, cubic, or higher-order effects. The logit should not change
    over the contrasts. This method is not appropriate when the independent
    has a large number of values, which inflates the standard errors of the
    contrasts.
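  A sketch of the Box-Tidwell check in Python with statsmodels, assuming a
  DataFrame df with a binary dependent y and positive-valued independents
  age and ses (all names invented; ln(X) requires X > 0):

      # Box-Tidwell: add (X)ln(X) terms and test their significance.
      import numpy as np
      import statsmodels.formula.api as smf

      df["age_ln"] = df["age"] * np.log(df["age"])   # requires age > 0
      df["ses_ln"] = df["ses"] * np.log(df["ses"])   # requires ses > 0

      bt = smf.logit("y ~ age + ses + age_ln + ses_ln", data=df).fit(disp=0)
      print(bt.pvalues[["age_ln", "ses_ln"]])  # significant => nonlinearity in the logit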
* How can one use estimated variance of residuals to test for model
  misspecification?

  The misspecification problem may be assessed by comparing the expected
  variance of the residuals with the observed variance. Since logistic
  regression assumes binomial errors, the estimated variance of y is
  m(1 - m), where m is the estimated mean of y. "Overdispersion" is when the
  observed variance of the residuals is larger than this expected variance.
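  One common way to operationalize this comparison is the Pearson dispersion
  ratio (a standard diagnostic, not necessarily the notes' own procedure); a
  sketch assuming a fitted statsmodels Logit result res and a binary outcome
  array y:

      # Expected binomial variance m(1 - m) vs. observed squared residuals,
      # summarized as a dispersion ratio; values well above 1 suggest
      # overdispersion.
      import numpy as np

      m = res.predict()                        # estimated means (fitted probabilities)
      pearson = (y - m) / np.sqrt(m * (1 - m))
      dispersion = np.sum(pearson ** 2) / res.df_resid
      print(dispersion)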
