📄 loglm.html
字号:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>R: Fit Log-Linear Models by Iterative Proportional Scaling</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="../../R.css">
</head><body>
<table width="100%" summary="page for loglm {MASS}"><tr><td>loglm {MASS}</td><td align="right">R Documentation</td></tr></table>
<h2>Fit Log-Linear Models by Iterative Proportional Scaling</h2>
<h3>Description</h3>
<p>
This function provides a front-end to the standard function,
<code>loglin</code>, to allow log-linear models to be specified and fitted
in a manner similar to that of other fitting functions, such as
<code>glm</code>.
</p>
<h3>Usage</h3>
<pre>
loglm(formula, data, subset, na.action, ...)
</pre>
<h3>Arguments</h3>
<table summary="R argblock">
<tr valign="top"><td><code>formula</code></td>
<td>
A linear model formula specifying the log-linear model.
<br>
If the left-hand side is empty, the <code>data</code> argument is required
and must be a (complete) array of frequencies. In this case the
variables on the right-hand side may be the names of the
<code>dimnames</code> attribute of the frequency array, or may be the
positive integers: 1, 2, 3, ... used as alternative names for the
1st, 2nd, 3rd, ... dimension (classifying factor).
If the left-hand side is not empty it specifies a vector of
frequencies. In this case the data argument, if present, must be
a data frame from which the left-hand side vector and the
classifying factors on the right-hand side are (preferentially)
obtained. The usual abbreviation of a <code>.</code> to stand for ‘all
other variables in the data frame’ is allowed. Any non-factors
on the right-hand side of the formula are coerced to factor.
</td></tr>
<tr valign="top"><td><code>data</code></td>
<td>
Numeric array or data frame. In the first case it specifies the
array of frequencies; in then second it provides the data frame
from which the variables occurring in the formula are
preferentially obtained in the usual way.
<br>
This argument may be the result of a call to <code><a href="../../stats/html/xtabs.html">xtabs</a></code>.
</td></tr>
<tr valign="top"><td><code>subset</code></td>
<td>
Specifies a subset of the rows in the data frame to be used. The
default is to take all rows.
</td></tr>
<tr valign="top"><td><code>na.action</code></td>
<td>
Specifies a method for handling missing observations. The
default is to fail if missing values are present.
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
May supply other arguments to the function <code><a href="loglm1.html">loglm1</a></code>.
</td></tr>
</table>
<h3>Details</h3>
<p>
If the left-hand side of the formula is empty the <code>data</code> argument
supplies the frequency array and the right-hand side of the
formula is used to construct the list of fixed faces as required
by <code>loglin</code>. Structural zeros may be specified by giving a
<code>start</code> argument with those entries set to zero, as described in
the help information for <code>loglin</code>.
</p>
<p>
If the left-hand side is not empty, all variables on the
right-hand side are regarded as classifying factors and an array
of frequencies is constructed. If some cells in the complete
array are not specified they are treated as structural zeros.
The right-hand side of the formula is again used to construct the
list of faces on which the observed and fitted totals must agree,
as required by <code>loglin</code>. Hence terms such as
<code>a:b</code>, <code>a*b</code> and <code>a/b</code> are all equivalent.
</p>
<h3>Value</h3>
<p>
An object of class <code>"loglm"</code> conveying the results of the fitted
log-linear model. Methods exist for the generic functions <code>print</code>,
<code>summary</code>, <code>deviance</code>, <code>fitted</code>, <code>coef</code>,
<code>resid</code>, <code>anova</code> and <code>update</code>, which perform the expected
tasks. Only log-likelihood ratio tests are allowed using <code>anova</code>.
<br>
The deviance is simply an alternative name for the log-likelihood
ratio statistic for testing the current model within a saturated
model, in accordance with standard usage in generalized linear
models.</p>
<h3>Warning</h3>
<p>
If structural zeros are present, the calculation of degrees of
freedom may not be correct. <code>loglin</code> itself takes no action to
allow for structural zeros. <code>loglm</code> deducts one degree of
freedom for each structural zero, but cannot make allowance for
gains in error degrees of freedom due to loss of dimension in the
model space. (This would require checking the rank of the
model matrix, but since iterative proportional scaling methods
are developed largely to avoid constructing the model matrix
explicitly, the computation is at least difficult.)
</p>
<p>
When structural zeros (or zero fitted values) are present the
estimated coefficients will not be available due to infinite
estimates. The deviances will normally continue to be correct, though.
</p>
<h3>References</h3>
<p>
Venables, W. N. and Ripley, B. D. (2002)
<EM>Modern Applied Statistics with S.</EM> Fourth edition. Springer.
</p>
<h3>See Also</h3>
<p>
<code><a href="loglm1.html">loglm1</a></code>, <code><a href="../../stats/html/loglin.html">loglin</a></code>
</p>
<h3>Examples</h3>
<pre>
# The data frames Cars93, minn38 and quine are available
# in the MASS package.
# Case 1: frequencies specified as an array.
sapply(minn38, function(x) length(levels(x)))
## hs phs fol sex f
## 3 4 7 2 0
minn38a <- array(0, c(3,4,7,2), lapply(minn38[, -5], levels))
minn38a[data.matrix(minn38[,-5])] <- minn38$fol
fm <- loglm(~1 + 2 + 3 + 4, minn38a) # numerals as names.
deviance(fm)
##[1] 3711.9
fm1 <- update(fm, .~.^2)
fm2 <- update(fm, .~.^3, print = TRUE)
## 5 iterations: deviation 0.0750732
anova(fm, fm1, fm2)
## Not run: LR tests for hierarchical log-linear models
Model 1:
~ 1 + 2 + 3 + 4
Model 2:
. ~ 1 + 2 + 3 + 4 + 1:2 + 1:3 + 1:4 + 2:3 + 2:4 + 3:4
Model 3:
. ~ 1 + 2 + 3 + 4 + 1:2 + 1:3 + 1:4 + 2:3 + 2:4 + 3:4 +
1:2:3 + 1:2:4 + 1:3:4 + 2:3:4
Deviance df Delta(Dev) Delta(df) P(> Delta(Dev)
Model 1 3711.915 155
Model 2 220.043 108 3491.873 47 0.00000
Model 3 47.745 36 172.298 72 0.00000
Saturated 0.000 0 47.745 36 0.09114
## End(Not run)
# Case 1. An array generated with xtabs.
loglm(~ Type + Origin, xtabs(~ Type + Origin, Cars93))
## Not run: Call:
loglm(formula = ~Type + Origin, data = xtabs(~Type + Origin,
Cars93))
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 18.362 5 0.0025255
Pearson 14.080 5 0.0151101
## End(Not run)
# Case 2. Frequencies given as a vector in a data frame
names(quine)
## [1] "Eth" "Sex" "Age" "Lrn" "Days"
fm <- loglm(Days ~ .^2, quine)
gm <- glm(Days ~ .^2, poisson, quine) # check glm.
c(deviance(fm), deviance(gm)) # deviances agree
## [1] 1368.7 1368.7
c(fm$df, gm$df) # resid df do not!
c(fm$df, gm$df.residual) # resid df do not!
## [1] 127 128
# The loglm residual degrees of freedom is wrong because of
# a non-detectable redundancy in the model matrix.
</pre>
<hr><div align="center">[Package <em>MASS</em> version 7.2-44 <a href="00Index.html">Index]</a></div>
</body></html>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -