step {base} | R Documentation |
Select a formula-based model by AIC.
step(object, scope, scale = 0, direction = c("both", "backward", "forward"), trace = 1, keep = NULL, steps = 1000, k = 2, ...)
object |
an object representing a model of an appropriate class. This is used as the initial model in the stepwise search. |
scope |
defines the range of models examined in the stepwise search. |
scale |
used in the definition of the AIC statistic for selecting the models, currently only for lm, aov and glm models. |
direction |
the mode of stepwise search, can be one of "both", "backward", or "forward", with a default of "both". If the scope argument is missing, the default for direction is "backward". |
trace |
if positive, information is printed during the running of step. Larger values may give more detailed information. |
keep |
a filter function whose input is a fitted model object and the
associated AIC statistic, and whose output is arbitrary.
Typically keep will select a subset of the components of
the object and return them. The default is not to keep anything.
|
steps |
the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. |
k |
the multiple of the number of degrees of freedom used for the penalty.
Only k = 2 gives the genuine AIC: k = log(n) is sometimes
referred to as BIC or SBC.
|
... |
any additional arguments to extractAIC.
|
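As a minimal sketch of how scope, direction and k fit together, the following uses the swiss data set from the Examples. It assumes, as described for step in current R releases, that scope may be given as a list with lower and upper formulas; the object names null_fit, full_fit, fwd and bic_fit are illustrative only.

```r
# Illustrative sketch; swiss ships with R in the datasets package.
null_fit <- lm(Fertility ~ 1, data = swiss)  # intercept-only starting model
full_fit <- lm(Fertility ~ ., data = swiss)  # model with every predictor

# Forward search from the null model. The scope list bounds the search:
# no model smaller than 'lower' or larger than 'upper' is considered.
fwd <- step(null_fit,
            scope = list(lower = formula(null_fit),
                         upper = formula(full_fit)),
            direction = "forward", trace = 0)

# Backward search with the BIC-style penalty k = log(n).
n <- nrow(swiss)
bic_fit <- step(full_fit, direction = "backward", k = log(n), trace = 0)

formula(fwd)
formula(bic_fit)
```

Setting trace = 0 only suppresses the printed tables; the search itself is unchanged.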
step uses add1 and drop1 repeatedly; it will work for any method for which they work, and that is determined by having a valid method for extractAIC.
When the additive constant can be chosen so that AIC is equal to
Mallows' Cp, this is done and the tables are labelled
appropriately.
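The repeated use of drop1 can be illustrated by carrying out a single backward move by hand. This is only a sketch of the idea for an lm fit, not the internal code of step; the names fit, tab, best and fit2 are illustrative.

```r
fit <- lm(Fertility ~ ., data = swiss)

# drop1() tabulates the AIC of the current model (row "<none>") and of
# every model obtained by removing one term.
tab <- drop1(fit, k = 2)
print(tab)

# The row with the smallest AIC is the move a backward step would take.
best <- rownames(tab)[which.min(tab$AIC)]

# Refit without that term, unless keeping the current model is best.
fit2 <- if (best != "<none>") {
  update(fit, as.formula(paste(". ~ . -", best)))
} else {
  fit
}

# extractAIC() returns the equivalent degrees of freedom and the AIC.
extractAIC(fit)
extractAIC(fit2)
```

In a "both" search, add1 would be consulted in the same way for terms in the scope that are not yet in the model.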
There is a potential problem in using glm fits with a variable scale, as in that case the deviance is not simply related to the maximized log-likelihood. The function extractAIC.glm makes the appropriate adjustment for a gaussian family, but may need to be amended for other cases. (The binomial and poisson families have fixed scale by default and do not correspond to a particular maximum-likelihood problem for variable scale.)
The stepwise-selected model is returned, with up to two additional components. There is an "anova" component corresponding to the steps taken in the search, as well as a "keep" component if the keep= argument was supplied in the call. The "Resid. Dev" column of the analysis of deviance table refers to a constant minus twice the maximized log likelihood: it will be a deviance only in cases where a saturated model is well-defined (thus excluding lm, aov and survreg fits, for example).
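The two extra components can be inspected as in the sketch below. The filter keep_fn is a hypothetical example that records the number of terms and the AIC of each model visited; any function of a fitted model and its AIC could be used instead.

```r
fit <- lm(Fertility ~ ., data = swiss)

# Hypothetical filter: keep the number of terms and the AIC of each model.
keep_fn <- function(model, aic) {
  c(nterms = length(attr(terms(model), "term.labels")), AIC = aic)
}

sel <- step(fit, trace = 0, keep = keep_fn)

sel$anova  # table of the steps taken in the search
sel$keep   # values returned by keep_fn, one column per model visited
```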
The model fitting must apply the models to the same dataset. This may be a problem if there are missing values and R's default of na.action = na.omit is used. We suggest you remove the missing values first.
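One way to follow this suggestion is to drop incomplete rows once, before the initial fit, so that every candidate model sees the same observations. The sketch below uses a hypothetical copy of swiss, named sw, with a few values set to NA purely for illustration.

```r
# Hypothetical data: swiss with three values blanked out for illustration.
sw <- swiss
sw$Education[c(3, 10)] <- NA
sw$Catholic[20] <- NA

# Remove incomplete rows up front, so dropping or adding a term cannot
# change which rows enter the fit.
sw_complete <- na.omit(sw)

fit <- lm(Fertility ~ ., data = sw_complete)
sel <- step(fit, trace = 0)

# Both the initial and the selected model use the same rows.
nobs(fit) == nobs(sel)
```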
This function differs considerably from the function in S, which uses a number of approximations and does not compute the correct AIC.
B. D. Ripley
example(lm)
step(lm.D9)
data(swiss)
summary(lm1 <- lm(Fertility ~ ., data = swiss))
slm1 <- step(lm1)
summary(slm1)
slm1$anova