Compute estimated marginal means (EMMs) for specified factors or factor combinations in a linear model; and optionally, comparisons or contrasts among them. EMMs are also known as least-squares means.
Usage
emmeans(object, specs, by = NULL, fac.reduce = function(coefs) apply(coefs,
2, mean), contr, options = get_emm_option("emmeans"), weights, offset, ...,
tran)
Arguments
- object
An object of class
emmGrid
; or a fitted model object that is supported, such as the result of a call tolm
orlmer
. Many fitted-model objects are supported; seevignette("models", "emmeans")
for details.- specs
A
character
vector specifying the names of the predictors over which EMMs are desired.specs
may also be aformula
or alist
(optionally named) of validspec
s. Use of formulas is described in the Overview section below. Note: We recommend against using two-sided formulas; see the note below forcontr
.- by
A character vector specifying the names of predictors to condition on.
- fac.reduce
A function that combines the rows of a matrix into a single vector. This implements the “marginal averaging” aspect of EMMs. The default is the mean of the rows. Typically if it is overridden, it would be some kind of weighted mean of the rows. If
fac.reduce
is nonlinear, bizarre results are likely, and EMMs will not be interpretable. NOTE: If theweights
argument is non-missing,fac.reduce
is ignored.- contr
A character value or
list
specifying contrasts to be added. Seecontrast
. Note:contr
is ignored whenspecs
is a formula. Note 2:: We recommend against using this argument; obtaining means and obtaining contrasts are two different things, and it is best to do them in separate steps, using thecontrast
function for the contrasts.- options
If non-
NULL
, a namedlist
of arguments to pass toupdate.emmGrid
, just after the object is constructed. (Options may also be included in...
; see the ‘options’ section below.)- weights
Character value, numeric vector, or numeric matrix specifying weights to use in averaging predictions. See “Weights” section below. Also, if
object
is not already a reference grid,weights
(if it is character) is passed toref_grid
aswt.nuis
in case nuisance factors are specified. We can override this by specifyingwt.nuis
explicitly. This more-or-less makes the weighting of nuisance factors consistent with that of primary factors.- offset
Numeric vector or scalar. If specified, this adds an offset to the predictions, or overrides any offset in the model or its reference grid. If a vector of length differing from the number of rows in the result, it is subsetted or cyclically recycled.
- ...
When
object
is not already a"emmGrid"
object, these arguments are passed toref_grid
. Common examples areat
,cov.reduce
,data
,type
,regrid
,df
,nesting
, andvcov.
. Model-type-specific options (seevignette("models", "emmeans")
), commonlymode
, may be used here as well. In addition, if the model formula contains references to variables that are not predictors, you must provide aparams
argument with a list of their names. These arguments may also be used in lieu ofoptions
. See the ‘Options’ section below.- tran
Placeholder to prevent it from being included in
...
. If non-missing, it is added to `options`. See the ‘Options’ section.
Value
When specs
is a character
vector or one-sided formula,
an object of class "emmGrid"
. A number of methods
are provided for further analysis, including
summary.emmGrid
, confint.emmGrid
,
test.emmGrid
, contrast.emmGrid
,
and pairs.emmGrid
.
When specs
is a list
or a formula
having a left-hand
side, the return value is an emm_list
object, which is simply a
list
of emmGrid
objects.
Details
Users should also consult the documentation for ref_grid
,
because many important options for EMMs are implemented there, via the
...
argument.
Overview
Estimated marginal means or EMMs (sometimes called least-squares means) are
predictions from a linear model over a reference grid; or marginal
averages thereof. The ref_grid
function identifies/creates the
reference grid upon which emmeans
is based.
For those who prefer the terms “least-squares means” or
“predicted marginal means”, functions lsmeans
and
pmmeans
are provided as wrappers. See wrappers
.
If specs
is a formula
, it should be of the form ~ specs
,
~ specs | by
, contr ~ specs
, or contr ~ specs | by
. The
formula is parsed and the variables therein are used as the arguments
specs
, by
, and contr
as indicated. The left-hand side is
optional (and we don't recommend it), but if specified it should be the
name of a contrast family (e.g., pairwise
). Operators like
*
or :
are needed in the formula to delineate names, but
otherwise are ignored.
In the special case where the mean (or weighted mean) of all the predictions
is desired, specify specs
as ~ 1
or "1"
.
A number of standard contrast families are provided. They can be identified
as functions having names ending in .emmc
– see the documentation
for emmc-functions
for details – including how to write your
own .emmc
function for custom contrasts.
Weights
If weights
is a vector, its length must equal
the number of predictions to be averaged to obtain each EMM.
If a matrix, each row of the matrix is used in turn, wrapping back to the
first row as needed. When in doubt about what is being averaged (or how
many), first call emmeans
with weights = "show.levels"
.
If weights
is a string, it should partially match one of the following:
"equal"
Use an equally weighted average.
"proportional"
Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.
"outer"
Weight in proportion to each individual factor's marginal frequencies. Thus, the weights for a combination of factors are the outer product of the one-factor margins
"cells"
Weight according to the frequencies of the cells being averaged.
"flat"
Give equal weight to all cells with data, and ignore empty cells.
"show.levels"
This is a convenience feature for understanding what is being averaged over. Instead of a table of EMMs, this causes the function to return a table showing the levels that are averaged over, in the order that they appear.
Outer weights are like the 'expected' counts in a chi-square test of
independence, and will yield the same results as those obtained by
proportional averaging with one factor at a time. All except "cells"
uses the same set of weights for each mean. In a model where the predicted
values are the cell means, cell weights will yield the raw averages of the
data for the factors involved. Using "flat"
is similar to
"cells"
, except nonempty cells are weighted equally and empty cells
are ignored.
Offsets
Unlike in ref_grid
, an offset need not be scalar. If not enough values
are supplied, they are cyclically recycled. For a vector of offsets, it is
important to understand that the ordering of results goes with the first
name in specs
varying fastest. If there are any by
factors,
those vary slower than all the primary ones, but the first by
variable
varies the fastest within that hierarchy. See the examples.
Options and ...
Arguments that could go in options
may instead be included in ...
,
typically, arguments such as type
, infer
, etc. that in essence
are passed to summary.emmGrid
. Arguments in both places are
overridden by the ones in ...
.
There is a danger that ...
arguments could partially match those used
by both ref_grid
and update.emmGrid
, creating a conflict.
If these occur, usually they can be resolved by providing complete (or at least
longer) argument names; or by isolating non-ref_grid
arguments in
options
; or by calling ref_grid
separately and passing the
result as object
. See a not-run example below.
Also, when specs
is a two-sided formula, or contr
is specified,
there is potential confusion concerning which ...
arguments
apply to the means, and which to the contrasts. When such confusion is possible,
we suggest doing things separately
(a call to emmeans
with no contrasts, followed by a call to
contrast
). We treat
adjust
as a special case: it is applied to the emmeans
results
only if there are
no contrasts specified, otherwise it is passed only to contrast
.
Examples
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
emmeans (warp.lm, ~ wool | tension)
#> tension = L:
#> wool emmean SE df lower.CL upper.CL
#> A 44.6 3.65 48 37.2 51.9
#> B 28.2 3.65 48 20.9 35.6
#>
#> tension = M:
#> wool emmean SE df lower.CL upper.CL
#> A 24.0 3.65 48 16.7 31.3
#> B 28.8 3.65 48 21.4 36.1
#>
#> tension = H:
#> wool emmean SE df lower.CL upper.CL
#> A 24.6 3.65 48 17.2 31.9
#> B 18.8 3.65 48 11.4 26.1
#>
#> Confidence level used: 0.95
# or equivalently emmeans(warp.lm, "wool", by = "tension")
# 'adjust' argument ignored in emmeans, passed to contrast part...
emmeans (warp.lm, poly ~ tension | wool, adjust = "sidak")
#> $emmeans
#> wool = A:
#> tension emmean SE df lower.CL upper.CL
#> L 44.6 3.65 48 37.2 51.9
#> M 24.0 3.65 48 16.7 31.3
#> H 24.6 3.65 48 17.2 31.9
#>
#> wool = B:
#> tension emmean SE df lower.CL upper.CL
#> L 28.2 3.65 48 20.9 35.6
#> M 28.8 3.65 48 21.4 36.1
#> H 18.8 3.65 48 11.4 26.1
#>
#> Confidence level used: 0.95
#>
#> $contrasts
#> wool = A:
#> contrast estimate SE df t.ratio p.value
#> linear -20.00 5.16 48 -3.878 0.0006
#> quadratic 21.11 8.93 48 2.363 0.0439
#>
#> wool = B:
#> contrast estimate SE df t.ratio p.value
#> linear -9.44 5.16 48 -1.831 0.1412
#> quadratic -10.56 8.93 48 -1.182 0.4272
#>
#> P value adjustment: sidak method for 2 tests
#>
# 'adjust' argument NOT ignored ...
emmeans (warp.lm, ~ tension | wool, adjust = "sidak")
#> wool = A:
#> tension emmean SE df lower.CL upper.CL
#> L 44.6 3.65 48 35.53 53.6
#> M 24.0 3.65 48 14.98 33.0
#> H 24.6 3.65 48 15.53 33.6
#>
#> wool = B:
#> tension emmean SE df lower.CL upper.CL
#> L 28.2 3.65 48 19.20 37.2
#> M 28.8 3.65 48 19.76 37.8
#> H 18.8 3.65 48 9.76 27.8
#>
#> Confidence level used: 0.95
#> Conf-level adjustment: sidak method for 3 estimates
if (FALSE) { # \dontrun{
### Offsets: Consider a silly example:
emmeans(warp.lm, ~ tension | wool, offset = c(17, 23, 47)) @ grid
# note that offsets are recycled so that each level of tension receives
# the same offset for each wool.
# But using the same offsets with ~ wool | tension will probably not
# be what you want because the ordering of combinations is different.
} # }