The make.tran
function creates the needed information to perform
transformations of the response
variable, including inverting the transformation and estimating variances of
back-transformed predictions via the delta method. make.tran
is
similar to make.link
, but it covers additional transformations.
The result can be used as an environment in which the model is fitted, or as
the tran
argument in update.emmGrid
(when the given
transformation was already applied in an existing model).
Usage
make.tran(type = c("genlog", "power", "boxcox", "sympower", "asin.sqrt",
"atanh", "bcnPower", "scale"), alpha = 1, beta = 0, param, y, inner, ...)
inverse(y)
Arguments
- type
The name of a standard transformation supported by
stat::make.link
, or of a special transformation described under Details.- alpha, beta
Numeric parameters needed for special transformations.
- param
If non-missing, this specifies either
alpha
orc(alpha, beta)
(provided for backward compatibility). Also, for the same reason, ifalpha
is of length more than 1, it is taken asparam
.- y
A numeric response variable used (and required) with
type = "scale"
, wherescale(y)
determinesalpha
andbeta
.- inner
another transformation. See the section on compound transformations
- ...
Additional arguments passed to other functions/methods
Value
A list
having at least the same elements as those returned by
make.link
. The linkfun
component is the transformation
itself. Each of the functions is associated with an environment where any
parameter values are defined.
inverse
returns the reciprocal of its argument. It allows
the "inverse"
link to be auto-detected as a response transformation.
Note
The genlog
transformation is technically unneeded, because
a response transformation of the form log(y + c)
is now auto-detected
by ref_grid
.
We modify certain make.link
results in transformations
where there is a restriction on valid prediction values, so that reasonable
inverse predictions are obtained, no matter what. For example, if a
sqrt
transformation was used but a predicted value is negative, the
inverse transformation is zero rather than the square of the prediction. A
side effect of this is that it is possible for one or both confidence
limits, or even a standard error, to be zero.
Details
The make.tran
function returns a
suitable list of functions for several popular transformations. Besides being
usable with update
, the user may use this list as an enclosing
environment in fitting the model itself, in which case the transformation is
auto-detected when the special name linkfun
(the transformation
itself) is used as the response transformation in the call. See the examples
below.
The primary purpose of make.tran
is to support transformations that
require additional parameters, specified as alpha
and beta
;
these are the onse shown in the argument-matching list. However, standard
transformations supported by stats::make.link
are also supported.
In the following discussion of ones requiring parameters,
we use \(\alpha\) and \(\beta\) to
denote alpha
and beta
, and \(y\) to denote the response variable.
The type
argument specifies the following transformations:
"genlog"
Generalized logarithmic transformation: \(\log_\beta(y + \alpha)\), where \(y > -\alpha\). When \(\beta = 0\) (the default), we use \(\log_e(y + \alpha)\)
"power"
Power transformation: \((y-\beta)^\alpha\), where \(y > \beta\). When \(\alpha = 0\), \(\log(y-\beta)\) is used instead.
"boxcox"
The Box-Cox transformation (unscaled by the geometric mean): \(((y - \beta)^\alpha - 1) / \alpha\), where \(y > \beta\). When \(\alpha = 0\), \(\log(y - \beta)\) is used.
"sympower"
A symmetrized power transformation on the whole real line: \(|y - \beta|^\alpha\cdot sign(y - \beta)\). There are no restrictions on \(y\), but we require \(\alpha > 0\) in order for the transformation to be monotone and continuous.
"asin.sqrt"
Arcsin-square-root transformation: \(\sin^{-1}(y/\alpha)^{1/2}\). Typically,
alpha
will be either 1 (default) or 100."atanh"
Arctanh transformation: \(\tanh^{-1}(y/\alpha)\). Typically,
alpha
will be either 1 (default) or 100."bcnPower"
Box-Cox with negatives allowed, as described for the
bcnPower
function in the car package. It is defined as the Box-Cox transformation \((z^\alpha - 1) / \alpha\) of the variable \(z = y + (y^2+\beta^2)^{1/2}\). Note that this requires both parameters and thatbeta > 0
."scale"
This one is a little different than the others, in that
alpha
andbeta
are ignored; instead, they are determined by callingscale(y, ...)
. The user should give asy
the response variable in the model to be fitted to its scaled version.
Note that with the "power"
, "boxcox"
, or "sympower"
transformations,
the argument beta
specifies a location shift.
In the "genpower"
transformation, beta
specifies
the base of the logarithm – however, quirkily, the default of beta = 0
is taken to be the natural logarithm. For example,
make.tran(0.5, 10)
sets up the \(\log_{10}(y + \frac12)\)
transformation. In the "bcnPower"
transformation, beta
must be specified as a positive value.
For purposes of back-transformation, the sqrt(y) + sqrt(y+1) transformation is treated exactly the same way as 2*sqrt(y), because both are regarded as estimates of \(2\sqrt\mu\).
Cases where make.tran
may not be needed
For standard transformations with no parameters, we usually don't need to use
make.tran
; just the name of the transformation is all that is needed.
The functions emmeans
, ref_grid
, and related ones
automatically detect response transformations that are recognized by
examining the model formula. These are log
, log2
, log10
,
log1p
,
sqrt
, logit
, probit
, cauchit
, cloglog
; as
well as (for a response variable y
) asin(sqrt(y))
,
asinh(sqrt(y))
, atanh(y)
, and sqrt(y) + sqrt(y+1)
.
In addition, any
constant multiple of these (e.g., 2*sqrt(y)
) is auto-detected and
appropriately scaled (see also the tran.mult
argument in
update.emmGrid
).
A few additional transformations may be specified as character strings and
are auto-detected: "identity"
, "1/mu^2"
,
"inverse"
, "reciprocal"
, "log10"
, "log2"
,
"asin.sqrt"
, "asinh.sqrt"
, and "atanh"
.
Compound transformations
A transformation that is a function of another function can be created by
specifying inner
for the other function. For example, the
transformation \(1/\sqrt{y}\) can be created either by
make.tran("inverse", inner = "sqrt")
or by make.tran("power",
-0.5)
. In principle, transformations can be compounded to any depth.
Also, if type
is "scale"
, y
is replaced by
inner$linkfun(y)
, because that will be the variable that is scaled.
Examples
# Fit a model using an oddball transformation:
bctran <- make.tran("boxcox", 0.368)
warp.bc <- with(bctran,
lm(linkfun(breaks) ~ wool * tension, data = warpbreaks))
# Obtain back-transformed LS means:
emmeans(warp.bc, ~ tension | wool, type = "response")
#> wool = A:
#> tension response SE df lower.CL upper.CL
#> L 42.4 4.48 48 34.0 52.0
#> M 23.1 3.05 48 17.5 29.8
#> H 23.3 3.07 48 17.7 30.0
#>
#> wool = B:
#> tension response SE df lower.CL upper.CL
#> L 27.2 3.38 48 20.9 34.6
#> M 27.9 3.44 48 21.5 35.3
#> H 18.4 2.65 48 13.6 24.3
#>
#> Confidence level used: 0.95
#> Intervals are back-transformed from the Box-Cox (lambda = 0.368) scale
### Using a scaled response...
# Case where it is auto-detected:
mod <- lm(scale(yield[, 1]) ~ Variety, data = MOats)
emmeans(mod, "Variety", type = "response")
#> Variety response SE df lower.CL upper.CL
#> Golden Rain 80.0 7.96 15 63.0 97.0
#> Marvellous 86.7 7.96 15 69.7 103.6
#> Victory 71.5 7.96 15 54.5 88.5
#>
#> Confidence level used: 0.95
#> Intervals are back-transformed from the scale(79.4, 19.4) scale
# Case where scaling is not auto-detected -- and what to do about it:
copt <- options(contrasts = c("contr.sum", "contr.poly"))
mod.aov <- aov(scale(yield[, 1]) ~ Variety + Error(Block), data = MOats)
emm.aov <- suppressWarnings(emmeans(mod.aov, "Variety", type = "response"))
#> NOTE: Unable to recover scale() parameters. See '? make.tran'
# Scaling was not retrieved, but we can do:
emm.aov <- update(emm.aov, tran = make.tran("scale", y = MOats$yield[, 1]))
emmeans(emm.aov, "Variety", type = "response")
#> Variety response SE df lower.CL upper.CL
#> Golden Rain 80.0 7.96 9.8 62.2 97.8
#> Marvellous 86.7 7.96 9.8 68.9 104.5
#> Victory 71.5 7.96 9.8 53.7 89.3
#>
#> Warning: EMMs are biased unless design is perfectly balanced
#> Confidence level used: 0.95
#> Intervals are back-transformed from the scale(79.4, 19.4) scale
### Compound transformations
# The following amount to the same thing:
t1 <- make.tran("inverse", inner = "sqrt")
t2 <- make.tran("power", -0.5)
options(copt)
if (FALSE) { # \dontrun{
### An existing model 'mod' was fitted with a y^(2/3) transformation...
ptran = make.tran("power", 2/3)
emmeans(mod, "treatment", tran = ptran)
} # }
pigs.lm <- lm(inverse(conc) ~ source + factor(percent), data = pigs)
emmeans(pigs.lm, "source", type = "response")
#> source response SE df lower.CL upper.CL
#> fish 29.7 0.816 23 28.1 31.5
#> soy 39.0 1.440 23 36.2 42.2
#> skim 43.8 1.900 23 40.1 48.1
#>
#> Results are averaged over the levels of: percent
#> Confidence level used: 0.95
#> Intervals are back-transformed from the inverse scale