Estimability Tools — nonest.basis • estimability

This documents the functions needed to test estimability of linear functions of regression coefficients.

Usage

nonest.basis(x, ...)
# S3 method for default
nonest.basis(x, ...)
# S3 method for qr
nonest.basis(x, ...)
# S3 method for matrix
nonest.basis(x, ...)
# S3 method for lm
nonest.basis(x, ...)
# S3 method for svd
nonest.basis(x, tol = 5e-8, ...)

legacy.nonest.basis(x, ...)

all.estble

is.estble(x, nbasis, tol = 1e-8)

Arguments

x: For nonest.basis, an object of a class in methods("nonest.basis"). Or, in is.estble, a numeric vector or matrix for assessing estimability of sum(x * beta), where beta is the vector of regression coefficients.
nbasis: Matrix whose columns span the null space of the model matrix. Such a matrix is returned by nonest.basis.
tol: Numeric tolerance for assessing rank or nonestimability. For determining rank, singular values less than tol times the largest singular value are regarded as zero. For determining estimability with a nonzero $x$, $\beta'x$ is assessed by whether or not $||N'x||^2 < \tau ||x'x||^2$, where $N$ and $\tau$ denote nbasis and tol, respectively.
...: Additional arguments passed to other methods.

Details

Consider a linear model $y = X\beta + E$. If $X$ is not of full rank, it is not possible to estimate $\beta$ uniquely. However, $X\beta$ is uniquely estimable, and so is $a'X\beta$ for any conformable vector $a$. Since $a'X$ comprises a linear combination of the rows of $X$, it follows that we can estimate any linear function where the coefficients lie in the row space of $X$. Equivalently, we can check to ensure that the coefficients are orthogonal to the null space of $X$.

The nonest.basis method for class 'svd' is not really functional as a method because there is no "svd" class (at least in R <= 4.2.0). But the function nonest.basis.svd is exported and may be called directly; it works with results of svd or La.svd. We require x$v to be the complete matrix of right singular values; but we do not need x$u at all.

The default method does serve as an svd method, in that it only works if x has the required elements of an SVD result, in which case it passes it to nonest.basis.svd. The matrix method runs nonest.basis.svd(svd(x, nu = 0)). The lm method runs the qr method on x$qr.

The function legacy.nonest.basis is the original default method in early versions of the estimability package. It may be called with x being either a matrix or a qr object, and after obtaining the R matrix, it uses an additional QR decomposition of t(R) to obtain the needed basis. (The current nonest.basis method for qr objects is instead based on the singular-value decomposition of R, and requires much simpler code.)

The constant all.estble is simply a 1 x 1 matrix of NA. This specifies a trivial non-estimability basis, and using it as nbasis will cause everything to test as estimable.

Value

When $X$ is not full-rank, the methods for nonest.basis return a basis for the null space of $X$. The number of rows is equal to the number of regression coefficients (including any NAs); and the number of columns is equal to the rank deficiency of the model matrix. The columns are orthonormal. If the model is full-rank, then nonest.basis returns all.estble. The matrix method uses $X$ itself, the qr method uses the $QR$ decomposition of $X$, and the lm method recovers the required information from the object.

The function is.estble returns a logical value (or vector, if x is a matrix) that is TRUE if the function is estimable and FALSE if not.

References

Monahan, John F. (2008) A Primer on Linear Models, CRC Press. (Chapter 3)

Author

Russell V. Lenth <russell-lenth@uiowa.edu>

Examples

require(estimability)

X <- cbind(rep(1,5), 1:5, 5:1, 2:6)
( nb <- nonest.basis(X) )
#>            [,1]        [,2]
#> [1,]  0.4171348 -0.89010713
#> [2,] -0.7077494 -0.26855996
#> [3,] -0.1606977  0.08879245
#> [4,]  0.5470517  0.35735241

SVD <- svd(X, nu = 0)    # we don't need the U part of UDV'
nonest.basis.svd(SVD)    # same result as above
#>            [,1]        [,2]
#> [1,]  0.4171348 -0.89010713
#> [2,] -0.7077494 -0.26855996
#> [3,] -0.1606977  0.08879245
#> [4,]  0.5470517  0.35735241

# Test estimability of some linear functions for this X matrix
lfs <- rbind(c(1,4,2,5), c(2,3,9,5), c(1,2,2,1), c(0,1,-1,1))
is.estble(lfs, nb)
#> [1]  TRUE  TRUE FALSE  TRUE

# Illustration on 'lm' objects:
warp.lm1 <- lm(breaks ~ wool * tension, data = warpbreaks, 
    subset = -(26:38), 
    contrasts = list(wool = "contr.treatment", tension = "contr.treatment"))
zapsmall(warp.nb1 <- nonest.basis(warp.lm1))
#>            [,1]
#> [1,]  0.0000000
#> [2,] -0.5773503
#> [3,]  0.0000000
#> [4,]  0.0000000
#> [5,]  0.5773503
#> [6,]  0.5773503

warp.lm2 <- update(warp.lm1, 
    contrasts = list(wool = "contr.sum", tension = "contr.helmert"))
zapsmall(warp.nb2 <- nonest.basis(warp.lm2))
#>            [,1]
#> [1,] -0.3779645
#> [2,]  0.3779645
#> [3,]  0.5669467
#> [4,]  0.1889822
#> [5,] -0.5669467
#> [6,] -0.1889822

# These bases look different, but they both correctly identify the empty cell
wcells = with(warpbreaks, expand.grid(wool = levels(wool), tension = levels(tension)))
epredict(warp.lm1, newdata = wcells, nbasis = warp.nb1)
#>        1        2        3        4        5        6 
#> 44.55556       NA 24.00000 27.28571 25.71429 18.77778 
epredict(warp.lm2, newdata = wcells, nbasis = warp.nb2)
#>        1        2        3        4        5        6 
#> 44.55556       NA 24.00000 27.28571 25.71429 18.77778