Functions for coded data
These functions facilitate the use of coded data in response-surface analysis.
Usage
coded.data(data, ..., formulas = list(...), block = "block")
as.coded.data(data, ..., formulas = list(...), block = "block")
decode.data(data)
recode.data(data, ..., formulas = list(...))
val2code(X, codings)
code2val(X, codings)
# S3 method for coded.data
print(x, ..., decode = TRUE)
### --- Methods for managing coded data ---
is.coded.data(x)
# S3 method for coded.data
[(x, ...)
codings(object)
# S3 method for coded.data
codings(object)
codings(object) <- value
# S3 method for coded.data
names(x) <- value
## Generic method for true variable names (i.e. decoded names)
truenames(x)
# S3 method for coded.data
truenames(x)
## Generic replacement method for truenames
truenames(x) <- value
# S3 method for coded.data
truenames(x) <- value
Arguments
- data
A data.frame
- formulas
List of coding formulas; see Details
- block
Name(s) of blocking variable(s). It is matched using pmatch (case insensitively) against the names in data to identify blocking factors.
- X
A vector, matrix, or data.frame to be coded or decoded.
- codings
A list of formulas; see Details
- decode
Logical. If TRUE, the decoded values are displayed; if FALSE, the codings are displayed.
- object
A coded.data object
- x
A coded.data object
- value
Replacement value for <- methods
- ...
In coded.data, as.coded.data, and recode.data, ... allows specifying formulas as arguments rather than as a list. In other functions, ... is passed to the parent methods.
Details
Typically, coding formulas are of the form x ~ (var - center) / halfwd, where x and var are variable names, and center and halfwd are numbers.
The left-hand side gives the name of the coded variable, and the right-hand side should be a linear expression in the uncoded variable (linearity is not explicitly checked, but nonlinear expressions will not decode correctly). If coded.data is called without formulas, automatic codings are created (along with a warning message). Automatic codings are based on transforming all non-block variables having five or fewer unique values to the interval [-1,1]. If no formulas are provided in as.coded.data, default coding formulas like those for cube are created for all numeric variables with mean zero, again with a warning message.
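As a small illustration of the arithmetic behind a coding formula, the following base-R sketch applies x1 ~ (Time - 85)/5 by hand and then inverts it. The values are chosen for illustration only, and no rsm function is called.

```r
# Hand-computed coding and decoding for x1 ~ (Time - 85)/5
center <- 85   # center of the uncoded variable
halfwd <- 5    # half-width: one coded unit equals 5 uncoded units

Time <- c(80, 85, 90, 92.5)

# Coding: uncoded -> coded
x1 <- (Time - center) / halfwd

# Decoding: coded -> uncoded (well defined because the formula is linear)
Time_back <- center + halfwd * x1

print(x1)
print(all.equal(Time_back, Time))
```

Because the right-hand side is linear in Time, decoding is just the inverse linear map; a nonlinear right-hand side would have no such simple inverse.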
An S3 print method is provided for the coded.data class; it displays the data.frame in either coded or decoded form, along with the coding formulas. Some users may prefer print.data.frame or as.data.frame in lieu of print with decode=FALSE; they produce the same output without displaying the coding formulas.
Use coded.data to convert a data.frame in which the variables are on their original scales. The variables named in the formulas are coded and replaced with their coded versions (and also renamed).
In contrast, as.coded.data does not modify any of the data; it assumes the variables are already coded, and the coding information is simply added. In addition, if data is already a coded.data object from a pre-1.41 version of rsm, it is converted to be compatible with new capabilities such as djoin (no formulas argument is needed in this case). Any blocking factors should be specified in the block argument.
decode.data converts a dataset of class coded.data and returns a data.frame containing the original variables.
recode.data is used to convert a coded.data object to new codings.
Important: this changes the coded values to match the new coding formulas. If you want to keep the coded values the same, but change the levels they represent, use codings(object) <- ... or dupe.
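The distinction between the two kinds of change can be sketched numerically. The following base-R illustration does the arithmetic by hand, using the hypothetical helpers code() and decode() and made-up values; it does not call recode.data or codings<- and only shows what each kind of change means.

```r
# Illustration by hand; no rsm functions are called.
# code() and decode() are hypothetical helpers for a linear coding.
code   <- function(v, center, halfwd) (v - center) / halfwd
decode <- function(x, center, halfwd) center + halfwd * x

Time  <- c(80, 85, 90)
x_old <- code(Time, 85, 5)     # coded under x1 ~ (Time - 85)/5

# recode.data-style change to x1 ~ (Time - 80)/10:
# the real values are kept and the coded values are recomputed
x_recoded <- code(Time, 80, 10)

# codings<- style change to x1 ~ (Time - 80)/10:
# the coded values are kept and the levels they represent change
Time_relabelled <- decode(x_old, 80, 10)

print(x_recoded)
print(Time_relabelled)
```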
code2val converts coded values to the original scale using the codings provided, and returns an object of the same class as X. val2code converts in the other direction. When using these functions, it is essential that the names (or column names in the case of matrices) match those of the corresponding coded or uncoded variables.
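A round trip makes the naming requirement concrete. This sketch assumes the rsm package and its ChemReact dataset are available, and reuses the coding formulas from the Examples section.

```r
library(rsm)

CR  <- coded.data(ChemReact, x1 ~ (Time - 85)/5, x2 ~ (Temp - 175)/5)
cod <- codings(CR)

# For code2val, the names must match the coded variables (x1, x2)
actual <- code2val(c(x1 = 0.5, x2 = -1), codings = cod)

# For val2code, the names must match the uncoded variables (Time, Temp),
# which is what code2val returned
coded <- val2code(actual, codings = cod)

print(actual)
print(coded)
```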
codings is a generic function for accessing codings. It returns the list of coding formulas from a coded.data object. One may use an expression like codings(object) <- list(...) to change the codings (without changing the coded values themselves). See also codings.rsm.
is.coded.data(x) returns TRUE if x inherits from coded.data, and FALSE otherwise.
The extraction function x[...] and the naming functions names<-, truenames, and truenames<- are provided to preserve the integrity of codings. For example, if x[, 1:3] excludes any coded columns, their coding formulas are also excluded. If all coded columns are excluded, the return value is unclassed from coded.data. When variable names are changed using names(x) <- ..., the coding formulas are updated accordingly. The truenames function returns the names of the variables in the decoded dataset. We can change the decoded names using truenames(x) <- ..., and the coding formulas are updated. Note that truenames and truenames<- work the same as names and names<- for uncoded variables in the object.
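The subsetting behavior described above can be checked with a short sketch. It assumes the rsm package and its ChemReact dataset are available; the object names are chosen for illustration.

```r
library(rsm)

CR <- coded.data(ChemReact, x1 ~ (Time - 85)/5, x2 ~ (Temp - 175)/5)

# Keep one coded column: its coding formula should be kept,
# and the formula for the dropped column x2 should be removed
sub <- CR[, c("x1", "Yield")]
print(names(codings(sub)))
print(truenames(sub))

# Drop every coded column: the result should no longer be coded.data
plain <- CR[, c("Block", "Yield")]
print(is.coded.data(plain))
```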
Another convenient way to copy and change the coding formulas for a coded dataset (and optionally re-randomize it) is to use the dupe function with a coding argument.
When a design is created in another package, some of the variables may be factors, in which case they are converted using as.numeric (values of 1, 2, ...). These levels may be regarded as yet another coding of the variables, and so it may take two steps to get the design into the desired form: one to convert the supplied levels to the desired range (often -1 to 1), and the other to replace the coding formulas to correspond to the real values of the variables to be used. See the examples.
Value
coded.data, as.coded.data, and recode.data return an object of class coded.data, which inherits from data.frame. A coded.data object is stored in coded form, and its names attribute contains the coded names, where they apply. Thus, when fitting models in rsm or lm with coded data as the data argument, the model formula should be given in terms of the coded variables.
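For instance, a first-order fit to the coded ChemReact data would refer to x1 and x2 rather than Time and Temp. This is a minimal sketch, assuming the rsm package and its ChemReact dataset are available; the model choice is for illustration only.

```r
library(rsm)

CR <- coded.data(ChemReact, x1 ~ (Time - 85)/5, x2 ~ (Temp - 175)/5)

# Model formulas use the coded names x1 and x2, not Time and Temp
fit_lm  <- lm(Yield ~ x1 + x2, data = CR)
fit_rsm <- rsm(Yield ~ FO(x1, x2), data = CR)

print(coef(fit_lm))
```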
Note
Starting with rsm version 2.00, the coded.data class involves additional attributes to serve broader needs in design generation. Because of this, old coded.data objects may need to be updated using as.coded.data if they are to be used with the newer functions such as djoin.
See also
data.frame, djoin, dupe, rsm
References
Lenth RV (2009). "Response-Surface Methods in R, Using rsm", Journal of Statistical Software, 32(7), 1-17. doi:10.18637/jss.v032.i07
Examples
library(rsm)
### Existing dataset with variables on actual scale
CR <- coded.data (ChemReact, x1 ~ (Time - 85)/5, x2 ~ (Temp - 175)/5)
CR # same as print(CR, decode = TRUE)
#> Time Temp Block Yield
#> 1 80.00 170.00 B1 80.5
#> 2 80.00 180.00 B1 81.5
#> 3 90.00 170.00 B1 82.0
#> 4 90.00 180.00 B1 83.5
#> 5 85.00 175.00 B1 83.9
#> 6 85.00 175.00 B1 84.3
#> 7 85.00 175.00 B1 84.0
#> 8 85.00 175.00 B2 79.7
#> 9 85.00 175.00 B2 79.8
#> 10 85.00 175.00 B2 79.5
#> 11 92.07 175.00 B2 78.4
#> 12 77.93 175.00 B2 75.6
#> 13 85.00 182.07 B2 78.5
#> 14 85.00 167.93 B2 77.0
#>
#> Data are stored in coded form using these coding formulas ...
#> x1 ~ (Time - 85)/5
#> x2 ~ (Temp - 175)/5
print(CR, decode = FALSE) # similar to as.data.frame(CR)
#> x1 x2 Block Yield
#> 1 -1.000 -1.000 B1 80.5
#> 2 -1.000 1.000 B1 81.5
#> 3 1.000 -1.000 B1 82.0
#> 4 1.000 1.000 B1 83.5
#> 5 0.000 0.000 B1 83.9
#> 6 0.000 0.000 B1 84.3
#> 7 0.000 0.000 B1 84.0
#> 8 0.000 0.000 B2 79.7
#> 9 0.000 0.000 B2 79.8
#> 10 0.000 0.000 B2 79.5
#> 11 1.414 0.000 B2 78.4
#> 12 -1.414 0.000 B2 75.6
#> 13 0.000 1.414 B2 78.5
#> 14 0.000 -1.414 B2 77.0
#>
#> Variable codings ...
#> x1 ~ (Time - 85)/5
#> x2 ~ (Temp - 175)/5
code2val (c(x1=.5, x2=-1), codings = codings(CR))
#> Time Temp
#> 87.5 170.0
### Existing dataset, already in coded form
CO <- as.coded.data(codata, x1 ~ (Ethanol - 0.2)/0.1, x2 ~ A.F.ratio - 15)
truenames(CO)
#> [1] "Ethanol" "A.F.ratio" "y"
names(CO)
#> [1] "x1" "x2" "y"
# revert x2 to an uncoded variable
codings(CO)[2] <- NULL
truenames(CO)
#> [1] "Ethanol" "x2" "y"
### Import a design that is coded in a different way
if (require(conf.design)) { # ----- This example requires conf.design -----
  # First, generate a 3^3 in blocks and import it via coded.data
  des3 <- coded.data(conf.design(p=3, G=c(1,1,2)))
  # NOTE: This returns a warning message but does the right thing --
  # It generates these names and coding formulas automatically:
  #   x1 ~ (T1 - 2)/1
  #   x2 ~ (T2 - 2)/1
  #   x3 ~ (T3 - 2)/1
  # Now randomize and change the codings and variable names for the real situation:
  mydes <- dupe(des3, coding = c(x1 ~ (Dose - 20)/5, x2 ~ (Conc - 40)/10,
                                 x3 ~ (Time - 60)/15))
} # ----- end of example requiring package conf.design -----
#> Loading required package: conf.design
#> Warning: Automatic codings created -- may not be what you want