initRandrot | R Documentation |
Initialization of a linear model for subsequent generation of randomly
rotated data (randrot
) associated with
the null hypothesis H_{0}: \beta_{coef.h} = 0
.
Basics of rotation tests are found in \insertCiteHettegger2021randRotation and
\insertCiteLangsrud2005randRotation.
initRandrot(Y = NULL, X = NULL, coef.h = NULL, weights = NULL, cormat = NULL)
initBatchRandrot(
Y = NULL,
X = NULL,
coef.h = NULL,
batch = NULL,
weights = NULL,
cormat = NULL
)
## S4 method for signature 'list'
initBatchRandrot(
Y = NULL,
X = Y$design,
coef.h = NULL,
batch = NULL,
weights = Y$weights,
cormat = NULL
)
## S4 method for signature 'list'
initRandrot(
Y = NULL,
X = Y$design,
coef.h = NULL,
weights = Y$weights,
cormat = NULL
)
Y |
a data matrix with |
X |
the design matrix of the experiment with |
coef.h |
single integer or vector of integers specifying the "hypothesis
coefficients" ( |
weights |
numerical matrix of finite positive weights > 0 (as in
weighted least squares regression. Dimensions must be equal to dimensions
of |
cormat |
the sample correlation matrix with |
batch |
Batch covariate of the same length as |
This function performs basic initial checks and preparatory calculations for random rotation data generation. Nomenclature of variables is mainly as in \insertCiteLangsrud2005randRotation and \insertCiteHettegger2021randRotation. See also package vignette for application examples.
Y
can also be a list with elements E
, design
and
weights
. Y$E
is thereby used as Y
, Y$design
is
used as X
and Y$weights
is used as weights
. By this, the
functions are compatible with results from e.g. voom
(limma
package), see Examples
.
coef.h
specifies the model coefficients associated with the null
hypothesis ("hypothesis coefficients"). All other model coefficients are
considered as "determined coefficients" coef.d
\insertCiteLangsrud2005randRotation. The design matrix is rearranged so
that coef.h
correspond to the last columns of the design matrix and
coef.d
correspond to the first columns of the design matrix. This
is necessary for adequate transformation of the combined null-hypothesis
H_{0}: \beta_{coef.h} = 0
by QR decomposition.
If X[,coef.d]
does not have full rank, a warning is generated and
coef.d
is set to coef.d <- seq_len(qr(X[,coef.d])$rank)
.
Weights must be finite positive numerics
greater zero. This is
necessary for model (QR) decomposition and for back transformation of the
rotated data into the original variance structure, see also
randrot
. Weights as estimated e.g. by
voom \insertCiteLaw2014randRotation are suitable and can be used without
further processing. Note that due to the whitening transformation (i.e. by
using the arguments weights
and/or cormat
) the rank of the
transformed (whitened) design matrix X
could change (become smaller),
which could become dangerous for the fitting procedures. If you get errors
using weights
and/or cormat
, try the routine without using
weights
and/or cormat
to exclude this source of errors.
The following section provides a brief summary how rotations are calculated.
A more general introduction is given in
\insertCiteLangsrud2005randRotation. For reasons of readability, we omit
writing %*%
for matrix multiplication and write *
for
transposed matrix. The rotation is done by multiplying the features x
samples
data matrix Y
with the transpose of the restricted random
rotation matrix Rt
Rt = Xd Xd* + [Xh Xe] R [Xh Xe]*
with R
being a (reduced) random rotation matrix and Xd
,
Xh
and Xe
being columns of the full QR decomposition of the
design matrix X
. [Xd Xh Xe] = qr.Q(qr(X), complete = TRUE)
,
where Xd
correspond to columns coef.d
, Xh
to columns
coef.h
and Xe
to the remaining columns.
If weights
and/or cormat
are specified, each feature
Y[i,]
and the design matrix X
are whitening transformed before
rotation. The whitening matrix T
is defined as T = solve(C) w
,
where solve(C)
is the inverse Cholesky decompostion of the correlation
matrix (cormat = CC*
) and w
is a diagonal matrix of the square
roots of the sample weights for the according feature (w =
diag(sqrt(weights[i,]
))).
The rotated data for one feature y.r[i,]
is thus calculated as
y.r[i,] = ( solve(T) Rt T (y[i,])* )*
and [Xd Xh Xe] =
qr.Q(qr(TX), complete = TRUE)
For weights = NULL
and cormat = NULL
, T
is the identity
matrix.
Note that a separate QR decomposition is calculated for each feature if
weights
are specified. The restricted random orthogonal matrix
Rt
is calculated with the same reduced random orthogonal matrix R
for all features.
When using initBatchRandrot
, initRandrot
is called
for each batch separately. When using initBatchRandrot
with
cormat
, cormat
needs to be a list of correlation matrices
with one matrix for each batch. Note that this implicitly assumes a block
design of the sample correlation matrix, where sample correlation
coefficients between batches are zero. For a more general sample
correlation matrix, allowing non-zero sample correlation coefficients
between batches, see package vignette. Batches are split according to
split(seq_along(batch), batch)
.
An initialised
initRandrot
,
initRandrotW
or
initBatchRandrot
object.
Peter Hettegger
randrot
,
rotateStat
# For further examples see '?rotateStat' and package vignette.
# Example 1: Compatibility with limma::voom
## Not run:
v <- voom(counts, design)
ir <- initRandrot(v)
## End(Not run)
# Example 2:
#set.seed(0)
# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100
# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))
mod1 <- model.matrix(~phenotype, pdata)
# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1
# See '?rotateStat'
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.