R/aesPC_unknown_matrixNorm.R
In pathwayPCA: Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

Documented in normalize

#' Normalize and reconstruct the eigenvalues of a data matrix for supervised PCA
#'
#' @description Normalize the columns of a project matrix. For each eigenvector,
#'    swap the signs of the vector elements if the first entry is negative. See
#'    "Details" for more information.
#'
#' @param B A projection matrix: often the matrix of the left singular vectors
#'    given by the Singular Value Decomposition of a data matrix or Grammian.
#' @param d The number of columns of \code{B} to normalize.
#'
#' @return A matrix of the eigenvectors or left singular vectors in \code{B}
#'    transformed to be the left singular values of the original data matrix.
#'
#' @details This function is designed to reconstruct the original first \code{d}
#'    left singular vectors of a data matrix from the first \code{d}
#'    eigenvectors of the Grammian of that data matrix. Basically, after the
#'    data matrix has been centred, the left singular vectors of that data
#'    matrix and the left singular vectors of the Grammian of that data matrix
#'    are equal up to a sign. This function reverses that sign so that the two
#'    sets of singular vectors are equal.
#'
#'    Consider the internal workings of the \code{\link{aespca}} function. This
#'    "sign flipping" changes the eigenvectors of \code{xtx} into the
#'    left singular vectors of \code{scale(X, , center = TRUE, scale = TRUE)}.
#'    Instead of calculating the Grammian, regularising it (by adding some small
#'    \eqn{\lambda} value to the diagonal), taking the SVD of the regularized
#'    Grammian, and extracting the first \eqn{d} eigenvectors, why don't we just
#'    extract the first \eqn{d} singular vectors directly from the scaled data
#'    matrix itself? The regularisation effect only inflates the singular- or
#'    eigen-values anyway, so it has no effect on the singular vectors in any
#'    way. Moreover, the \code{\link{aespca}} function does not even call for
#'    the eigen-values at all, so this whole process is supurfluous. The only
#'    wrinkle is adapting the \code{\link{lars.lsa}} and \code{\link{aespca}}
#'    functions to only operate on the data matrix.
#'
#'    Furthermore, the \code{\link[lars]{lars}} function \emph{can} take in the
#'    full data, instead of just a Grammian. As an enhancement, we should either
#'    update our copy of the lars function in \code{\link{lars.lsa}}, or make a
#'    call to the exported \code{\link[lars]{lars}} function. ENHANCEMENT.
#'
#' @keywords internal
#'
#' @seealso \code{\link{aespca}}; \code{\link{lars.lsa}};
#'    \code{\link{AESPCA_pVals}}
#'
#' @examples
#'   # DO NOT CALL THIS FUNCTION DIRECTLY.
#'   # Use AESPCA_pVals() instead
normalize <- function(B, d){
  # browser()

  # Square root of the column sums of B ^ 2 (the entries squared).
  normB <- sqrt(apply(B ^ 2, 2, sum))
  # Replace any 0 values with 1. Once again, why? The domain is [0, infty), so
  #   shouldn't we replace 0s with the minimum of the non-zero values? Why 1?
  normB[normB == 0] <- 1
  # Divide each column of B by its "norm" value
  B <- t(t(B) / normB)
  # Replace any really small "normed" B values with 0
  B[abs(B) < 1e-6] <- 0

  # This loop moves over columns: it swaps the signs of each value in a column
  #   if the first value in that column is < 0.
  for (i in seq_len(d)){
    # browser()

    # Check for non-"zero" values in the ith column
    tmp <- abs(B[, i]) > 1e-6
    # If we have any non-"zero" entries in the ith column, then
    if(sum(tmp) != 0){

      # we find the indices of these non-small values and
      tmp1 <- B[tmp, i]
      # reverse their sign if the sign of the first non-small entry in B[,i] is
      #   negative.
      B[, i] <- sign(tmp1[1]) * B[, i]

    }

    # END for
  }

  B
}

Any scripts or data that you put into this service are public.

pathwayPCA documentation built on Dec. 15, 2020, 6:14 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

pathwayPCA
Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

R/aesPC_unknown_matrixNorm.R
In pathwayPCA: Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

Defines functions normalize

Documented in normalize

Try the pathwayPCA package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

pathwayPCA Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

R/aesPC_unknown_matrixNorm.R In pathwayPCA: Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

Defines functions normalize

Documented in normalize

Try the pathwayPCA package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

pathwayPCA
Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

R/aesPC_unknown_matrixNorm.R
In pathwayPCA: Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection