R/ScpModel-Utils.R

Defines functions .getPCs addReducedDims scpRemoveBatchEffect scpKeepEffect scpAnnotateResults

Documented in addReducedDims scpAnnotateResults scpKeepEffect scpRemoveBatchEffect

##' @title Annotate single-cell proteomics analysis output
##'
##' @description
##'
##' The function takes as input a list of `DFrame` and a table with
##' additional annotations. The annotation tables is automatically
##' merged into all tables of the list by matching the specified
##' columns (given by the arguments `by` and `by2`). This function is
##' useful to add annotation to analysis results generated by
##' `scpVarianceAnalysis()`, `scpDifferentialAnalysis()`, or
##' `scpComponentAnalysis()`. The annotation table is typically the
##' `colData` or `rowData` of the object used for modelling. In case
##' of shared column names between the input tables and the annotation
##' table, any annotation that is already present in the list of
##' tables will be overwritten by the new annotations.
##'
##' @param tableList A list of tables, typically the output of
##'     `scpVarianceAnalysis()`, `scpDifferentialAnalysis()`, or the
##'     the `bySample` or `byFeature` elements returned by
##'     `scpComponentAnalysis()`.
##'
##' @param annotations A table of class 'data.frame' or 'DFrame'
##'     containing the annotations to add. If no further arguments are
##'     provided, the table must have row names.
##'
##' @param by A `character(1)` providing the name of the column in the
##'     tables in `tableList` to use to match the rows of the
##'     annotation table.
##'
##' @param by2 A `character(1)` providing the name of the column in the
##'     annotation table to use to match the rows of the tables in
##'     `tableList`. If `NULL`, it will be defined by `by`. The column
##'     pointed by `by2` will be dropped in the output tables.
##'
##' @seealso
##'
##' - [ScpModel-VarianceAnalysis]
##' - [ScpModel-DifferentialAnalysis]
##' - [ScpModel-ComponentAnalysis]
##'
##' @author Christophe Vanderaa, Laurent Gatto
##'
##' @example inst/examples/examples_ScpModel-Annotation.R
##'
##' @export
scpAnnotateResults <- function(tableList,
                               annotations, by, by2 = NULL) {
    allowedClasses <- c("DFrame", "data.frame")
    if (!all(sapply(tableList, inherits, allowedClasses)) | !length(tableList)) {
        stop(
            "'tableList' must be a list of '",
            paste(allowedClasses, collapse = "' or '"), "'."
        )
    }
    if (!all(sapply(tableList, function(x) by %in% colnames(x)))) {
        stop("'", by, "' not found in the columns of 'tableList' elements.")
    }
    if (is.null(by2)) by2 <- by
    if (!by2 %in% colnames(annotations))
        stop("'", by2, "' not found in 'annotations'.")
    endoapply(tableList, function(x) {
        matchInd <- match(x[[by]], annotations[[by2]])
        selCols <- colnames(annotations) != by2
        annotations <- annotations[matchInd, selCols, drop = FALSE]
        x[, colnames(annotations)] <- annotations
        x
    })
}


##' @name ScpModel-DataCorrection
##'
##' @title Correct single-cell proteomics data
##'
##' @description
##'
##' The function uses the data modelling output to generate corrected
##' data that can be used for downstream analysis. The input
##' is expected to be a `SingleCellExperiment` object that contains an
##' estimated `ScpModel`. There are two approaches:
##'
##' - `scpKeepEffect()`: keep the effects of interests. The
##'   reconstructed data is the sum of the effect matrices for the
##'   variable of interest and the residuals. Note that the intercepts
##'   (baseline intensity of each feature) are not included by
##'   default, but they can be added when `intercept = TRUE`.
##' - `scpRemoveBatchEffect()`: remove any undesired effect. The batch
##'   corrected data is the input data minus the effect matrices that
##'   correspond to batch effect variables. Note that the intercepts
##'   (baseline intensity of each feature) are removed by default, but
##'   they can be kept when `intercept = FALSE`.
##'
##' Despite the two approaches are conceptually different, they can
##' lead to similar results if the effects that are used to
##' reconstruct the data are the ones that are not removed when
##' performing batch correction (see examples).
##'
##' The function returns a new `SingleCellExperiment` that contains an
##' assay with the batch corrected data. Note that the 'ScpModel` is
##' erased in this new object.
##'
##' @seealso
##'
##' - [ScpModel-class] for functions to extract information from the
##'   `ScpModel` object
##' - [ScpModel-Workflow] to run a model on SCP data required for
##'   batch correction.
##'
##' @author Christophe Vanderaa, Laurent Gatto
##'
##' @example inst/examples/examples_ScpModel-DataCorrection.R
##'
NULL

##' @name ScpModel-DataCorrection
##'
##' @param object An object that inherits from the
##'     `SingleCellExperiment` class. It must contain an estimated
##'     `ScpModel` in its metadata
##'
##' @param effects A `character()` vector. For `scpKeepEffect()`,
##'     which model variable should be used to reconstruct the data.
##'     For `scpRemoveBatchEffect()`, which model variable should be
##'     removed from the data. When `NULL` (default), both functions
##'     return the model residuals.
##'
##' @param intercept A `logical(1)`. For `scpKeepEffect()`,
##'     should the intercepts be included when reconstructing the
##'     data? Defaults to `FALSE`, hence the intercepts are not
##'     included. For `scpRemoveBatchEffect()`, should the intercepts
##'     be removed from the data? Defaults to `TRUE`, hence the
##'     intercepts are removed from the data.
##'
##' @param name A `character(1)` providing the name to use to retrieve
##'     the model results. When retrieving a model and `name` is
##'     missing, the name of the first model found in `object` is used.
##'
##' @importFrom SingleCellExperiment reducedDims<-
##' @importFrom SummarizedExperiment assays<-
##'
##' @export
scpKeepEffect <- function(object, effects = NULL,
                          intercept = FALSE, name) {
    new <- scpModelResiduals(object, name)
    if (!is.null(effects)) {
        stopifnot(effects %in% scpModelEffectNames(object, name))
        allEffects <- scpModelEffects(object, name)
        for (e in effects) {
            new <- new + allEffects[[e]]
        }
    }
    if (intercept) {
        new <- new + scpModelIntercept(object, name)
    }
    object <- object[scpModelFeatureNames(object, name), ]
    assays(object) <- List(new)
    m <- metadata(object)
    metadata(object) <- m[!names(m) %in% scpModelNames(object)]
    if (inherits(object, "SingleCellExperiment"))
        reducedDims(object) <- List()
    object
}

##' @name ScpModel-DataCorrection
##'
##' @export
scpRemoveBatchEffect <- function(object, effects = NULL,
                                 intercept = TRUE, name) {
    stopifnot(effects %in% scpModelEffectNames(object, name))
    if (!is.null(effects)) {
        allEffects <- scpModelEffectNames(object, name)
        effects <- allEffects[!allEffects %in% effects]
    }
    scpKeepEffect(object, effects, !intercept, name)
}

##' @title Add scplainer Component Analysis Results
##'
##' @description The function will add the component results computed
##'     by [scpComponentAnalysis()] to a `SingleCellExperiment`'s
##'     `reducedDims` slot, to all using the many `scater` functions,
##'     such as [scater::plotReducedDim()], [scater::plotTSNE()], ...
##'
##' @param sce An instance of class [SingleCellExperiment].
##'
##' @param x A `List` of `DataFrames` containing principal components.
##'     This list is typically the `bySample` element produced by
##'     [scpComponentAnalysis()].
##'
##' @return A `SingleCellExperiment` with updated `reducedDims`.
##'
##' @author Laurent Gatto and Christophe Vanderaa
##'
##' @export
##'
##' @examples
##'
##' library("scater")
##' data("leduc_minimal")
##' pcs <- scpComponentAnalysis(
##'    leduc_minimal, method = "ASCA",
##'    effects = "SampleType")$bySample
##'
##' reducedDims(leduc_minimal)
##' leduc_minimal <- addReducedDims(leduc_minimal, pcs)
##' reducedDims(leduc_minimal)
##' plotReducedDim(leduc_minimal, dimred = "ASCA_SampleType",
##'                colour_by = "SampleType")
##' leduc_minimal <- runTSNE(leduc_minimal, dimred = "ASCA_SampleType")
##' plotTSNE(leduc_minimal, colour_by = "SampleType")
addReducedDims <- function(sce, x) {
    if (!inherits(sce, "SingleCellExperiment"))
        stop(
            "'sce' must be a SingleCellExperiment object. Transform ",
            "your data using 'as(sce, \"SingleCellExperiment\")'."
        )
    pcList <- List(lapply(x, .getPCs))
    reducedDims(sce) <- c(reducedDims(sce), pcList)
    sce
}

## Internal function that extracts the principal components from a
## table as computed by scpComponentAnalysis(). The function returns a
## matrix with the PCs where the proportion of variance is stored as
## an attribute.
##
## Expected conditions that the table must meet for .getPCs():
## - At least one column starting with "PC"
## - The table must be a DataFrame
## - The table must a metadata slot names proportionVariance
## - All PC names in the table must be present in the elements of the
##   metadata "proportionVariance".
##
## Note that output from scpComponentAnalysis(...)$bySample is always
## valid.
##
## @param x A data.frame with principal components stored in columns
##     starting with "PC".
##
.getPCs <- function(x) {
    pcNames <- grep("^PC", colnames(x), value = TRUE)
    isValid <- length(pcNames) && inherits(x, "DataFrame") &&
        "proportionVariance" %in% names(metadata(x)) &&
        all(pcNames %in% names(metadata(x)$proportionVariance))
    if (!isValid)
        stop("Invalid table(s). Make sure you provided the 'bySample' ",
             "element returned by 'scpComponentAnalysis()'.")
    pcs <- as.matrix(x[, pcNames, drop = FALSE])
    attr(pcs, "proportionVariance") <-
        metadata(x)$proportionVariance[pcNames]
    pcs
}
UCLouvain-CBIO/scp documentation built on Oct. 12, 2024, 2:37 a.m.