assay-functions: Create simplified representation of ragged assay data.

assay-functionsR Documentation

Create simplified representation of ragged assay data.

Description

These methods transform assay() from the default (i.e., sparseAssay()) representation to various forms of more dense representation. compactAssay() collapses identical ranges across samples into a single row. disjoinAssay() creates disjoint (non-overlapping) regions, simplifies values within each sample in a user-specified manner, and returns a matrix of disjoint regions x samples.

This method transforms assay() from the default (i.e., sparseAssay()) representation to a reduced representation summarizing each original range overlapping ranges in query. Reduction in each cell can be tailored to indivdual needs using the simplifyReduce functional argument.

Usage

sparseAssay(
  x,
  i = 1,
  withDimnames = TRUE,
  background = NA_integer_,
  sparse = FALSE
)

compactAssay(
  x,
  i = 1,
  withDimnames = TRUE,
  background = NA_integer_,
  sparse = FALSE
)

disjoinAssay(
  x,
  simplifyDisjoin,
  i = 1,
  withDimnames = TRUE,
  background = NA_integer_
)

qreduceAssay(
  x,
  query,
  simplifyReduce,
  i = 1,
  withDimnames = TRUE,
  background = NA_integer_
)

Arguments

x

A RaggedExperiment object

i

integer(1) or character(1) name of assay to be transformed.

withDimnames

logical(1) include dimnames on the returned matrix. When there are no explict rownames, these are manufactured with as.character(rowRanges(x)); rownames are always manufactured for compactAssay() and disjoinAssay().

background

A value (default NA) for the returned matrix after *Assay operations

sparse

logical(1) whether to return a sparseMatrix representation

simplifyDisjoin

A function / functional operating on a *List, where the elements of the list are all within-sample assay values from ranges overlapping each disjoint range. For instance, to use the simplifyDisjoin=mean of overlapping ranges, where ranges are characterized by integer-valued scores, the entries are calculated as

                    a
    original: |-----------|
                        b
                   |----------|

                a    a, b   b
    disjoint: |----|------|---|

    values <- IntegerList(a, c(a, b), b)
    simplifyDisjoin(values)
    
query

GRanges providing regions over which reduction is to occur.

simplifyReduce

A function / functional accepting arguments score, range, and qrange:

  • score A *List, where each list element corresponds to a cell in the matrix to be returned by qreduceAssay. Vector elements correspond to ranges overlapping query. The *List objects support many vectorized mathematical operations, so simplifyReduce can be implemented efficiently.

  • range A GRangesList instance, 'parallel' to score. Each element of the list corresponds to a cell in the matrix to be returned by qreduceAssay. Each range in the element corresponds to the range for which the score element applies.

  • qrange A GRanges instance with the same length as unlist(score), providing the query range window to which the corresponding scores apply.

Value

sparseAssay(): A matrix() with dimensions dim(x). Elements contain the assay value for the ith range and jth sample. Use 'sparse=TRUE' to obtain a sparseMatrix assay representation.

compactAssay(): Samples with identical range are placed in the same row. Non-disjoint ranges are NOT collapsed. Use 'sparse=TRUE' to obtain a sparseMatrix assay representation.

disjoinAssay(): A matrix with number of rows equal to number of disjoint ranges across all samples. Elements of the matrix are summarized by applying simplifyDisjoin() to assay values of overlapping ranges

qreduceAssay(): A matrix() with dimensions length(query) x ncol(x). Elements contain assay values for the ith query range and jth sample, summarized according to the function simplifyReduce.

Examples

re4 <- RaggedExperiment(GRangesList(
    GRanges(c(A = "chr1:1-10:-", B = "chr1:8-14:-", C = "chr2:15-18:+"),
        score = 3:5),
    GRanges(c(D = "chr1:1-10:-", E = "chr2:11-18:+"), score = 1:2)
), colData = DataFrame(id = 1:2))

query <- GRanges(c("chr1:1-14:-", "chr2:11-18:+"))

weightedmean <- function(scores, ranges, qranges)
{
    ## weighted average score per query range
    ## the weight corresponds to the size of the overlap of each
    ## overlapping subject range with the corresponding query range
    isects <- pintersect(ranges, qranges)
    sum(scores * width(isects)) / sum(width(isects))
}

qreduceAssay(re4, query, weightedmean)

## Not run: 
    ## Extended example: non-silent mutations, summarized by genic
    ## region
    suppressPackageStartupMessages({
        library(TxDb.Hsapiens.UCSC.hg19.knownGene)
        library(org.Hs.eg.db)
        library(GenomeInfoDb)
        library(MultiAssayExperiment)
        library(curatedTCGAData)
        library(TCGAutils)
    })

    ## TCGA MultiAssayExperiment with RaggedExperiment data
    mae <- curatedTCGAData("ACC", c("RNASeq2GeneNorm", "CNASNP", "Mutation"),
        version = "1.1.38", dry.run = FALSE)

    ## genomic coordinates
    gn <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene)
    gn <- keepStandardChromosomes(granges(gn), pruning.mode="coarse")
    seqlevelsStyle(gn) <- "NCBI"
    genome(gn)
    gn <- unstrand(gn)

    ## reduce mutations, marking any genomic range with non-silent
    ## mutation as FALSE
    nonsilent <- function(scores, ranges, qranges)
        any(scores != "Silent")
    mre <- mae[["ACC_Mutation-20160128"]]
    seqlevelsStyle(rowRanges(mre)) <- "NCBI"
    ## hack to make genomes match
    genome(mre) <- paste0(correctBuild(unique(genome(mre)), "NCBI"), ".p13")
    mutations <- qreduceAssay(mre, gn, nonsilent, "Variant_Classification")
    genome(mre) <- correctBuild(unique(genome(mre)), "NCBI")

    ## reduce copy number
    re <- mae[["ACC_CNASNP-20160128"]]
    class(re)
    ## [1] "RaggedExperiment"
    seqlevelsStyle(re) <- "NCBI"
    genome(re) <- "GRCh37.p13"
    cn <- qreduceAssay(re, gn, weightedmean, "Segment_Mean")
    genome(re) <- "GRCh37"

    ## ALTERNATIVE
    ##
    ## TCGAutils helper function to convert RaggedExperiment objects to
    ## RangedSummarizedExperiment based on annotated gene ranges
    mae2 <- mae
    mae2[[1L]] <- re
    mae2[[2L]] <- mre
    qreduceTCGA(mae2)

## End(Not run)

Bioconductor/RaggedExperiment documentation built on Nov. 25, 2024, 2:01 a.m.