stabiliseVariance: Function to reduce variance of LDRs as a function of...

View source: R/stabiliseVariance.R

stabiliseVarianceR Documentation

Function to reduce variance of LDRs as a function of coverage.

Description

This function computes the log drop-off rate ratios (LDRs) for control-control and treatment-control comparisons and applies a transformation to them that reduces their variance as a function of coverage.

Usage

    stabiliseVariance(se, nuclSelection, Nc, Nt)

Arguments

se

A SummarizedExperiment object storing structure probing data and the associated genomic sequence. The documentation for the example data set provided with the package se outlines how the object should be defined. stabiliseVariance uses the assays "coverage" and "dropoff_rate".

nuclSelection

A list returned by selectNuclPos, containing the positions of nucleotides selected for all control-control and treatment-control comparisons.

Nc

Number of control experimental replicates. Must be at least 2.

Nt

Number of treatment experimental replicates. Must be at least 2.

Details

The variance is reduced by sorting all LDRs in the null distribution by the average coverage and splitting the data in bins. Each bin spans a coverage of 100; or, if maximum coverage is not larger by 100 than the minimum coverage, the range is set to their difference divided by 10.

For each bin, the 95th quantile of LDRs with subtracted mean and the average coverage are computed. Then non-linear least squares are used to fit the following model (with parameters k, b): f = 1/sqrt(n) * k + b, for f - quantiles, n - mean coverage in the bin.

All LDRs are then rescaled by this model according to their corresponding average coverage in the pair of replicates.

Value

LDR_C

A matrix of transformed LDRs for control-control comparisons. The matrix rows correspond to nucleotide positions and columns to a control-control comparison. Only those positions selected for a pair-wise comparison will be assigned a value; the rest will be left as an NA.

LDR_CT

A matrix of transformed LDRs for treatment-control comparisons. The matrix rows correspond to nucleotide positions and columns to a treatment-control comparison. Only those positions selected for a pair-wise comparison will be assigned a value; the rest will be left as an NA.

Error

The following errors are returned if:

"Number of control and treatment replicates must be at least 2." the number of control or treatment experimental replicates is less than 2;

"All lists of positions selected for pair-wise comparisons should be non-empty." a list of nucleotides for control-control or treatment-control comparisons is empty;

"The coverage and drop-off count matrices should not have NA entries." the coverage and drop-off count matrices have NA entries;

"Unable to fit the model for correcting the coverage bias." The function nls could not execute successfully. The 95th quantiles of the LDR distribution in each bin could be equal to 0 or not have enough elements. This would happen if not enough nucleotides end up in a bin; e.g. one nucleotide per bin.

Author(s)

Alina Selega, Sander Granneman, Guido Sanguinetti

References

Selega et al. "Robust statistical modeling improves sensitivity of high-throughput RNA structure probing experiments", Nature Methods (2016).

See Also

See Also selectNuclPos.

Examples

    library(SummarizedExperiment)
    Nc <- 3
    Nt <- 3
    t <- 1
    nuclSelection <- selectNuclPos(se, Nc, Nt, t)
    assay(se, "dropoff_rate") <- scaleDOR(se, nuclSelection, Nc, Nt)
    varStab <- stabiliseVariance(se, nuclSelection, Nc, Nt)

alinaselega/BUMHMM documentation built on March 2, 2024, 10 p.m.