isolde_test: Statistical analysis of allele-specific read (ASR) counts


View source: R/isolde_test.R

Description

The main function of the ISoLDE package. It performs a statistical test to identify genes with allelic bias and produces both graphical and textual outputs.

Usage

  isolde_test(bias, method = "default", asr_counts, target,
              nboot = 5000, pcore = 75, graph = TRUE, ext = "pdf",
              text = TRUE, split_files = FALSE,
              prefix = "ISoLDE_result", outdir = "")

Arguments

bias

The kind of bias you want to study. It must be one of “parental” or “strain”.

method

specifies the statistical method to use for testing. It must be one of “default” or “threshold”. The default behaviour adapts to the number of replicates: when at least three biological replicates are available for each reciprocal cross, the bootstrap resampling method is used; otherwise the threshold method is applied. It is possible to force isolde_test to use the threshold method even when three or more replicates are available, by setting method to “threshold”. It is *not possible* to force the bootstrap method with fewer than three replicates.

asr_counts

the data.frame containing the ASR counts to be tested. These data should be normalized and filtered (see the filterT function), although the function can run with non-normalized and non-filtered data (not recommended).

target

the target data.frame (obtained by the readTarget function).

nboot

specifies how many resampling steps to perform for the bootstrap method (default 5000). This option is ignored if method is set to “threshold”. Low values of nboot lead to less reliable results.

pcore

a value between 0 and 100 (default to 75) which specifies the proportion of cores (in percent) to be used for the bootstrap method.
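
As an illustration of what a percentage of cores means in practice, the sketch below converts a pcore value into a number of workers. The helper name cores_from_percent and its rounding rule (round down, never below one core) are assumptions made for this example; the source does not state how ISoLDE rounds internally.

```r
# Hypothetical helper, not part of ISoLDE: turn a percentage of cores
# into a worker count. Rounding down with a minimum of 1 is an assumption.
cores_from_percent <- function(pcore, total_cores = parallel::detectCores()) {
  stopifnot(is.numeric(pcore), length(pcore) == 1, pcore >= 0, pcore <= 100)
  # detectCores() may return NA on some platforms; fall back to one core
  if (is.na(total_cores)) total_cores <- 1L
  max(1L, as.integer(floor(total_cores * pcore / 100)))
}

# On a machine with 8 cores, the default pcore = 75 would give 6 workers
cores_from_percent(75, total_cores = 8)
```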

graph

if TRUE (default) graphical outputs are produced (both on device and file).

ext

specifies the extension of the graphical output file (ignored if graph = FALSE). It must be one of “pdf” (default), “png” or “eps”.

text

if TRUE (default) textual output files are produced.

split_files

if text = TRUE, specifies whether you want all genes in a single output file (FALSE, default) or in four separate files according to the result: ASE, biallelic, undetermined or filtered (TRUE).

prefix

specifies the prefix for all output file names (default to "ISoLDE_result").

outdir

specifies the path where to write the output file(s) (default to current directory).

Details

Before using this function, your data should be normalized and filtered (see the filterT function for filtering) although the function can run with non-normalized and/or non-filtered data.

The method depends on the minimum number of replicates available for each reciprocal cross.

If only one replicate is found, the test cannot be performed and the function exits.

method=“default” : If every cross has more than two replicates, there is enough information to use bootstrap resampling to identify genes with allelic bias.

If only two replicates are found in at least one cross, there is too little information to obtain reliable distributions from resampling. Genes with allelic bias are then identified using empirically defined thresholds.

method=“threshold” : The empirical threshold method is used instead of the bootstrap method, even if more than two replicates per cross are found.
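
The decision rule described above can be sketched as a small function. The name choose_method and its input (a vector of replicate counts, one per reciprocal cross) are hypothetical and only mirror the documented behaviour; this is not the package's actual code.

```r
# Hypothetical helper, not part of ISoLDE: mirrors the documented rule for
# picking the statistical method from the replicate counts per cross.
choose_method <- function(reps_per_cross, method = c("default", "threshold")) {
  method <- match.arg(method)
  min_reps <- min(reps_per_cross)
  # With a single replicate in any cross, the test cannot be performed
  if (min_reps < 2) {
    stop("at least two replicates per reciprocal cross are required")
  }
  # The threshold method is used when forced, or when replicates are too few
  if (method == "threshold" || min_reps < 3) {
    return("threshold")
  }
  "bootstrap"
}

choose_method(c(cross1 = 3, cross2 = 4))                        # bootstrap
choose_method(c(cross1 = 3, cross2 = 2))                        # threshold
choose_method(c(cross1 = 5, cross2 = 5), method = "threshold")  # threshold
```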

Note that in differential RNA-seq analysis, at least three replicates are strongly recommended, as variability estimation quality is a key factor in statistical analysis.

For more details, see Reynès, C. et al. (2016) ISoLDE: a new method for identification of allelic imbalance. Submitted

Value

listASE

a data.frame with one row per gene (or transcript) identified as having an allelic bias and five columns:
- “names” contains the gene (or transcript) names, as given in the asr_counts row names,
- “criterion” contains the criterion value (see vignette or Reynès et al. (2016)),
- “diff_prop” contains the criterion numerator, i.e. the difference between proportions of either parental or strain origins,
- “variability” contains the criterion denominator, which quantifies the gene (or transcript) variability between replicates,
- “origin” specifies the bias direction: either "P" or "M" for parental bias, or one of the specified strain names for strain bias.

listBA

a data.frame with one row per gene (or transcript) identified as biallelically expressed and four columns corresponding to the first four ones in listASE.

listUN

a data.frame with one row per gene (or transcript) with undetermined status and six columns. The first five columns are the same as listASE, the last one may take three values:
- “FLAG_consistency” for genes with no statistical evidence of either bias or biallelic expression but whose parental or strain bias is always in the same direction across replicates,
- “FLAG_significance” for genes with statistical evidence of bias but with discrepancies in bias direction across replicates,
- “NO_FLAG” for other undetermined genes.

listFILT

a data.frame containing the names of genes that failed the minimal filtering step and were therefore not considered during the statistical test.

ASE, BA and UN lists are sorted according to their criterion value.
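
To show how the returned lists might be used, the sketch below works on a mock result that follows the structure described above. All gene names and values are made up, and the name of the sixth listUN column ("flag") is an assumption, so that column is accessed by position. The ratio check takes the numerator and denominator description at face value.

```r
# Mock object imitating the documented return value; every value is invented.
mock_res <- list(
  listASE = data.frame(
    names = c("geneA", "geneB"),
    criterion = c(4.0, 2.5),
    diff_prop = c(0.8, 0.5),
    variability = c(0.2, 0.2),
    origin = c("P", "M"),
    stringsAsFactors = FALSE
  ),
  listUN = data.frame(
    names = c("geneC", "geneD", "geneE"),
    criterion = c(1.2, 0.9, 0.3),
    diff_prop = c(0.30, 0.18, 0.09),
    variability = c(0.25, 0.20, 0.30),
    origin = c("P", "M", "P"),
    flag = c("FLAG_consistency", "NO_FLAG", "FLAG_significance"),  # assumed name
    stringsAsFactors = FALSE
  )
)

# Names of genes with a paternal bias
paternal_genes <- mock_res$listASE$names[mock_res$listASE$origin == "P"]

# Undetermined genes carrying a flag (sixth column, accessed by position)
flagged <- mock_res$listUN[mock_res$listUN[[6]] != "NO_FLAG", ]

# In this mock, criterion equals diff_prop / variability
ratio_ok <- isTRUE(all.equal(
  mock_res$listASE$criterion,
  mock_res$listASE$diff_prop / mock_res$listASE$variability
))
```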

Note

The bootstrap resampling step is performed many times (5000 by default). Hence, the function may take a long time to run when using the bootstrap method (up to several minutes).

Note

A minimal filtering step will always be performed while applying the isolde_test function. It consists of eliminating all genes not satisfying these two conditions:
- At least one of the two medians (of paternal or maternal ASR counts) is different from 0;
- There is at least one ASR count (different from 0) in each cross.
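
A minimal sketch of these two conditions is given below. It assumes a data layout that is not specified in the source: two gene-by-sample count matrices (pat and mat) and a vector cross giving the reciprocal cross of each sample. The function name passes_minimal_filter is hypothetical, and the medians are taken per gene over all samples, which is one possible reading of the first condition.

```r
# Hypothetical re-implementation of the two documented conditions.
# pat, mat: gene x sample matrices of paternal and maternal ASR counts
# cross: character vector, one cross label per sample (column)
passes_minimal_filter <- function(pat, mat, cross) {
  stopifnot(identical(dim(pat), dim(mat)), ncol(pat) == length(cross))
  # Condition 1: at least one of the two per-gene medians is non-zero
  med_ok <- apply(pat, 1, median) != 0 | apply(mat, 1, median) != 0
  # Condition 2: each cross has at least one non-zero count (either allele)
  nonzero <- pat != 0 | mat != 0
  cross_ok <- vapply(
    split(seq_along(cross), cross),
    function(idx) rowSums(nonzero[, idx, drop = FALSE]) > 0,
    logical(nrow(nonzero))
  )
  # Force a genes x crosses matrix, including the single-gene case
  cross_ok <- matrix(cross_ok, nrow = nrow(nonzero))
  med_ok & apply(cross_ok, 1, all)
}

# Toy data: three genes, two samples per cross
pat <- rbind(g1 = c(5, 6, 4, 7), g2 = c(0, 0, 0, 9), g3 = c(3, 2, 0, 0))
mat <- rbind(g1 = c(2, 1, 3, 2), g2 = c(0, 0, 0, 1), g3 = c(1, 4, 0, 0))
cross <- c("AxB", "AxB", "BxA", "BxA")

keep <- passes_minimal_filter(pat, mat, cross)
```

In this toy data, g2 fails the first condition (both medians are 0) and g3 fails the second (no counts in the BxA cross), so only g1 is kept.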

Author(s)

Christelle Reynès christelle.reynes@igf.cnrs.fr,
Marine Rohmer marine.rohmer@mgx.cnrs.fr

References

Reynès, C. et al. (2016): ISoLDE: a new method for identification of allelic imbalance. Submitted

Examples

  # Loading all required data.frames
  data(filteredASRcounts)
  data(target)
  # Statistical analysis (forcing the threshold option)
  isolde_res <- isolde_test(bias = "parental", method = "threshold",
                            asr_counts = filteredASRcounts, target = target,
                            ext = "pdf", prefix = "ISoLDE_test")

ChristelleReynes/ISoLDE documentation built on Dec. 31, 2020, 10:59 a.m.