View source: R/anota2seqResidOutlierTest.R
One assumption of analysis of partial variance (APV) is that the residuals from the regressions are normally distributed. anota2seq assesses this by comparing Q-Q plots of the residuals to envelopes derived by sampling from the normal distribution.
anota2seqResidOutlierTest(Anota2seqDataSet, confInt = 0.01,
    iter = 5, generateSingleGenePlots = FALSE, nGraphs = 200,
    generateSummaryPlot = TRUE, residFitPlot = TRUE, useProgBar = TRUE)
Anota2seqDataSet
    An object of class Anota2seqDataSet that also contains the output of the anota2seqPerformQC function.
confInt
    Controls how many samples from the normal distribution are used to generate the envelope to which the residuals are compared. Default is 0.01, which generates 99 samples from the normal distribution to compare to the actual residuals.
iter
    How many times the analysis is performed. Default is 5, meaning that 5 sets of samples (each with the size controlled by confInt) are generated. Note that the summary plot is only produced for the last set, but the percentage of outliers for each iteration can be found in the output object.
generateSingleGenePlots
    The analysis is performed per identifier, and a plot can be generated for each identifier. Because the number of identifiers is usually high, this typically produces a large number of plots. TRUE/FALSE with default FALSE.
nGraphs
    If generateSingleGenePlots is set to TRUE, nGraphs controls for how many identifiers such single gene graphs are generated. Default is 200. NOTE: plots are generated for the first "n" genes, in the same order as the input data.
generateSummaryPlot
    The function can generate a summary graph that shows the envelopes generated by sampling from the normal distribution compared to the obtained values for all genes. Default is TRUE, so the graph is generated, but only from the last iteration.
residFitPlot
    Generates a plot of the fitted values and residuals. Default is TRUE (the plot is generated).
useProgBar
    Should the progress bar be shown? Default is TRUE (the progress bar is shown).
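The relationship between confInt and the number of normal samples can be illustrated with a little arithmetic. The sketch below assumes the rule 1/confInt - 1, which is inferred only from the documented default (0.01 gives 99 samples) and is not confirmed as the package's internal formula; envelopeSamples is a hypothetical helper, not an anota2seq function.

```r
# Hypothetical helper, not part of anota2seq: number of normal samples implied
# by confInt, ASSUMING the rule 1/confInt - 1 (consistent with the documented
# default, where confInt = 0.01 gives 99 samples).
envelopeSamples <- function(confInt) {
    stopifnot(is.numeric(confInt), confInt > 0, confInt < 1)
    round(1 / confInt) - 1
}

nDefault <- envelopeSamples(0.01)  # 99, matching the documented default
nWider   <- envelopeSamples(0.05)  # 19 under the same assumption

# With iter = 5 the sampling is repeated five times, so five such sets of
# samples are drawn in total.
totalDraws <- 5 * nDefault
```

Under this assumption, a larger confInt means fewer samples per set and therefore a narrower envelope, while iter only controls how often the whole comparison is repeated.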
The anota2seqResidOutlierTest function assesses whether the residuals from the per identifier linear regressions of translated mRNA level ~ total mRNA level + treatment are normally distributed. anota2seq generates normal Q-Q plots of the residuals. If the residuals are normally distributed, the data quantiles will form a straight diagonal line from bottom left to top right. Because there are typically relatively few data points, anota2seq calculates "envelopes" based on a set of samplings from the normal distribution using the same number of data points as for the true data (Venables and Ripley, 1999).
To enable a comparison, both the actual and the sampled data are centered (mean=0) and scaled (sd=1). The data (both true and sampled) are then sorted, and the true sample is compared to the envelopes of the sampled data at each sort position. The result is presented as a Q-Q plot of the true data in which the envelopes of the sampled data are indicated. If there are 99 samplings, we expect 1/100 values to be outside the envelopes obtained from the samplings. It is thus possible to assess whether approximately the expected number of outlier residuals is obtained. The result is presented as both a graphical output and an output object.
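The envelope comparison described above can be sketched in base R for a single identifier. This is only an illustration of the idea (center, scale, sort, then compare against the range of the sampled data at each sort position), not anota2seq's internal code. The residuals are simulated stand-ins, all variable names are hypothetical, and the use of the minimum and maximum of the samples as the envelope is an assumption.

```r
# Sketch only: illustrates the envelope idea, not anota2seq's implementation.
set.seed(1)

nPoints  <- 12   # hypothetical number of residuals for one identifier
nSamples <- 99   # number of normal samples at the documented default confInt

# Stand-in for the residuals of one per-identifier regression (simulated).
residTrue <- rnorm(nPoints)

# Center (mean = 0), scale (sd = 1) and sort a numeric vector.
standardizeSorted <- function(x) sort((x - mean(x)) / sd(x))

trueSorted <- standardizeSorted(residTrue)

# One column per normal sample, each treated exactly like the true data.
sampled <- replicate(nSamples, standardizeSorted(rnorm(nPoints)))

# Envelope (assumed here to be the range): lowest and highest sampled value
# at each sort position.
envLower <- apply(sampled, 1, min)
envUpper <- apply(sampled, 1, max)

# Flag sort positions where the true residual falls outside the envelope.
isOutlier   <- trueSorted < envLower | trueSorted > envUpper
pctOutliers <- 100 * mean(isOutlier)
```

Repeating this for every identifier and pooling isOutlier across identifiers would give an overall outlier percentage of the kind the function reports for each iteration.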
An Anota2seqDataSet. anota2seqResidOutlierTest saves its output data in the 'residOutlierTest' slot of the Anota2seqDataSet; see anota2seqGetResidOutlierTest for a detailed description of this output.
anota2seqResidOutlierTest also generates a graphical output ("ANOTA2SEQ_residual_distribution_summary.pdf") showing the Q-Q plots from all genes as well as the envelopes from the sampled data. The obtained percentage of outliers is shown at each rank position and all combined. Optionally, when generateSingleGenePlots is set to TRUE, the function also generates individual plots (stored as "ANOTA2SEQ_residual_distributions_single.pdf") for n genes (set by nGraphs). When residFitPlot is set to TRUE an output comparing the fitted values to the residuals is generated (stored as "ANOTA2SEQ_residuals_vs_fitted.jpeg").
Venables, W.N. and Ripley, B.D., Modern Applied Statistics with S-PLUS, Springer (1999).
## Not run:
library(anota2seq)
data(anota2seq_data)

# Initialize the Anota2seqDataSet
Anota2seqDataSet <- anota2seqDataSetFromMatrix(
    dataP = anota2seq_data_P[1:100,],
    dataT = anota2seq_data_T[1:100,],
    phenoVec = anota2seq_pheno_vec,
    dataType = "RNAseq",
    normalize = TRUE)

# Run anota2seqPerformQC. This must be done before running
# the anota2seqResidOutlierTest function.
Anota2seqDataSet <- anota2seqPerformQC(Anota2seqDataSet)

# Run the anota2seqResidOutlierTest function
Anota2seqDataSet <- anota2seqResidOutlierTest(Anota2seqDataSet)
## End(Not run)