QCreport | R Documentation |
The ORFik QC uses the aligned files (usually bam files),
fastp and STAR log files
combined with annotation to create relevant statistics.
The report is produced in several steps:
1. Convert the bam files / input files to ".ofst" format, if not already done.
This format is around 400x faster to use in R than the bam format.
The files are also loaded into the R environment specified by envExp(df).
2. Create a summary csv table with the distribution of
aligned reads and overlap counts over transcript regions such as
leader, cds, trailer, lincRNAs, tRNAs, rRNAs, snoRNAs etc. The table is called
STATS.csv and can be imported with the QCstats function.
3. Make correlation plots and meta coverage plots,
so you get a good understanding of the quality of your NGS
data production and alignment steps.
4. Produce count tables, similar to HTSeq count tables,
over mrna, leader, cds and trailer separately. These tables
are stored as SummarizedExperiment objects, for easy loading into
DESeq2, conversion to normalized fpkm values,
or collapsing replicates in an experiment.
They can be imported with the countTable function.
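The tables from steps 2 and 4 can be read back into R once the report has run. The following is a minimal sketch, assuming ORFik is installed and that a report already exists in the default location for the experiment. The `region = "cds"` and `type = "fpkm"` values passed to countTable are assumptions about its interface; check ?countTable before relying on them.

```r
library(ORFik)

# Template experiment shipped with ORFik
df <- ORFik.template.experiment()

# Default location of the summary table: <out.dir>/QC_STATS/STATS.csv,
# where out.dir defaults to resFolder(df)
stats_path <- file.path(resFolder(df), "QC_STATS", "STATS.csv")

if (file.exists(stats_path)) {
  # Summary statistics written in step 2
  stats <- QCstats(df)
  print(head(stats))

  # Count table from step 4; region and type values are assumptions
  cds_fpkm <- countTable(df, region = "cds", type = "fpkm")
  print(head(cds_fpkm))
} else {
  message("No report found at ", stats_path, "; run QCreport(df) first.")
}
```

The file.exists() guard keeps the sketch from failing on an experiment for which no report has been made yet.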
All output is written to the directory of your NGS data,
inside the folder ./QC_STATS/, relative to the data location in 'df'.
You can specify a new output location with out.dir if you want.
To make an ORFik experiment, see ?ORFik::experiment
To see some normal mrna coverage profiles of different RNA-seq protocols:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4310221/figure/F6/
QCreport(
  df,
  out.dir = resFolder(df),
  plot.ext = ".pdf",
  create.ofst = TRUE,
  complex.correlation.plots = TRUE,
  library.names = bamVarName(df),
  use_simplified_reads = TRUE,
  BPPARAM = bpparam()
)
df | an ORFik experiment |
out.dir | character, output directory, default: resFolder(df). |
plot.ext | character, default: ".pdf". Alternatives: ".png" or ".jpg". Note that in pdf format the complex correlation plots become very slow to load! |
create.ofst | logical, default TRUE. Create ".ofst" files from the input libraries; ofst is much faster to load in R for later use. Stored in the ./ofst/ folder relative to the experiment main folder. |
complex.correlation.plots | logical, default TRUE. In addition to the simple correlation plot, add two computationally heavy dots + correlation plots. Useful for deeper analysis, but takes longer to run, especially on low-quality gpu computers. Set to FALSE to skip these. |
library.names | character, default: bamVarName(df). Names to load the libraries as into the environment, and names to display in plots. |
use_simplified_reads | logical, default TRUE. For count tables and coverage plots, a speed-up for GAlignments is to use 5' ends only. This loses some detail at splice sites, but that is usually irrelevant. Note: If reads are precollapsed GRanges, set to FALSE to avoid recollapsing. |
BPPARAM | how many cores/threads to use? default: bpparam(). To see the number of threads used, inspect the object returned by bpparam(). |
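BPPARAM takes a BiocParallel parameter object. The sketch below shows one way to inspect and set the number of workers, assuming the BiocParallel package is installed. The worker count of 2 is an arbitrary illustration, and the object names `serial` and `two_workers` are hypothetical.

```r
library(BiocParallel)

# Number of workers in the currently registered default back-end
print(bpworkers(bpparam()))

# A single-threaded back-end, useful for debugging or low-memory machines
serial <- SerialParam()
print(bpworkers(serial))

# An explicit socket-based back-end with 2 workers (arbitrary choice)
two_workers <- SnowParam(workers = 2)
print(bpworkers(two_workers))

# Either object can then be passed to the report:
# QCreport(df, BPPARAM = two_workers)
```

Fewer workers generally means lower peak memory use, at the cost of a longer run time.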
invisible(NULL) (objects are stored to disk)
Other QC report: QCplots(), QCstats()
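library(ORFik)
# Load the template experiment shipped with ORFik
df <- ORFik.template.experiment()
# Run QC on all libraries, writing output to tempdir() (uncomment to run)
#QCreport(df, tempdir())
# Run QC on a subset: library 9 only (uncomment to run)
#QCreport(df[9,], tempdir())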
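The calls above can be combined with the arguments documented earlier to get a quicker run. This is a minimal sketch, assuming ORFik is installed; `quick_qc` is a hypothetical helper name, not part of ORFik, and the argument choices are only one reasonable combination.

```r
library(ORFik)

# Hypothetical helper (not part of ORFik): a lighter QC run that
# skips the heavy correlation plots, skips ofst creation and uses
# png output, which loads faster than pdf
quick_qc <- function(df, out.dir = tempdir()) {
  QCreport(
    df,
    out.dir = out.dir,
    plot.ext = ".png",
    create.ofst = FALSE,
    complex.correlation.plots = FALSE
  )
}

df <- ORFik.template.experiment()
# Run on a single library, writing to tempdir() (uncomment to run)
# quick_qc(df[9, ])
```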