r Biocpkg('regionReport')
R is an open-source statistical environment which can be easily modified to enhance its functionality via packages. r Biocpkg('regionReport') is an R package available via the Bioconductor repository for packages. R can be installed on any operating system from CRAN, after which you can install r Biocpkg('regionReport') by using the following commands in your R session:
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
        install.packages("BiocManager")
    }

    BiocManager::install("regionReport")

    ## Check that you have a valid Bioconductor installation
    BiocManager::valid()
r Biocpkg('regionReport') is based on many other packages, in particular those that have implemented the infrastructure needed for dealing with RNA-seq data: packages like r Biocpkg('GenomicFeatures') that allow you to import the data and r Biocpkg('DESeq2') for generating differential expression results. A r Biocpkg('regionReport') user is not expected to deal with those packages directly.
If you are asking yourself the question "Where do I start using Bioconductor?" you might be interested in this blog post.
As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve, so it is critical to learn where to ask for help. The blog post mentioned above lists some options, but we would like to highlight the Bioconductor support site as the main resource for getting help: remember to use the regionReport tag and check the older posts. Other alternatives are available, such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.
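The session information requested above can be gathered with a few lines of base R. The following is a minimal sketch rather than an official posting template; the helper name support_post_info() is hypothetical and is not part of r Biocpkg('regionReport').

```r
## Hypothetical helper (not part of regionReport): collect the details that
## are usually requested alongside a small reproducible example
support_post_info <- function(pkg = "regionReport") {
    ## Report the package version only if the package is installed
    pkg_version <- if (requireNamespace(pkg, quietly = TRUE)) {
        as.character(utils::packageVersion(pkg))
    } else {
        NA_character_
    }
    list(
        r_version = R.version.string,
        package = pkg,
        package_version = pkg_version,
        session = utils::sessionInfo()
    )
}

info <- support_post_info()

## Paste this output into your post together with the reproducible example
print(info$session)
```

The printed output lists the R version, the operating system and the attached packages, which is usually what developers need to reproduce an error.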
r Biocpkg('regionReport')
We hope that r Biocpkg('regionReport') will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!
    ## Citation info
    citation("regionReport")
    ## Track time spent on making the vignette
    startTimeVignette <- Sys.time()

    ## Bib setup
    library("RefManageR")

    ## Write bibliography information
    bib <- c(
        derfinder = citation("derfinder")[1],
        regionReport = citation("regionReport")[1],
        knitrBootstrap = citation("knitrBootstrap")[1],
        BiocStyle = citation("BiocStyle")[1],
        ggbio = citation("ggbio")[1],
        ggplot2 = citation("ggplot2")[1],
        knitr = citation("knitr")[3],
        RefManageR = citation("RefManageR")[1],
        rmarkdown = citation("rmarkdown")[1],
        DT = citation("DT")[1],
        R = citation(),
        IRanges = citation("IRanges")[1],
        sessioninfo = citation("sessioninfo")[1],
        GenomeInfoDb = RefManageR::BibEntry(
            bibtype = "manual",
            key = "GenomeInfoDb",
            author = "Sonali Arora and Martin Morgan and Marc Carlson and H. Pagès",
            title = "GenomeInfoDb: Utilities for manipulating chromosome and other 'seqname' identifiers",
            year = 2017, doi = "10.18129/B9.bioc.GenomeInfoDb"
        ),
        GenomicRanges = citation("GenomicRanges")[1],
        biovizBase = citation("biovizBase")[1],
        TxDb.Hsapiens.UCSC.hg19.knownGene = citation("TxDb.Hsapiens.UCSC.hg19.knownGene")[1],
        derfinderPlot = citation("derfinderPlot")[1],
        grid = citation("grid")[1],
        gridExtra = citation("gridExtra")[1],
        mgcv = citation("mgcv")[1],
        RColorBrewer = citation("RColorBrewer")[1],
        whisker = citation("whisker")[1],
        bumphunter = citation("bumphunter")[1],
        pheatmap = citation("pheatmap")[1],
        DESeq2 = citation("DESeq2")[1],
        edgeR1 = citation("edgeR")[1],
        edgeR2 = citation("edgeR")[2],
        DEFormats = citation("DEFormats")[1]
    )
r Biocpkg('regionReport') r Citep(bib[['regionReport']]) creates HTML or PDF reports for a set of genomic regions such as r Biocpkg('derfinder') r Citep(bib[['derfinder']]) results or for feature-level analyses performed with r Biocpkg('DESeq2') r Citep(bib[['DESeq2']]) or r Biocpkg('edgeR') r Citep(bib[c('edgeR1', 'edgeR2')]). The HTML reports are styled with r CRANpkg('rmarkdown') r Citep(bib[['rmarkdown']]) by default but can optionally be styled with r CRANpkg('knitrBootstrap') r Citep(bib[['knitrBootstrap']]).
This package includes a basic exploration for a general set of genomic regions which can be easily customized to include the appropriate conclusions and/or further exploration of the results. Such a report can be generated using renderReport(). r Biocpkg('regionReport') has a separate template for running a basic exploration analysis of r Biocpkg('derfinder') results by using derfinderReport(). derfinderReport() is specific to r Biocpkg('derfinder') results from the single base-level approach. A third template is included for exploring r Biocpkg('DESeq2') or r Biocpkg('edgeR') differential expression results.
All reports are written in R Markdown format and include all the code for making the plots and explorations in the report itself. For all templates, r Biocpkg('regionReport') relies on r CRANpkg('knitr') r Citep(bib[['knitr']]), r CRANpkg('rmarkdown') r Citep(bib[['rmarkdown']]), DT r Citep(bib[['DT']]) and optionally r CRANpkg('knitrBootstrap') r Citep(bib[['knitrBootstrap']]) for generating the report. The reports can be either in HTML or PDF format and can be easily customized.
r Biocpkg('regionReport') for r Biocpkg('DESeq2') results
The plots in r Biocpkg('regionReport') for exploring r Biocpkg('DESeq2') results are powered by r CRANpkg('ggplot2') r Citep(bib[['ggplot2']]) and r CRANpkg('pheatmap') r Citep(bib[['pheatmap']]).
The r Biocpkg('regionReport') supplementary website regionReportSupp has examples of using r Biocpkg('regionReport') with r Biocpkg('DESeq2') results. In particular, please look at DESeq2.html which has the code for generating some r Biocpkg('DESeq2') results based on the r Biocpkg('DESeq2') vignette. Then it uses those results to create HTML and PDF versions of the same report. The resulting reports are available in the following locations:
Note that in both examples we changed the r CRANpkg('ggplot2') theme to theme_bw(). Also, in the PDF version we used the option device = 'pdf' instead of the default device = 'png' in DESeq2Report() since PDF figures are more appropriate for PDF reports: they look better than PNG figures.
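The theme and device options described above could be combined along the following lines. This is a sketch under assumptions, not the code used for the supplementary website: the wrapper make_deseq2_reports() is hypothetical, the simulated data comes from DESeq2::makeExampleDESeqDataSet(), and passing output_format = "pdf_document" through to the renderer is assumed here rather than stated in this vignette. Rendering is only attempted in an interactive session with the required packages installed.

```r
## Hypothetical wrapper: render HTML and PDF versions of the same report
make_deseq2_reports <- function(dds, intgroup,
    outdir = "DESeq2Report-example") {
    dir.create(outdir, showWarnings = FALSE, recursive = TRUE)

    ## HTML version with the default PNG figures
    html <- regionReport::DESeq2Report(dds,
        project = "DESeq2 HTML report", intgroup = intgroup,
        outdir = outdir, output = "index",
        theme = ggplot2::theme_bw(), browse = FALSE
    )

    ## PDF version: device = "pdf" is described in the text above;
    ## output_format = "pdf_document" is an assumption of this sketch
    pdf <- regionReport::DESeq2Report(dds,
        project = "DESeq2 PDF report", intgroup = intgroup,
        outdir = outdir, output = "DESeq2Report",
        theme = ggplot2::theme_bw(), device = "pdf",
        output_format = "pdf_document", browse = FALSE
    )
    c(html = html, pdf = pdf)
}

## Only render when run by hand and when the packages are available
pkgs <- c("regionReport", "DESeq2", "ggplot2")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))
if (interactive() && have_pkgs) {
    ## Simulated data set with a 'condition' column in its colData
    dds <- DESeq2::makeExampleDESeqDataSet(betaSD = 1)
    dds <- DESeq2::DESeq(dds)
    reports <- make_deseq2_reports(dds, intgroup = "condition")
}
```

The PDF version additionally needs a working LaTeX installation, which is why the sketch keeps the two calls separate so either one can be run on its own.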
If you want to create an HTML report similar to the one linked in this section, simply run example('DESeq2Report', 'regionReport', ask=FALSE). The only difference will be the r CRANpkg('ggplot2') theme for the plots.
r Biocpkg('regionReport') for r Biocpkg('edgeR') results
r Biocpkg('regionReport') has the edgeReport() function that takes as input a DGEList object and the results from the differential expression analysis using r Biocpkg('edgeR'). edgeReport() internally uses r Biocpkg('DEFormats') to convert the results to r Biocpkg('DESeq2')'s format and then uses DESeq2Report() to create the final report. The report looks nearly the same whether you performed the differential expression analysis with r Biocpkg('DESeq2') or r Biocpkg('edgeR'), which keeps the exploratory data analysis step homogeneous.
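As a rough illustration of the inputs edgeReport() expects, the sketch below runs a small r Biocpkg('edgeR') analysis on simulated counts and hands both the DGEList and the test results to edgeReport(). The wrapper run_edge_report() and the simulated count matrix are hypothetical, and the choice of glmFit() followed by glmLRT() is just one of several valid r Biocpkg('edgeR') workflows. The report is only rendered in an interactive session with the required packages installed.

```r
## Simulated counts for illustration only: 1000 genes, 6 samples
set.seed(20160407)
counts <- matrix(
    rnbinom(6000, mu = 50, size = 5),
    ncol = 6,
    dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:6))
)
group <- factor(rep(c("A", "B"), each = 3))

## Hypothetical wrapper: edgeR analysis followed by the report
run_edge_report <- function(counts, group, outdir = "edgeReport-example") {
    dge <- edgeR::DGEList(counts = counts, group = group)
    dge <- edgeR::calcNormFactors(dge)
    design <- stats::model.matrix(~group)
    dge <- edgeR::estimateDisp(dge, design)
    fit <- edgeR::glmFit(dge, design)
    lrt <- edgeR::glmLRT(fit, coef = 2)

    dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
    ## 'group' is the sample annotation column created by DGEList()
    regionReport::edgeReport(dge, lrt,
        project = "edgeR example", intgroup = "group",
        outdir = outdir, browse = FALSE
    )
}

## Only render when run by hand and when the packages are available
pkgs <- c("regionReport", "edgeR", "DEFormats")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))
if (interactive() && have_pkgs) {
    report <- run_edge_report(counts, group)
}
```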
The r Biocpkg('regionReport') supplementary website regionReportSupp has examples of using r Biocpkg('regionReport') with r Biocpkg('edgeR') results. In particular, please look at edgeR.html which has the code for generating some random data with r Biocpkg('DEFormats') and performing the differential expression analysis with r Biocpkg('edgeR'). Then it uses those results to create HTML and PDF versions of the same report. The resulting reports are available in the following locations:
Note that in both examples we changed the r CRANpkg('ggplot2') theme to theme_linedraw(). Also, in the PDF version we used the option device = 'pdf' instead of the default device = 'png' in edgeReport() since PDF figures are more appropriate for PDF reports: they look better than PNG figures.
If you want to create an HTML report similar to the one linked in this section, simply run example('edgeReport', 'regionReport', ask=FALSE). The only differences will be the r CRANpkg('ggplot2') theme for the plots and the amount of data simulated with r Biocpkg('DEFormats').
r Biocpkg('regionReport') for region results
The plots in r Biocpkg('regionReport') for region reports are powered by r Biocpkg('derfinderPlot') r Citep(bib[['derfinderPlot']]), r Biocpkg('ggbio') r Citep(bib[['ggbio']]), and r CRANpkg('ggplot2') r Citep(bib[['ggplot2']]).
The r Biocpkg('regionReport') supplementary website regionReportSupp has examples of using r Biocpkg('regionReport') with results from r Biocpkg('DiffBind') and r Biocpkg('derfinder'). Included as a vignette, this package also has an example using a small data set derived from r Biocpkg('bumphunter'). These represent different uses of r Biocpkg('regionReport') for results from ChIP-seq, methylation, and RNA-seq data. In particular, the r Biocpkg('DiffBind') example illustrates how to expand a basic report created with renderReport().
For a general use case, you first have to identify a set of genomic regions of interest and store it as a GRanges object. In a typical workflow you will have some variables measured for each of the regions, such as p-values and scores. renderReport() uses the set of regions and three main arguments:
pvalueVars: a character vector (optionally named) with the names of the variables that are bounded between 0 and 1, such as p-values. For each of these variables, renderReport() explores the distribution by chromosome and the overall distribution, and makes a table with commonly used cutoffs.
densityVars: another character vector (optionally named) with another set of variables you wish to explore by making density graphs. This is commonly used for scores and other similar numerical variables.
significantVar: a logical vector separating the regions by whether they are statistically significant. For example, this information is used to explore the width of all the regions and compare it to the width of the significant ones.
Other parameters control the name of the report, where it will be located, the transcripts database used to annotate the nearest genes, graphical parameters, etc.
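To give a feel for the kind of cutoff table described for pvalueVars, here is a minimal base R sketch. It is not the code used inside renderReport(); the simulated p-values and the particular cutoffs are assumptions chosen for illustration.

```r
## Simulated p-values; in practice these would be a column of your regions
set.seed(20140330)
pvalues <- runif(1000)

## Cutoffs chosen for illustration only
cutoffs <- c(0.0001, 0.001, 0.01, 0.025, 0.05, 0.1, 0.5, 1)

## Count how many values fall at or below each cutoff
cutoff_table <- data.frame(
    cutoff = cutoffs,
    count = vapply(cutoffs, function(x) sum(pvalues <= x), integer(1))
)
cutoff_table$percent <- round(100 * cutoff_table$count / length(pvalues), 2)

print(cutoff_table)
```

A table like this makes it easy to see how many regions would be retained under each threshold before choosing the logical vector passed as significantVar.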
Here is a short example of how to use renderReport(). Note that we are using regions produced by r Biocpkg('derfinder') just for convenience.
    ## Load derfinder and regionReport
    library("derfinder")
    library("regionReport")
    regions <- genomeRegions$regions

    ## Assign chr length
    library("GenomicRanges")
    seqlengths(regions) <- c("chr21" = 48129895)

    ## The output will be saved in the 'renderReport-example' directory
    dir.create("renderReport-example", showWarnings = FALSE, recursive = TRUE)

    ## Generate the HTML report
    report <- renderReport(regions, "Example run",
        pvalueVars = c(
            "Q-values" = "qvalues", "P-values" = "pvalues"
        ),
        densityVars = c(
            "Area" = "area", "Mean coverage" = "meanCoverage"
        ),
        significantVar = regions$qvalues <= 0.05, nBestRegions = 20,
        outdir = "renderReport-example"
    )
See the report created by this example here.
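The example above relies on regions shipped with r Biocpkg('derfinder'). For your own data, the inputs could be assembled roughly as follows. This is a sketch with made-up coordinates and statistics; the object names (my_regions, pvalue_vars, density_vars, significant) are hypothetical, and the block only runs if GenomicRanges and IRanges are installed.

```r
if (requireNamespace("GenomicRanges", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE)) {
    set.seed(20140330)
    n <- 50

    ## Made-up regions on chr21 with made-up statistics; the first 10
    ## p-values are tiny so that some regions end up significant
    my_regions <- GenomicRanges::GRanges(
        seqnames = "chr21",
        ranges = IRanges::IRanges(
            start = seq(from = 1e6, by = 1e4, length.out = n),
            width = sample(100:1000, n, replace = TRUE)
        ),
        seqlengths = c("chr21" = 48129895),
        pvalues = c(runif(10, min = 0, max = 1e-4), runif(n - 10)),
        area = rexp(n, rate = 0.01)
    )

    ## Adjust the p-values and store them as another metadata column
    my_regions$qvalues <- p.adjust(my_regions$pvalues, method = "BH")

    ## Named character vectors: names are labels, values are column names
    pvalue_vars <- c("Q-values" = "qvalues", "P-values" = "pvalues")
    density_vars <- c("Area" = "area")

    ## Logical vector with one entry per region
    significant <- my_regions$qvalues <= 0.05

    ## These objects could then be passed to renderReport(), mirroring the
    ## call in the example above (not run here because it renders a report):
    ## renderReport(my_regions, "My regions",
    ##     pvalueVars = pvalue_vars, densityVars = density_vars,
    ##     significantVar = significant, outdir = "my-report")
}
```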
For r Biocpkg('derfinder') results created via the expressed regions-level approach you can use renderReport() to explore the results. If you use r Biocpkg('DESeq2') to perform the differential expression analysis of the expressed regions you can then use DESeq2Report().
r Biocpkg('derfinder') single base-level case
r Biocpkg('derfinder')
Prior to using regionReport::derfinderReport() you must use r Biocpkg('derfinder') to analyze a specific data set. While there are many ways to do so, we recommend using analyzeChr() with the same prefix argument and then merging the results with mergeResults(). This is the recommended pipeline for the single base-level approach.
Below, we run r Biocpkg('derfinder') for the example data included in the package. The steps are:
    ## Load derfinder
    library("derfinder")

    ## The output will be saved in the "derfinderReport-example" directory
    dir.create("derfinderReport-example", showWarnings = FALSE, recursive = TRUE)
The following code runs r Biocpkg('derfinder').
    ## Save the current path
    initialPath <- getwd()
    setwd(file.path(initialPath, "derfinderReport-example"))

    ## Generate output from derfinder

    ## Collapse the coverage information
    collapsedFull <- collapseFullCoverage(list(genomeData$coverage),
        verbose = TRUE
    )

    ## Calculate library size adjustments
    sampleDepths <- sampleDepth(collapsedFull,
        probs = c(0.5), nonzero = TRUE,
        verbose = TRUE
    )

    ## Build the models
    group <- genomeInfo$pop
    adjustvars <- data.frame(genomeInfo$gender)
    models <- makeModels(sampleDepths, testvars = group, adjustvars = adjustvars)

    ## Analyze chromosome 21
    analysis <- analyzeChr(
        chr = "21", coverageInfo = genomeData, models = models,
        cutoffFstat = 1, cutoffType = "manual", seeds = 20140330,
        groupInfo = group, mc.cores = 1, writeOutput = TRUE,
        returnOutput = TRUE
    )

    ## Save the stats options for later
    optionsStats <- analysis$optionsStats

    ## Change the directory back to the original one
    setwd(initialPath)
For convenience, we have included the r Biocpkg('derfinder') results as part of r Biocpkg('regionReport'). Note that the above functions are routinely checked as part of r Biocpkg('derfinder').
    ## Copy previous results
    file.copy(system.file(file.path("extdata", "chr21"),
        package = "derfinder",
        mustWork = TRUE
    ), "derfinderReport-example", recursive = TRUE)
Next, proceed to merging the results.
    ## Merge the results from the different chromosomes. In this case, there's
    ## only one: chr21
    mergeResults(
        chrs = "chr21", prefix = "derfinderReport-example",
        genomicState = genomicState$fullGenome
    )

    ## Load optionsStats
    load(file.path("derfinderReport-example", "chr21", "optionsStats.Rdata"),
        verbose = TRUE
    )
Once the r Biocpkg('derfinder') output has been generated and merged, use derfinderReport() to create the HTML report.
    ## Load regionReport
    library("regionReport")
    ## Generate the HTML report
    report <- derfinderReport(
        prefix = "derfinderReport-example", browse = FALSE,
        nBestRegions = 15, makeBestClusters = TRUE,
        fullCov = list("21" = genomeDataRaw$coverage), optionsStats = optionsStats
    )
Once the output is generated, you can browse the report from R using browseURL() as shown below.
    ## Browse the report
    browseURL(report)
You can view a pre-compiled version of this report here.
Note that the reports require an active Internet connection to render correctly.
The report is self-explanatory and will change some of the text depending on the input options.
If the report is taking too long to compile (say more than 3 hours), you might want to consider setting nBestClusters to a small number or even setting makeBestClusters to FALSE.
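A faster variant of the earlier derfinderReport() call could look like the sketch below. The wrapper quick_derfinder_report() and its skip_clusters argument are hypothetical; the sketch also assumes that the argument for the number of clusters is spelled nBestClusters, matching makeBestClusters.

```r
## Hypothetical wrapper: same inputs as the call shown earlier, but with
## the cluster plots reduced or skipped to shorten the compile time
quick_derfinder_report <- function(prefix, fullCov, optionsStats,
    skip_clusters = TRUE) {
    regionReport::derfinderReport(
        prefix = prefix, browse = FALSE, nBestRegions = 15,
        ## Skip the cluster plots entirely, or keep only a few of them
        makeBestClusters = !skip_clusters,
        nBestClusters = 2,
        fullCov = fullCov, optionsStats = optionsStats
    )
}

## Usage with the objects created in the previous steps (not run here):
## report <- quick_derfinder_report(
##     prefix = "derfinderReport-example",
##     fullCov = list("21" = genomeDataRaw$coverage),
##     optionsStats = optionsStats
## )
```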
This package was made possible thanks to:
r Citep(bib[['R']])
r Biocpkg('BiocStyle') r Citep(bib[['BiocStyle']])
r Biocpkg('biovizBase') r Citep(bib[['biovizBase']])
r Biocpkg('bumphunter') r Citep(bib[['bumphunter']])
r Biocpkg('DEFormats') r Citep(bib[['DEFormats']])
r Biocpkg('derfinder') r Citep(bib[['derfinder']])
r Biocpkg('derfinderPlot') r Citep(bib[['derfinderPlot']])
r Biocpkg('DESeq2') r Citep(bib[['DESeq2']])
r CRANpkg('DT') r Citep(bib[['DT']])
r Biocpkg('edgeR') r Citep(bib[c('edgeR1', 'edgeR2')])
r Biocpkg('GenomeInfoDb') r Citep(bib[['GenomeInfoDb']])
r Biocpkg('GenomicRanges') r Citep(bib[['GenomicRanges']])
r Biocpkg('ggbio') r Citep(bib[['ggbio']])
r CRANpkg('ggplot2') r Citep(bib[['ggplot2']])
r CRANpkg('grid') r Citep(bib[['grid']])
r CRANpkg('gridExtra') r Citep(bib[['gridExtra']])
r Biocpkg('IRanges') r Citep(bib[['IRanges']])
r CRANpkg('knitr') r Citep(bib[['knitr']])
r CRANpkg('knitrBootstrap') r Citep(bib[['knitrBootstrap']])
r CRANpkg('mgcv') r Citep(bib[['mgcv']])
r CRANpkg('pheatmap') r Citep(bib[['pheatmap']])
r CRANpkg('RColorBrewer') r Citep(bib[['RColorBrewer']])
r CRANpkg("RefManageR") r Citep(bib[["RefManageR"]])
r CRANpkg('rmarkdown') r Citep(bib[['rmarkdown']])
r CRANpkg('sessioninfo') r Citep(bib[['sessioninfo']])
r Biocannopkg('TxDb.Hsapiens.UCSC.hg19.knownGene') r Citep(bib[['TxDb.Hsapiens.UCSC.hg19.knownGene']])
r CRANpkg('whisker') r Citep(bib[['whisker']])
Code for creating the vignette
    ## Create the vignette
    library("rmarkdown")
    system.time(render("regionReport.Rmd", "BiocStyle::html_document"))

    ## Extract the R code
    library("knitr")
    knit("regionReport.Rmd", tangle = TRUE)
    ## Clean up
    unlink("derfinderReport-example", recursive = TRUE)
Date the vignette was generated.
    ## Date the report was generated
    Sys.time()
Wallclock time spent generating the vignette.
    ## Processing time in seconds
    totalTimeVignette <- diff(c(startTimeVignette, Sys.time()))
    round(totalTimeVignette, digits = 3)
R session information.
    ## Session info
    library("sessioninfo")
    options(width = 120)
    session_info()
This vignette was generated using r Biocpkg('BiocStyle') r Citep(bib[['BiocStyle']]) with r CRANpkg('knitr') r Citep(bib[['knitr']]) and r CRANpkg('rmarkdown') r Citep(bib[['rmarkdown']]) running behind the scenes.
Citations made with r CRANpkg('RefManageR') r Citep(bib[['RefManageR']]).
    ## Print bibliography
    PrintBibliography(bib, .opts = list(hyperlink = "to.doc", style = "html"))