Epiviz
is an interactive visualization tool for functional genomics data. It supports genome navigation like other genome browsers, but allows multiple visualizations of data within genomic regions using scatterplots, heatmaps and other user-supplied visualizations. It also includes data from the Gene Expression Barcode project for transcriptome visualization. It has a flexible plugin framework so users can add d3 visualizations. You can find more information about Epiviz at http://epiviz.cbcb.umd.edu/help and see a video tour here.
The r Biocpkg("epivizr")
package implements two-way communication between the R/Bioconductor
computational genomics environment and Epiviz
. Objects in an R
session can be displayed as tracks or plots on Epiviz. Epivizr uses Websockets for communication between the browser Javascript client and the R environment, the same technology underlying the popular Shiny system for authoring interactive web-based reports in R.
In this vignette we will look at colon cancer methylation data from the TCGA project and expression data from the gene expression barcode project. The r Biocpkg("epivizrData")
package contains human chromosome 11 methylation data from the Illumina 450kHumanMethylation beadarray processed with the r Biocpkg("minfi")
package. We use expression data from the r Biocpkg("antiProfilesData")
bioconductor package.
library(epivizr) library(antiProfilesData) library(SummarizedExperiment)
data(tcga_colon_blocks) data(tcga_colon_curves) data(apColonData)
The tcga_colon_blocks
object is a GRanges
object containing chromosome 11 regions of hypo or hyper methylation in colon cancer identified using the blockFinder
function in the r Biocpkg("minfi")
package.
show(tcga_colon_blocks)
The columns value
and p.value
can be used to determine which of these regions, or blocks, are interesting by looking at a volcano plot for instance.
plot(tcga_colon_blocks$value, -log10(tcga_colon_blocks$p.value), main="Volcano plot", xlab="Avg. methylation difference", ylab="-log10 p-value",xlim=c(-.5,.5))
The tcga_colon_curves
object is another GRanges
object which contains the basepair resolution methylation data used to define these regions.
show(tcga_colon_curves)
This basepair resolution data includes mean methylation levels for normal and cancer and a smoothed estimate of methylation difference. This smoothed difference estimate is used to define regions in the tcga_colon_blocks
object.
Finally, the apColonData
object is an ExpressionSet
containing gene expression data for colon normal and tumor samples for genes within regions of methylation loss identified this paper. Our goal in this vignette is to visualize this data as we browse the genome.
The connection to Epiviz
is managed through a session manager object of class EpivizApp
. We can create this object and open Epiviz
using the startEpiviz
function.
app <- startEpiviz(workspace="qyOTB6vVnff", gists="2caf8e891201130c7daa")
app <- startEpiviz(host="http://localhost", http_port=8989, debug=TRUE, open_browser=FALSE, non_interactive=TRUE, try_ports=TRUE) # register BlockTrack js_chart_settings <- list(list(id = "title", type = "string", defaultValue = "", label = "Title", possibleValues = NULL), list(id = "marginTop", type = "number", defaultValue = 25, label = "Top margin", possibleValues = NULL), list(id = "marginBottom", type = "number", defaultValue = 23, label = "Bottom margin", possibleValues = NULL), list(id = "marginLeft", type = "number", defaultValue = 20, label = "Left margin", possibleValues = NULL), list(id = "marginRight", type = "number", defaultValue = 10, label = "Right margin", possibleValues = NULL), list(id = "minBlockDistance", type = "number", defaultValue = 5, label = "Minimum block distance", possibleValues = NULL)) js_chart_colors = c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf") app$chart_mgr$register_chart_type("BlocksTrack", "epiviz.plugins.charts.BlocksTrack", js_chart_settings=js_chart_settings, js_chart_colors=js_chart_colors) # register LineTrack js_chart_settings <- list(list(id = "title", type = "string", defaultValue = "", label = "Title", possibleValues = NULL), list(id = "marginTop", type = "number", defaultValue = 25, label = "Top margin", possibleValues = NULL), list(id = "marginBottom", type = "number", defaultValue = 23, label = "Bottom margin", possibleValues = NULL), list(id = "marginLeft", type = "number", defaultValue = 20, label = "Left margin", possibleValues = NULL), list(id = "marginRight", type = "number", defaultValue = 10, label = "Right margin", possibleValues = NULL), list(id = "measurementGroupsAggregator", type = "categorical", defaultValue = "mean-stdev", label = "Aggregator for measurement groups", possibleValues = c("mean-stdev", "quartiles", "count", "min", "max", "sum")), list(id = "step", type = "number", defaultValue = 50, label = "Step", possibleValues = NULL), list(id = "showPoints", type = "boolean", defaultValue = FALSE, label = "Show points", possibleValues = NULL), list(id = "showLines", type = "boolean", defaultValue = TRUE, label = "Show lines", possibleValues = NULL), list(id = "showErrorBars", type = "boolean", defaultValue = TRUE, label = "Show error bars", possibleValues = NULL), list(id = "pointRadius", type = "number", defaultValue = 1, label = "Point radius", possibleValues = NULL), list(id = "lineThickness", type = "number", defaultValue = 1, label = "Line thickness", possibleValues = NULL), list(id = "yMin", type = "number", defaultValue = "default", label = "Min Y", possibleValues = NULL), list(id = "yMax", type = "number", defaultValue = "default", label = "Max Y", possibleValues = NULL), list(id = "interpolation", type = "categorical", defaultValue = "linear", label = "Interpolation", possibleValues = c("linear", "step-before", "step-after", "basis", "basis-open", "basis-closed", "bundle", "cardinal", "cardinal-open", "monotone"))) js_chart_colors <- c("#1859a9", "#ed2d2e", "#008c47", "#010101", "#f37d22", "#662c91", "#a11d20", "#b33893") app$chart_mgr$register_chart_type("LineTrack", "epiviz.plugins.charts.LineTrack", js_chart_settings=js_chart_settings, js_chart_colors=js_chart_colors) # register ScatterPlot js_chart_settings <- list(list(id = "title", type = "string", defaultValue = "", label = "Title", possibleValues = NULL), list(id = "marginTop", type = "number", defaultValue = 15, label = "Top margin", possibleValues = NULL), list(id = "marginBottom", type = "number", defaultValue = 50, label = "Bottom margin", possibleValues = NULL), list(id = "marginLeft", type = "number", defaultValue = 50, label = "Left margin", possibleValues = NULL), list(id = "marginRight", type = "number", defaultValue = 15, label = "Right margin", possibleValues = NULL), list(id = "measurementGroupsAggregator", type = "categorical", defaultValue = "mean-stdev", label = "Aggregator for measurement groups", possibleValues = c("mean-stdev", "quartiles", "count", "min", "max", "sum")), list(id = "circleRadiusRatio", type = "number", defaultValue = 0.015, label = "Circle radius ratio", possibleValues = NULL), list(id = "xMin", type = "number", defaultValue = "default", label = "Min X", possibleValues = NULL), list(id = "xMax", type = "number", defaultValue = "default", label = "Max X", possibleValues = NULL), list(id = "yMin", type = "number", defaultValue = "default", label = "Min Y", possibleValues = NULL), list(id = "yMax", type = "number", defaultValue = "default", label = "Max Y", possibleValues = NULL)) js_chart_colors <- c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf") app$chart_mgr$register_chart_type("ScatterPlot", "epiviz.plugins.charts.ScatterPlot", js_chart_settings=js_chart_settings, js_chart_colors=js_chart_colors) app$server$start_server()
This opens a websocket connection between the interactive R
session and the browser client. This will allow us to visualize data stored in the Epiviz
server along with data in the interactive R
session.
Windows users: In Windows platforms we need to use the service
function to let the interactive R
session connect to the epiviz
web app and serve data requests. We then escape (using ctl-c
or esc
depending on your environment) to continue with the interactive R
session. This is required anytime you want epivizr
to serve data to the web app, for example, when interacting with the UI. (We are actively developing support for non-blocking sessions in Windows platforms).
app$server$service()
Once the browser is open and is connected to an active epivizr session, the epivizr session registers available chart types supported in the epiviz JS app. To list available chart types we use method list_chart_types
. In this vignette, we only show three chart types available for use in the epiviz JS app. The production epiviz JS app has a larger number of chart types available. Chart types can also be added dynamically as described in the epiviz JS documentation: http://epiviz.cbcb.umd.edu/help. Chart types that are added dynamically are listed and usable within an epivizr
session.
app$chart_mgr$list_chart_types()
Chart type settings, e.g., line widths, colors, margins etc. can be modified dynamically. Settings available for each chart type are briefly listed in this call. See below for further detail on customizing charts through settings and colors.
Once the browser is open we can visualize the tcga_colon_blocks
object containing blocks of methylation modifications in colon cancer.
We use the plot
method to do so.
blocks_chart <- app$plot(tcga_colon_blocks, datasource_name="450k colon_blocks")
Windows users: We need the service
function to let the interactive R
session serve data requests from the browser client as you interact with the UI.
Escape (using ctl-c
or esc
depending on your environment) to continue with the interactive R
session.
app$server$service()
You should now see that a new track is added to the Epiviz
web app. You can think of this track as an interactive display device in R
. As you navigate on the browser, data is requested from the R
session through the websocket connection. Remember to escape to continue with your interactive R
session if you are not running "non-blocking" mode. The blocks_chart
object inherits from class EpivizChart
, which we can use to get information about the data being displayed and the chart used to display it. Note that the "brushing" effect we implement in Epiviz
works for epivizr
tracks as well.
Now that we have interactive data connections to Epiviz
we may want to iteratively compute and visualize our data. For example, we may want to only display methylation blocks inferred at a certain statistical significance level. In this case, we will filter by block size.
# subset to those with length > 250Kbp keep <- width(tcga_colon_blocks) > 250000 # get the data object for chart ms_obj <- app$get_ms_object(blocks_chart) app$update_measurements(ms_obj, tcga_colon_blocks[keep,])
Now, only this subset of blocks will be displayed in the already existing track.
To modify default chart colors or settings, we need to know what settings can be applied to the chart. Again, as these are defined dynamically in the epiviz JS app we use a method, print_chart_type_info
on the EpivizChartMgr
class to list settings that can be applied to specific chart type. For example, to list chart settings available for a BlocksTrack
chart we use the following:
app$chart_mgr$print_chart_type_info("BlocksTrack")
There are two ways to set settings and colors to a chart: when the chart is initially created using the plot
method, or after the
chart is created using the set
method in class EpivizChart
. We will illustrate both ways of doing this:
If using the plot
method, use the list_chart_settings
method as above to list settings that can be applied to a chart type.
For example, for a BlocksTrack
chart, we can set minBlockDistance
as a setting for the plot function which controls how close genomic
regions can be before they are merged into a single rectangle in the chart. We can also change colors used in the chart this way.
settings <- list(minBlockDistance=50) colors <- c("#d15014", "#5e97eb", "#e81ccd") blocks_chart <- app$plot(tcga_colon_blocks, datasource_name="450k colon_blocks", settings=settings, colors=colors)
This will create a second blocks track using a different color map.
On the other hand, if the BlockChart
is already added to the epiviz JS application session, use the set_chart_settings
method to update settings and colors.
# create a list of settings and colors to update the plot settings <- list(minBlockDistance=100) colors <- c("#5e97eb", "#d15014", "#e81ccd") app$chart_mgr$set_chart_settings(blocks_chart, settings=settings, colors=colors)
This changes the color map for the second blocks track added. Use the print_chart_info
method to list settings and colors currently used in the chart.
# to list applied chart settings app$chart_mgr$print_chart_info(blocks_chart)
Methods are also available directly on the EpivizChart
objects. E.g., calls blocks_chart$set(settings=settings, colors=colors)
and blocks_chart$print_info()
yield the same result.
Charts can be printed as a pdf or png file through the epivizr session. To do so, use the print_chart
method:
app$chart_mgr$print_chart(blocks_chart, file_name="blocks_chart", file_type="pdf")
This will create a file named blocks_chart.pdf
which will be downloaded through your web browser. It will be found in the default file download location for your web browser.
There are a number of different data types available to use through epivizr
. You can see a list of R/BioC classes that are supported using ?register
. In the previous section, we used the block
data type to register a GenomicRanges
object. To visualize methylation data at base-pair resolution from the tcga_colon_curves
GenomicRanges
object, we will use the bp
type.
# add low-filter smoothed methylation estimates means_track <- app$plot(tcga_colon_curves, datasource_name="450kMeth",type="bp",columns=c("cancerMean","normalMean"))
NOTE: You can adjust track settings to change how this new track looks like. For instance, to show all points in the window set the step
parameter to 1, and to see a smooth interpolation of the data set the interpolation
parameter to basis
:
means_track$set(settings=list(step=1, interpolation="basis"))
Notice that we added two lines in this plot, one for mean methylation in cancer and another for mean methylation in normal. The columns
argument specifies which columns in mcols(colon_curves)
will be displayed.
We can also add a track containing the smooth methylation difference estimate used to define methylation blocks.
diff_chart <- app$plot(tcga_colon_curves, datasource_name="450kMethDiff",type="bp",columns=c("smooth"),ylim=matrix(c(-.5,.5),nc=1))
We pass limits for the y axis in this case. To see other arguments supported, you can use the help framework in R ?"EpivizApp"
. As before, we can specify settings for this new track.
We can use the app connection object to list charts we have added so far, or to remove charts.
app$chart_mgr$list_charts() app$chart_mgr$rm_chart(means_track) app$chart_mgr$list_charts()
Now we want to visualize the colon expression data in apColonData
object as an MA plot in Epiviz
. First, we add an "MA"
assay to the ExpressionSet
:
keep <- pData(apColonData)$SubType!="adenoma" apColonData <- apColonData[,keep] status <- pData(apColonData)$Status Indexes <- split(seq(along=status),status) exprMat <- exprs(apColonData) mns <- sapply(Indexes, function(ind) rowMeans(exprMat[,ind])) mat <- cbind(colonM=mns[,"1"]-mns[,"0"], colonA=0.5*(mns[,"1"]+mns[,"0"])) pd <- data.frame(stat=c("M","A")) rownames(pd) <- colnames(mat) maEset <- ExpressionSet( assayData=mat, phenoData=AnnotatedDataFrame(pd), featureData=featureData(apColonData), annotation=annotation(apColonData) ) show(maEset)
epivizr
will use the annotation(maEset)
annotation to determine genomic locations using the AnnotationDbi
package so that only probesets inside the current browser window are displayed.
eset_chart <- app$plot(maEset, datasource_name="MAPlot", columns=c("colonA","colonM"))
In this case, we specify which data is displayed in each axis of the scatter plot using the columns
argument. The assay
arguments indicates where data is obtained.
Epiviz
is also able to display plots of data in the form of a RangedSummarizedExperiment
object. After loading the tcga_colon_expression
dataset in the epivizrData
package, we can see that this object contains information on 239322 exons in 40 samples.
data(tcga_colon_expression) show(tcga_colon_expression)
The assay
slot holds a matrix of raw sequencing counts, so before we can plot a scatterplot showing expression, we must first normalize the count data. We use the geometric mean of each row as a reference sample to divide each column (sample) by, then use the median of each column as a scaling factor to divide each row (exon) by.
ref_sample <- 2 ^ rowMeans(log2(assay(tcga_colon_expression) + 1)) scaled <- (assay(tcga_colon_expression) + 1) / ref_sample scaleFactor <- Biobase::rowMedians(t(scaled)) assay_normalized <- sweep(assay(tcga_colon_expression), 2, scaleFactor, "/") assay(tcga_colon_expression) <- assay_normalized
Now, using the expression data in the assay
slot and the sample data in the colData
slot, we can compute mean exon expression by sample type.
status <- colData(tcga_colon_expression)$sample_type index <- split(seq(along = status), status) logCounts <- log2(assay(tcga_colon_expression) + 1) means <- sapply(index, function(ind) rowMeans(logCounts[, ind])) mat <- cbind(cancer = means[, "Primary Tumor"], normal = means[, "Solid Tissue Normal"])
Now, create a new RangedSummarizedExperiment
object with the two column matrix, and all the information about the features of interest, in this case exons, are stored in the rowRanges
slot to be queried by Epiviz
.
sumexp <- SummarizedExperiment(mat, rowRanges=rowRanges(tcga_colon_expression)) se_chart <- app$plot(sumexp, datasource_name="Mean by Sample Type", columns=c("normal", "cancer"))
Again, the columns
argument specifies what data will be displayed along which axis.
Epiviz web server (http://epiviz.cbcb.umd.edu) currently hosts data sets from several sources including Gene Expression Barcode project. We provide a way to load these remotely hosted datasets and integrate/analyze with your current workflow and local data. For this, Lets first get the list of measurements available from the webserver
app$load_remote_measurements() remote_measurements <- app$data_mgr$get_remote_measurements()
remote_measurements
is a list of EpivizMeasurement
objects. We can query this list to choose the datasets we would like to load from the webserver. For the purpose of this vignette, lets find measurements to visualize data from gene expression barcode project for tumor and normal samples from lung, colon and breast tissues.
measurementList <- lapply(remote_measurements, function(m) { if(m@name %in% c("colon normal", "lung normal", "breast normal", "colon tumor", "lung tumor", "breast tumor") && m@datasourceId == "gene_expression_barcode_subtype") { m } }) measurements <- Filter(Negate(is.null), measurementList)
Now lets add a heatmap using the measurements we just selected. For this we use the visualize
function from the EpivizChartMgr
class.
app$chart_mgr$visualize("HeatmapPlot", measurements = measurements)
Similarly other chart types registered with the current epivizr session can be used with the datasets from the webserver.
We can navigate to a location on the genome using the navigate
method of the app object:
app$navigate("chr11", 110000000, 120000000)
There is a convenience function to quickly navigate to a series of locations in succession. We can use that to browse the genome along a ranked list of regions. Let's navigate to the 5 most up-regulated exons in the colon exon expression data.
foldChange <- mat[,"cancer"]-mat[,"normal"] ind <- order(foldChange,decreasing=TRUE) # bounding 1Mb around each exon slideshowRegions <- trim(rowRanges(sumexp)[ind] + 1000000L) app$slideshow(slideshowRegions, n=5)
To print the current epiviz workspace as a pdf or png, we use the print_workspace
method:
app$print_workspace(file_name="workspace", file_type="pdf")
This will create a file named workspace.pdf
and will be downloaded through your web browser. It will be saved to the default file download location for your web browser.
To save the current state of the epiviz app, and UI workspace, into an rda
file, we use the save
method:
app$save(file="app.rda", include_data=TRUE)
This will create a file named app.rda
that can be restarted for later use and analysis. For a smaller file size, you may choose whether to include or exclude the data when saving the workspace. In this case, expressions used to add data
to the app, through add_measurements
or plot
as shown above, will be stored
and re-evaluated when restarting the app.
After a workspace is saved using the save
method shown above, we can replot it using the following:
app <- restartEpiviz(file="app.rda", open_browser=TRUE)
This will recreate the workspace in your web browser and reconnect it with the R session. If the workspace you are restarting did not include its data when saving the file, you will need to load the data in your global environment before restarting epiviz. This will allow to reload the data used in the app.
To close the connection to Epiviz
and remove all tracks added during the interactive session, we use the stop_app
function.
app$stop_app()
The epivizrStandalone
all files required to run the web app UI locally. This feature
can be used to browse any genome of interest. See that packages vignette for more information.
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.