library(BiocStyle)
The r Rpackage("sechm")
package is a wrapper around the
ComplexHeatmap
package to facilitate the creation of annotated heatmaps from objects of the
Bioconductor class r Biocpkg("SummarizedExperiment")
(and extensions thereof).
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("sechm")
To showcase the main functions, we will use an example object which contains (a subset of) RNAseq of mouse hippocampi after Forskolin-induced long-term potentiation:
suppressPackageStartupMessages({ library(SummarizedExperiment) library(sechm) }) data("Chen2017", package="sechm") SE <- Chen2017
This is taken from Chen et al., 2017.
The sechm
function simplifies the generation of heatmaps from
SummarizedExperiment
. It minimally requires, as input, a
SummarizedExperiment
object and a set of genes (or features, i.e. rows of
sechm
) to plot:
g <- c("Egr1", "Nr4a1", "Fos", "Egr2", "Sgk1", "Arc", "Dusp1", "Fosb", "Sik1") sechm(SE, features=g) # with row scaling: sechm(SE, features=g, do.scale=TRUE)
The assay can be selected, and any rowData
or colData
columns can be
specified as annotation:
rowData(SE)$meanLogCPM <- rowMeans(assays(SE)$logcpm) sechm(SE, features=g, assayName="logFC", top_annotation=c("Condition","Time"), left_annotation=c("meanLogCPM"))
Column names are ommitted by default, but can be displayed:
sechm(SE, features=g, do.scale=TRUE, show_colnames=TRUE)
Since sechm
uses the
ComplexHeatmap
engine for plotting, any argument of ComplexHeatmap::Heatmap
can be passed:
sechm(SE, features=g, do.scale=TRUE, row_title="My genes")
When plotting a lot of rows, by default row names are not shown (can be
overriden), but specific genes can be highlighted with the mark
argument:
sechm(SE, features=row.names(SE), mark=g, do.scale=TRUE, top_annotation=c("Condition","Time"))
We can also add gaps using the same columns:
sechm(SE, features=g, do.scale=TRUE, top_annotation="Time", gaps_at="Condition")
By default, rows are sorted using the MDS angle method (can be altered with
the sort.method
argument); this can be disabled with:
# reverts to clustering: sechm(SE, features=row.names(SE), do.scale=TRUE, sortRowsOn=NULL) # no reordering: sechm(SE, features=row.names(SE), do.scale=TRUE, sortRowsOn=NULL, cluster_rows=FALSE)
It is also possible to combine sorting with clusters using the toporder
argument, or using gaps:.
# we first cluster rows, and save the clusters in the rowData: rowData(SE)$cluster <- as.character(kmeans(t(scale(t(assay(SE)))),5)$cluster) sechm(SE, features=1:30, do.scale=TRUE, toporder="cluster", left_annotation="cluster", show_rownames=FALSE) sechm(SE, features=1:30, do.scale=TRUE, gaps_row="cluster", show_rownames=FALSE)
sechm
tries to guess whether the data plotted are centered around zero, and
adjusts the scale accordingly (this can be disable with breaks=FALSE
). It
also performs a quantile capping to avoid extreme values taking most of the
color scale, which is especially relevant when plotting for instance
fold-changes. This can be controlled with the breaks
argument. Consider the
three following examples:
library(ComplexHeatmap) g2 <- c(g,"Gm14288",tail(row.names(SE))) draw( sechm(SE, features=g2, assayName="logFC", breaks=1, column_title="breaks=1") + sechm(SE, features=g2, assayName="logFC", breaks=0.995, column_title="breaks=0.995", name="logFC(2)") + sechm(SE, features=g2, assayName="logFC", breaks=0.985, column_title="breaks=0.985", name="logFC(3)"), merge_legends=TRUE)
With breaks=1
, the scale is made symmetric, but not quantile capping is
performed. In this way, most of the colorscale is taken by the difference
between one datapoint (first gene) and the rest, making it difficult to
distinguish patterns in the genes at the bottom. Instead, with breaks=0.985
,
the color scale is linear up until the 0.985 quantile of the data, and ordinal
after this. This reduces our capacity to distinguish variations between the
extreme values, but enables us to visualize the others better.
Manual breaks can also be defined. The colors themselves can be passed as follows:
# not run sechm(SE, features=g2, hmcols=viridisLite::cividis(10))
Annotation colors can be passed with the anno_colors
argument, but the
simplest is to store them in the object's metadata:
metadata(SE)$anno_colors metadata(SE)$anno_colors$Condition <- c(Control="white", Forskolin="black") sechm(SE, features=g2, top_annotation="Condition")
Heatmap colors can be passed on in the same way:
metadata(SE)$hmcols <- c("darkred","white","darkblue") sechm(SE, g, do.scale = TRUE)
The default assay to be displayed and the default annotation fields to show can
be specified in the default_view
metadata element, as follows:
metadata(SE)$default_view <- list( assay="logFC", top_annotation="Condition" )
Finally, it is also possible to set colors as package-wide options:
setSechmOption("hmcols", value=c("white","grey","black")) sechm(SE, g, do.scale = TRUE)
At the moment, the following arguments can be set as global options:
assayName
, hmcols
, left_annotation
, right_annotation
, top_annotation
,
bottom_annotation
, anno_colors
, gaps_at
, breaks
.
To remove the predefined colors:
resetAllSechmOptions() metadata(SE)$hmcols <- NULL metadata(SE)$anno_colors <- NULL
In order of priority, the arguments in the function call trump the object's metadata, which trumps the global options.
Because sechm
produces a Heatmap
object from
ComplexHeatmap, it
is possible to combine them:
sechm(SE, features=g) + sechm(SE, features=g)
However, doing so involves manual work to ensure that the labels and colors are
nice and coherent, and that the rows names match. As a convenience, we provide
the crossHm
function to handle these issues. crossHm
works with a list of
SummarizedExperiment
objects:
# we build another SE object and introduce some variation in it: SE2 <- SE assays(SE2)$logcpm <- jitter(assays(SE2)$logcpm, factor=1000) crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE, top_annotation=c("Condition","Time"))
Scaling is applied to the datasets separately. A unique color scale can be enforced:
crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE, top_annotation=c("Condition","Time"), uniqueScale = TRUE)
The package also includes a number of other convenience functions which we briefly describe here (see the functions' help for more information):
log2FC()
adds two assays to an SE object, containing per-sample
log2-foldchanges, as well as scaledLFC (variance-scaled log2-foldchanges, but
without centering, so that the controls stay around 0) relative to the mean
of the (specified) controls.getDEA()
and getDEGs()
functions can return a specific DEA or its set
of differentially-expressed features, provided that the DEA results tables are
each saved in a column of rowData (i.e. the whole table in one column), with a
column name starting with DEA.
.meltSE()
function can be used to extract a dataframe (suitable for
ggplot) containing colData, rowData, and assay data for a given subset of
features.sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.