library(BiocStyle)

Getting started

The r Rpackage("sechm") package is a wrapper around the ComplexHeatmap package to facilitate the creation of annotated heatmaps from objects of the Bioconductor class r Biocpkg("SummarizedExperiment") (and extensions thereof).

Package installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("sechm")

Example data

To showcase the main functions, we will use an example object which contains (a subset of) RNAseq of mouse hippocampi after Forskolin-induced long-term potentiation:

suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(sechm)
})
data("Chen2017", package="sechm")
SE <- Chen2017

This is taken from Chen et al., 2017.

Example usage

Basic functionalities

The sechm function simplifies the generation of heatmaps from SummarizedExperiment. It minimally requires, as input, a SummarizedExperiment object and a set of genes (or features, i.e. rows of sechm) to plot:

g <- c("Egr1", "Nr4a1", "Fos", "Egr2", "Sgk1", "Arc", "Dusp1", "Fosb", "Sik1")
sechm(SE, features=g)
# with row scaling:
sechm(SE, features=g, do.scale=TRUE)

The assay can be selected, and any rowData or colData columns can be specified as annotation:

rowData(SE)$meanLogCPM <- rowMeans(assays(SE)$logcpm)
sechm(SE, features=g, assayName="logFC", top_annotation=c("Condition","Time"), left_annotation=c("meanLogCPM"))

Column names are ommitted by default, but can be displayed:

sechm(SE, features=g, do.scale=TRUE, show_colnames=TRUE)

Since sechm uses the ComplexHeatmap engine for plotting, any argument of ComplexHeatmap::Heatmap can be passed:

sechm(SE, features=g, do.scale=TRUE, row_title="My genes")

When plotting a lot of rows, by default row names are not shown (can be overriden), but specific genes can be highlighted with the mark argument:

sechm(SE, features=row.names(SE), mark=g, do.scale=TRUE, top_annotation=c("Condition","Time"))

We can also add gaps using the same columns:

sechm(SE, features=g, do.scale=TRUE, top_annotation="Time", gaps_at="Condition")

Row ordering

By default, rows are sorted using the MDS angle method (can be altered with the sort.method argument); this can be disabled with:

# reverts to clustering:
sechm(SE, features=row.names(SE), do.scale=TRUE, sortRowsOn=NULL)
# no reordering:
sechm(SE, features=row.names(SE), do.scale=TRUE, sortRowsOn=NULL, 
      cluster_rows=FALSE)

It is also possible to combine sorting with clusters using the toporder argument, or using gaps:.

# we first cluster rows, and save the clusters in the rowData:
rowData(SE)$cluster <- as.character(kmeans(t(scale(t(assay(SE)))),5)$cluster)
sechm(SE, features=1:30, do.scale=TRUE, toporder="cluster", 
      left_annotation="cluster", show_rownames=FALSE)
sechm(SE, features=1:30, do.scale=TRUE, gaps_row="cluster",
      show_rownames=FALSE)

Color scale

sechm tries to guess whether the data plotted are centered around zero, and adjusts the scale accordingly (this can be disable with breaks=FALSE). It also performs a quantile capping to avoid extreme values taking most of the color scale, which is especially relevant when plotting for instance fold-changes. This can be controlled with the breaks argument. Consider the three following examples:

library(ComplexHeatmap)
g2 <- c(g,"Gm14288",tail(row.names(SE)))
draw(
    sechm(SE, features=g2, assayName="logFC", breaks=1, column_title="breaks=1") + 
    sechm(SE, features=g2, assayName="logFC", breaks=0.995, 
          column_title="breaks=0.995", name="logFC(2)") + 
    sechm(SE, features=g2, assayName="logFC", breaks=0.985, 
          column_title="breaks=0.985", name="logFC(3)"),
    merge_legends=TRUE)

With breaks=1, the scale is made symmetric, but not quantile capping is performed. In this way, most of the colorscale is taken by the difference between one datapoint (first gene) and the rest, making it difficult to distinguish patterns in the genes at the bottom. Instead, with breaks=0.985, the color scale is linear up until the 0.985 quantile of the data, and ordinal after this. This reduces our capacity to distinguish variations between the extreme values, but enables us to visualize the others better.

Manual breaks can also be defined. The colors themselves can be passed as follows:

# not run
sechm(SE, features=g2, hmcols=viridisLite::cividis(10))

Annotation colors

Annotation colors can be passed with the anno_colors argument, but the simplest is to store them in the object's metadata:

metadata(SE)$anno_colors
metadata(SE)$anno_colors$Condition <- c(Control="white", Forskolin="black")
sechm(SE, features=g2, top_annotation="Condition")

Heatmap colors can be passed on in the same way:

metadata(SE)$hmcols <- c("darkred","white","darkblue")
sechm(SE, g, do.scale = TRUE)

The default assay to be displayed and the default annotation fields to show can be specified in the default_view metadata element, as follows:

metadata(SE)$default_view <- list(
  assay="logFC",
  top_annotation="Condition"
)

Finally, it is also possible to set colors as package-wide options:

setSechmOption("hmcols", value=c("white","grey","black"))
sechm(SE, g, do.scale = TRUE)

At the moment, the following arguments can be set as global options: assayName, hmcols, left_annotation, right_annotation, top_annotation, bottom_annotation, anno_colors, gaps_at, breaks.

To remove the predefined colors:

resetAllSechmOptions()
metadata(SE)$hmcols <- NULL
metadata(SE)$anno_colors <- NULL

In order of priority, the arguments in the function call trump the object's metadata, which trumps the global options.

crossHm

Because sechm produces a Heatmap object from ComplexHeatmap, it is possible to combine them:

sechm(SE, features=g) + sechm(SE, features=g)

However, doing so involves manual work to ensure that the labels and colors are nice and coherent, and that the rows names match. As a convenience, we provide the crossHm function to handle these issues. crossHm works with a list of SummarizedExperiment objects:

# we build another SE object and introduce some variation in it:
SE2 <- SE
assays(SE2)$logcpm <- jitter(assays(SE2)$logcpm, factor=1000)
crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE, 
        top_annotation=c("Condition","Time"))

Scaling is applied to the datasets separately. A unique color scale can be enforced:

crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE, 
        top_annotation=c("Condition","Time"), uniqueScale = TRUE)

Other convenience functions

The package also includes a number of other convenience functions which we briefly describe here (see the functions' help for more information):



Session info {.unnumbered}

sessionInfo()


plger/sechm documentation built on Oct. 25, 2024, 8:22 p.m.