```{css, echo=FALSE, eval=TRUE}
pre code { white-space: pre !important; overflow-x: scroll !important; word-break: keep-all !important; word-wrap: initial !important; }
```

```r
options(width=60, max.print=1000)
knitr::opts_chunk$set(
    eval=as.logical(Sys.getenv("KNITR_EVAL", "TRUE")),
    cache=as.logical(Sys.getenv("KNITR_CACHE", "TRUE")),
    tidy.opts=list(width.cutoff=60),
    tidy=TRUE,
    eval = FALSE)
# shiny::tagList(rmarkdown::html_dependency_font_awesome())
suppressPackageStartupMessages({
    library(systemPipeShiny)
    library(spsBio)
})
cat("<style>")
cat(readLines(system.file("app/www/css/sps.css", package = "systemPipeShiny")),sep = "\n")
cat("</style>")
knitr::include_graphics(path = "../inst/app/www/img/sps.png")
```

Introduction

spsBio is a systemPipeShiny (SPS) plugin that provides additional plotting functions and SPS tabs to visualize biological data.

Quick start

Install

To install SPS and the spsBio plugin:

```r
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("systemPipeR/systemPipeShiny", build_vignettes=TRUE, dependencies=TRUE)
BiocManager::install("systemPipeR/spsBio", build_vignettes=TRUE, dependencies=TRUE)
```

Add the plugin

Before starting with SPS, you need to create an SPS project:

```r
sps_tmp_dir <- tempdir()
systemPipeShiny::spsInit(dir_path = sps_tmp_dir, change_wd = FALSE, project_name = "SPSProject")
sps_dir <- file.path(sps_tmp_dir, "SPSProject")
```

Here, to build the vignette, we use a temp directory and do not switch to the app directory. In a real case, you shouldn't store your SPS project in a temp directory. Just use the following instead:

```r
systemPipeShiny::spsInit()
```

Once the project is created, change your working directory to the project if you selected change_wd = FALSE in the previous step. Then run:

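```r
# From inside the project directory:
systemPipeShiny::spsLoadPlugin("spsBio")
# From elsewhere (as in this vignette), point to the project explicitly:
systemPipeShiny::spsLoadPlugin("spsBio", app_path = sps_dir)
```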

That should be all you need. In your global.R, specify that you want to run this plugin when the app starts:

```r
sps_app <- sps(
    vstabs = "",
    plugin = "spsBio",
    server_expr = {
        msg("Custom expression runs -- Hello World", "GREETING", "green")
    }
)
```

Visualization

SPS offers an interactive data visualization framework, and spsBio extends it with additional visualization tabs and functions focused on biological data. Users can upload different input data types and apply various preprocessing options to those datasets, then create downstream analysis plots appropriate to the type of uploaded data. Available plotting options include bar plots of differentially expressed genes, heat maps, dendrograms, principal component analysis (PCA) plots, and multidimensional scaling (MDS) plots. Depending on the plot, there are also options to adjust it, such as normalizing the data. Additionally, spsBio provides plot templates and plotting functions that users can customize to their visualization needs.

Table with all exported functions

| Function Name  | Description                                            |
|----------------|--------------------------------------------------------|
| exploreDDS     | Transform raw read counts using the `DESeq2` package   |
| exploreDDSplot | Scatterplot of transformed read counts                 |
| PCAplot        | Plots PCA from a count matrix                          |
| MDSplot        | Plots MDS from a count matrix                          |
| tSNEplot       | Plots t-Distributed Stochastic Neighbor Embedding      |
| GLMplot        | Plots dimension reduction with GLM-PCA                 |
| heatMaplot     | Plots hierarchical clustering heatmap                  |
| MAplot         | MA plot from base means and log fold changes           |
| volcanoplot    | Plots a volcano plot from DEG analysis results         |
| hclustplot     | Plots hierarchical clustering dendrogram               |

Data transformations and visualization

To show the effect of the transformation, in the figure below we plot the first sample against the second, first using the log2 function and then using the VST- and rlog-transformed values. For the log2 approach, we first need to estimate size factors to account for sequencing depth, and then specify normalized=TRUE when extracting the counts. Sequencing depth correction is done automatically for vst and rlog.

```r
## Targets file
targetspath <- system.file("extdata", "targets.txt", package="systemPipeR")
targets <- read.delim(targetspath, comment="#")
cmp <- systemPipeR::readComp(file=targetspath, format="matrix", delim="-")
## Count table file
countMatrixPath <- system.file("extdata", "countDFeByg.xls", package="systemPipeR")
countMatrix <- read.delim(countMatrixPath, row.names=1)
## Plot
exploreDDSplot(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, samples=c(3,4))
exploreDDSplot(countMatrix, targets, cmp=cmp[[1]], samples=c("M1A", "M1B"), save = TRUE,
               filePlot = "transf_deseq2.pdf")
## Plot Correlogram
exploreDDSplot(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, samples=c("M1A", "M1B"), scattermatrix=TRUE)
```
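
To make the log2 approach more concrete, here is a minimal base R sketch of median-of-ratios size factors followed by a log2 transformation. It uses a small made-up count matrix and a pseudocount of 1 (both assumptions for illustration), and it is not the code that spsBio or DESeq2 runs internally; in particular it does not handle genes with zero counts.

```r
# Toy count matrix (hypothetical data): 5 genes x 3 samples.
# S2 and S3 are S1 sequenced 2x and 4x as deeply.
counts <- matrix(
    c( 10,  20,  40,
      100, 200, 400,
       50, 100, 200,
        5,  10,  20,
       30,  60, 120),
    nrow = 5, byrow = TRUE,
    dimnames = list(paste0("gene", 1:5), c("S1", "S2", "S3"))
)

# Per-gene log geometric mean across samples (the reference sample)
log_geo_means <- rowMeans(log(counts))

# Size factor per sample: median ratio of its counts to the reference
size_factors <- apply(counts, 2, function(sample_counts) {
    exp(median(log(sample_counts) - log_geo_means))
})

# Divide each column by its size factor, then log2-transform
# with a pseudocount of 1 (assumed here) to avoid log2(0)
norm_counts <- sweep(counts, 2, size_factors, "/")
log2_counts <- log2(norm_counts + 1)
```

After normalization the three toy samples have identical counts, which is the depth correction the log2 approach relies on.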

Dendrogram

A dendrogram of hierarchical clustering results, computed with the hclust function, can be created with the hclustplot function. Sample-wise Spearman correlation coefficients are computed and then transformed into a distance matrix before the hierarchical clustering is performed. The count data frame can be transformed with the rlog or variance-stabilizing transformation (vst) methods from the DESeq2 package, or left untransformed.

```r
## Data transformation
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="rlog")
## Plot
hclustplot(exploredds, method = "spearman")
hclustplot(exploredds, method = "spearman", savePlot = TRUE, filePlot = "cor.pdf")
```
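
The correlation-to-distance-to-clustering steps can be sketched in base R as follows. The expression matrix is random toy data, and converting correlations to distances as `1 - rho` is an assumption made for this sketch; hclustplot may use a different conversion.

```r
set.seed(1)
# Hypothetical transformed expression matrix: 50 genes x 4 samples
expr <- matrix(
    rnorm(200), nrow = 50,
    dimnames = list(paste0("gene", 1:50), c("M1A", "M1B", "A1A", "A1B"))
)

# Sample-wise Spearman correlation (samples are columns)
rho <- cor(expr, method = "spearman")

# Assumption: turn correlation into a distance as 1 - rho
sample_dist <- as.dist(1 - rho)

# Hierarchical clustering with hclust (default: complete linkage)
hc <- hclust(sample_dist)
```

Calling `plot(hc)` on the result draws the sample dendrogram.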

Heatmap

A heatmap of hierarchical clustering results, computed with the hclust function, can be created with the heatMaplot function. Sample-wise Spearman correlation coefficients are computed before hierarchical clustering. The count data frame can be transformed with the rlog or variance-stabilizing transformation (vst) methods from the DESeq2 package, or left untransformed.

Samples

```r
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="rlog")
heatMaplot(exploredds, clust="samples")
heatMaplot(exploredds, clust="samples", plotly = TRUE)
```

Individual genes identified in DEG analysis

```r
### DEG analysis with `systemPipeR`
degseqDF <- systemPipeR::run_DESeq2(countDF = countMatrix, targets = targets, cmp = cmp[[1]], independent = FALSE)
DEG_list <- systemPipeR::filterDEGs(degDF = degseqDF, filter = c(Fold = 2, FDR = 10))
### Plot
heatMaplot(exploredds, clust="ind", DEGlist = unique(as.character(unlist(DEG_list[[1]]))))
heatMaplot(exploredds, clust="ind", DEGlist = unique(as.character(unlist(DEG_list[[1]]))), plotly = TRUE)
```

PCA plot

A Principal Component Analysis (PCA) plot can be created using the PCAplot function, which uses the DESeq2 package. The input data frame can be transformed with the rlog or variance-stabilizing transformation (vst) methods from the DESeq2 package, or left untransformed.

```r
## Data transformation
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="rlog")
## Plot
PCAplot(exploredds, plotly = FALSE)
PCAplot(exploredds, plotly = TRUE)
```
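
For readers who want to see what a sample-level PCA computes, the following base R sketch runs `prcomp` on a made-up log-scale expression matrix. The data, group structure, and variable names are hypothetical, and this is not how PCAplot is implemented.

```r
set.seed(1)
# Hypothetical log-scale expression matrix: 100 genes x 6 samples
expr <- matrix(
    rnorm(600, mean = 8), nrow = 100,
    dimnames = list(paste0("gene", 1:100), paste0("S", 1:6))
)
# Shift 30 genes upward in samples S4-S6 to mimic a treatment effect
expr[1:30, 4:6] <- expr[1:30, 4:6] + 4

# prcomp expects observations (samples) in rows, so transpose
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Percentage of variance explained by each component
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Scores for the first two components, one row per sample
scores <- data.frame(
    sample = colnames(expr),
    PC1 = pca$x[, 1],
    PC2 = pca$x[, 2]
)
```

Plotting `PC1` against `PC2` from `scores` gives the usual PCA scatterplot; in this toy setup the two sample groups separate along the first component.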

In addition, generalized principal component analysis (GLM-PCA) for dimension reduction of non-normally distributed data can be plotted with the GLMplot function [@Townes2019]. This option does not offer transformation or normalization of raw data.

```r
## Data transformation
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="raw")
## Plot
GLMplot(exploredds, plotly = FALSE)
GLMplot(exploredds, plotly = FALSE, savePlot = TRUE, filePlot = "GLM.pdf")
```

MDS plot

A Multidimensional Scaling (MDS) plot can be created using the MDSplot function. The input data frame can be transformed with either the rlog or variance-stabilizing transformation (vst) methods from the DESeq2 package. From the input data, the function computes a Spearman correlation-based distance matrix and performs MDS analysis on it.

```r
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="rlog")
MDSplot(exploredds, plotly = FALSE)
```
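
As a rough illustration of MDS on a correlation-based distance, the sketch below applies classical MDS (`cmdscale`) to toy data in base R. Using `1 - rho` as the distance and classical rather than another MDS variant are assumptions for this example, not a description of MDSplot's internals.

```r
set.seed(1)
# Hypothetical transformed expression matrix: 100 genes x 4 samples
expr <- matrix(
    rnorm(400), nrow = 100,
    dimnames = list(NULL, c("M1A", "M1B", "A1A", "A1B"))
)

# Spearman correlation between samples, converted to a distance
# (assumption: distance = 1 - correlation)
rho <- cor(expr, method = "spearman")
sample_dist <- as.dist(1 - rho)

# Classical MDS into two dimensions; one row of coordinates per sample
mds <- cmdscale(sample_dist, k = 2)
colnames(mds) <- c("Dim1", "Dim2")
```

Each row of `mds` holds the 2D coordinates of one sample, ready to be drawn as a scatterplot.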

t-SNE plot

A Barnes-Hut t-Distributed Stochastic Neighbor Embedding (t-SNE) plot can be created using the tSNEplot function, which uses the Rtsne package [@Krijthe2015] to compute t-SNE values. The function removes duplicates in the input data frame, sets a seed for reproducibility, and performs an initial PCA step. It also accepts a user-set perplexity value for the computation.

```r
targetspath <- system.file("extdata", "targets.txt", package="systemPipeR")
targets <- read.delim(targetspath, comment="#")
cmp <- systemPipeR::readComp(file=targetspath, format="matrix", delim="-")
countMatrixPath <- system.file("extdata", "countDFeByg.xls", package="systemPipeR")
countMatrix <- read.delim(countMatrixPath, row.names=1)
set.seed(42) ## Set a seed if you want reproducible results
tSNEplot(countMatrix, targets, perplexity = 5)
```

MA-Plot

An MA plot is an application of a Bland–Altman plot for visual representation of genomic data. The plot visualizes the differences between measurements taken in two samples, by transforming the data onto M (log ratio) and A (mean average) scales, then plotting these values.

```r
exploredds <- exploreDDS(countMatrix, targets, cmp=cmp[[1]], preFilter=NULL, transformationMethod="raw")
MAplot(exploredds, plotly = FALSE)
MAplot(exploredds, plotly = TRUE)
```
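
The M and A scales can be worked through by hand on a tiny example. The sketch below uses the classic two-sample definition with made-up counts; MAplot itself works from DESeq2 results, so treat this only as an illustration of the transformation.

```r
# Hypothetical counts for four genes in two samples
counts <- data.frame(
    sample1 = c(8, 16, 32, 64),
    sample2 = c(8, 4, 32, 256),
    row.names = paste0("gene", 1:4)
)

log_s1 <- log2(counts$sample1)
log_s2 <- log2(counts$sample2)

ma <- data.frame(
    M = log_s1 - log_s2,        # log ratio between the two samples
    A = (log_s1 + log_s2) / 2,  # mean of the log intensities
    row.names = rownames(counts)
)
```

Genes with `M` near 0 are unchanged between the samples, while `A` places each gene along the overall expression axis.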

Volcano plot

A volcano plot of a DEG data frame can be created with the volcanoplot function. Using the data frame returned by run_edgeR or run_DESeq2, the function draws a volcano plot using False Discovery Rate and log fold change thresholds for the sample comparison specified by the user.

```r
### DEG analysis with `systemPipeR`
degseqDF <- systemPipeR::run_DESeq2(countDF = countMatrix, targets = targets, cmp = cmp[[1]], independent = FALSE)
DEG_list <- systemPipeR::filterDEGs(degDF = degseqDF, filter = c(Fold = 2, FDR = 10))
## Plot
volcanoplot(degseqDF, comparison = "M12-A12", filter = c(Fold = 2, FDR = 10))
volcanoplot(degseqDF, comparison = "M12-A12", filter = c(Fold = 1, FDR = 20), genes = "ATCG00280")
```
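
To show how a filter such as `c(Fold = 2, FDR = 10)` could partition genes in a volcano plot, here is a small base R sketch on a made-up results table. It assumes that `Fold` is a fold change on the linear scale and that `FDR` is given in percent; the column names `log2FC` and `FDR` are hypothetical and do not match the columns returned by run_DESeq2.

```r
# Hypothetical DEG results for five genes
deg <- data.frame(
    log2FC = c(2.5, -1.8, 0.3, 1.2, -3.0),
    FDR    = c(0.001, 0.04, 0.5, 0.2, 0.09),
    row.names = paste0("gene", 1:5)
)

fold_cutoff <- 2   # linear fold change, i.e. |log2FC| >= log2(2) = 1
fdr_cutoff  <- 10  # assumed to be a percentage, i.e. FDR <= 0.10

# A gene is called only if it passes the FDR threshold
significant <- deg$FDR <= fdr_cutoff / 100

deg$status <- "NS"
deg$status[significant & deg$log2FC >=  log2(fold_cutoff)] <- "Up"
deg$status[significant & deg$log2FC <= -log2(fold_cutoff)] <- "Down"

# y-axis of a volcano plot: larger values mean stronger evidence
deg$neg_log10_FDR <- -log10(deg$FDR)
```

Plotting `log2FC` against `neg_log10_FDR` and coloring points by `status` reproduces the basic layout of a volcano plot.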

Barplot

A barplot for analysis of differentially expressed genes (DEGs) can be plotted using the functions deg_edgeR or deg_deseq2. The function deg_edgeR uses the edgeR package [@Robinson2010-uk] to create an edgeR data frame. Alternatively, the function deg_deseq2 uses the DESeq2 package [@Love2014-sh] to create a DESeq2 data frame. Using the filterDEGs function, it filters the DEG results and plots the up- and down-regulated genes in a barplot.

Version Information

```r
sessionInfo()
```

Funding

References



systemPipeR/spsBio documentation built on Oct. 2, 2020, 9:30 a.m.