suppressPackageStartupMessages({ library(ToPASeq) library(EnrichmentBrowser) library(graphite) library(BiocStyle) })
Note: the r Biocpkg("ToPASeq")
package currently undergoes a major rework due to
the change of the package maintainer. It is recommended to use the topology-based methods
implemented in the r Biocpkg("EnrichmentBrowser")
or the r Biocpkg("graphite")
package instead.
We start by loading the package.
library(ToPASeq)
For RNA-seq data, we consider transcriptome profiles of four primary human airway smooth muscle cell lines in two conditions: control and treatment with dexamethasone Himes et al., 2014.
We load the r Biocpkg("airway")
dataset
library(airway) data(airway)
For further analysis, we only keep genes that are annotated to an ENSEMBL gene ID.
airSE <- airway[grep("^ENSG", rownames(airway)),] dim(airSE) assay(airSE)[1:4,1:4]
The r Biocpkg("EnrichmentBrowser")
package incorporates established
functionality from the r Biocpkg("limma")
package for differential expression
analysis.
This involves the voom
transformation when applied to RNA-seq data.
Alternatively, differential expression analysis for RNA-seq data can also be
carried out based on the negative binomial distribution with r Biocpkg("edgeR")
and r Biocpkg("DESeq2")
.
This can be performed using the function EnrichmentBrowser::deAna
and assumes some standardized variable names:
For more information on experimental design, see the limma user's guide, chapter 9.
For the airway dataset, the GROUP variable indicates whether the cell lines have been treated with dexamethasone (1) or not (0).
airSE$GROUP <- ifelse(airway$dex == "trt", 1, 0) table(airSE$GROUP)
Paired samples, or in general sample batches/blocks, can be defined via a
BLOCK column in the colData
slot.
For the airway dataset, the sample blocks correspond to the four different cell
lines.
airSE$BLOCK <- airway$cell table(airSE$BLOCK)
For RNA-seq data, the deAna
function can be used to carry out differential
expression analysis between the two groups either based on functionality from
limma (that includes the voom
transformation), or
alternatively, the frequently used edgeR or DESeq2
package. Here, we use the analysis based on edgeR.
library(EnrichmentBrowser) airSE <- deAna(airSE, de.method="edgeR") rowData(airSE, use.names=TRUE)
Pathways are typically represented as graphs, where the nodes are genes and edges between the nodes represent interaction between genes.
The r Biocpkg("graphite")
package provides pathway collections from major
pathway databases such as KEGG, Biocarta, Reactome, and NCI.
Here, we retrieve human KEGG pathways.
library(graphite) pwys <- pathways(species="hsapiens", database="kegg") pwys
As the airway dataset uses ENSEMBL gene IDs, but the nodes of the pathways are based on NCBI Entrez Gene IDs,
nodes(pwys[[1]])
we first map the gene IDs in the airway dataset from ENSEMBL to ENTREZ IDs.
airSE <- idMap(airSE, org="hsa", from="ENSEMBL", to="ENTREZID")
Next, we define all genes with adjusted p-value below 0.01 as differentially expressed, and collect their log2 fold change for further analysis.
all <- names(airSE) de.ind <- rowData(airSE)$ADJ.PVAL < 0.01 de <- rowData(airSE)$FC[de.ind] names(de) <- all[de.ind]
This results in 2,426 DE genes - out of 11,780 genes in total.
length(all) length(de)
The Pathway Regulation Score (PRS) incorporates the pathway topology by weighting the indiviudal gene-level log2 fold changes by the number of downstream DE genes. The weighted absolute fold changes are summed across the pathway and statistical significance is assessed by permutation of genes. Ibrahim et al., 2012
res <- prs(de, all, pwys[1:100], nperm=100) head(res)
Corresponding gene weights (number of downstream DE genes) can be obtained for a pathway of choice via
ind <- grep("ErbB signaling pathway", names(pwys)) weights <- prsWeights(pwys[[ind]], de, all) weights
Inspecting the genes with maximum number of downstream DE genes
weights[weights == max(weights)]
reveals important upstream regulators including several G protein subunits such as subunit beta 2 (Entrez Gene ID 2783).
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.