For this tutorial we will simulate the input gene expression data, the cis and trans genes with known mediations, and we will run iEDGE to recover these mediations.
We start by generating the dataset. You can use construct_iEDGE
to
generate the iEDGE dataset from standard matrices (binary matrices for
alteration data or numeric matrices for expression data).
Please note that this simulation is optimized for the evaluation of the iEDGE methology, and its only purpose is to demonstrate how to construct an iEDGE object from scratch for use in standard analysis.
library(iEDGE) set.seed(1) # define number of samples n <- 50 # construct 'cn': binary alternations by samples matrix two # independent alterations across 50 samples cn1 <- sample(x= c(1,0), size = n, replace = T, prob = c(0.7, 0.3)) cn2 <- sample(x= c(1,0), size = n, replace = T, prob = c(0.5, 0.5)) cn <- rbind(cn1, cn2) colnames(cn) <- paste0("sample_", 1:ncol(cn)) # construct 'cis' gene expressions: genes correlated with one or more # alterations in 'cn' cis <- t(sapply(c(1,1,1,2,2,2), function(i) { 3 + cn[i, ] + rnorm(n, 0, 0.3) })) rownames(cis) <- make.unique(paste0("cis", c(1,1,1,2,2,2))) # construct 'trans-mediated' gene expression: gene correlated with # one or more alteration in 'cn' and mediated through a cis gene cis.m <- sample(rownames(cis), 10, replace = T) trans.m <- t(sapply(1:length(cis.m), function(i) 3 + cis[cis.m[i], ] + rnorm(n, 0, 0.3) )) rownames(trans.m) <- make.unique(paste0("t.m.", cis.m)) # construct 'trans-direct' gene expression: trans gene correlated # with one or more alteration in 'cn' but not mediated through cis # genes, this would be similar to construction of cis genes trans.d <- t(sapply(c(1,1,1,2,2,2), function(i) 3 + cn[i, ] + rnorm(n, 0, 0.6) )) rownames(trans.d) <- make.unique(paste0("t.d.", c(1,1,1,2,2,2))) # uncorrelated genes others <- t(sapply(1:30, function(i) 3+rnorm(n, 0, 0.3))) rownames(others) <- paste0("others_", 1:nrow(others)) # putting genes together in expression matrix gep <- rbind(cis, trans.m, trans.d, others) colnames(gep) <- paste0("sample_", 1:ncol(gep)) cisgenes <- list(cn1 = grep("cis1", rownames(cis), value = T), cn2 = grep("cis2", rownames(cis), value = T)) # the direction column in cn.fdat specifies whether to compute # one-sided or two-sided p-values in differential tests. In this # simulation we expect an increase in expression with the presence of # the alteration, hence a one-sided ("UP") p-value is computed. cn.fdat <- data.frame(cnid = rownames(cn), direction = "UP") rownames(cn.fdat) <- rownames(cn) gep.fdat <- data.frame(geneid = rownames(gep)) rownames(gep.fdat) <- rownames(gep) dat <- construct_iEDGE(cn, gep, cisgenes, transgenes = NA, cn.fdat, gep.fdat, cn.pdat = NA, gep.pdat = NA)
The iEDGE object is a list consisting of:
cn: Copy-Number of other binary alteration ExpressionSet:
exprs(dat\$cn): the binary matrix encoding presence or absence of an alteration (c alterations by n samples)
fData(dat\$cn): alteration annotation data frame (c alterations by g alteration annotation columns). Please note that if direction of differential expression test is dependent on the alteration, specify as such to be used in run_iEDGE, e.g. run_iEDGE(... cndir = "direction", uptest = "UP", downtest = "DOWN") corresponds to reporting one-sided pvalues for significantly upregulated in genes for alterations in which fData(dat\$cn)[, "direction"] == "UP", or significantly downregulated genes for alterations in which fData(dat\$cn)[, "direction"] == "DOWN"
pData(dat\$cn): sample annotation data frame (n samples by l sample annotation columns)
gep: The gene expression ExpressionSet:
exprs(dat\$gep): gene expression matrix, assumed to be already log2 transformed (m genes by n samples)
fData(dat\$gep): gene annotation data frame (m genes by j gene annotation columns)
pData(dat\$gep): sample annotation data frame (n samples by k sample annotation columns)
cisgenes: A list of character vectors specifying the genes that in the cis regions of each alteration, must be in the same order as the rows in exprs(dat$cn). For copy number alterations we defined the cis genes to be genes in the focal peaks of each amplification or deletion of interest.
transgenes (optional): A list of character vectors specifying the genes outside the cis regions (trans genes). If not specified, all genes in the expression matrix that are not the cis genes are considered. Trans genes specification is not recommended for exploratory analysis but may be useful for testing mediation and pathway enrichment for particular target gene sets in hypothesis-driven approaches.
dat
Now that the iEDGE object is created, we can run the iEDGE pipeline on this dataset.
This will create the results object with differential expression tables for cis and trans genes
# cis and trans differential expression analysis only # no mediation analysis # no html report res <- run_iEDGE(dat = dat, header = "testrun", outdir = ".", cnid = "cnid", cndesc = "cnid", cndir = "direction", uptest = "UP", downtest = "DOWN", gepid = "geneid", enrichment = FALSE, bipartite = FALSE, html = FALSE)
names(res) head(res$de$cis$full) head(res$de$trans$full)
This will create the results object with differential expression tables for cis and trans genes only, and an associated HTML report
# cis and trans differential expression analysis only # no mediation analysis # do html report res <- run_iEDGE(dat = dat, header = "testrun", outdir = ".", cnid = "cnid", cndesc = "cnid", cndir = "direction", uptest = "UP", downtest = "DOWN", gepid = "geneid", enrichment = FALSE, bipartite = FALSE, html = TRUE) # the result html page should be opened automatically, this displays the graphical view of results
This will create the results object with differential expression tables for cis and trans genes and mediation analysis, and an associated HTML report
# cis and trans differential expression analysis only # do mediation analysis # do html report res <- run_iEDGE(dat = dat, header = "testrun", outdir = ".", cnid = "cnid", cndesc = "cnid", cndir = "direction", uptest = "UP", downtest = "DOWN", gepid = "geneid", enrichment = FALSE, bipartite = TRUE, html = TRUE) # significant cn-cis-trans connections found using mediation analysis res$pruning$sig
The result html page should open automatically, and it will include an additional column "bipartite" with hyperlinks to the graphical display of the mediation analysis.
The HTML report of the final run is contained here
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.