BiocStyle::markdown() knitr::opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE)
library(miRspongeR)
MicroRNAs (miRNAs) are small non-coding RNAs (~22 nt) that control a wide range of biological processes including cancers via regulating target genes [1-5]. Therefore, it is important to uncover miRNA functions and regulatory mechanisms in cancers.
The emergence of competing endogenous RNA (ceRNA) hypothesis [6] has challenged the traditional knowledge that coding RNAs only act as targets of miRNAs. Actually, a pool of coding and non-coding RNAs that shares common miRNA biding sites competes with each other, thus act as ceRNAs to release coding RNAs from miRNAs control. These ceRNAs are also known as miRNA sponges or miRNA decoys, and include long non-coding RNAs (lncRNAs), pseudogenes, circular RNAs (circRNAs) and messenger RNAs (mRNAs), etc [7-10]. Recent studies [11, 12] have shown that miRNA sponge network and module can help to reveal the biological mechanism in cancer.
To accelerate the research of miRNA sponge, we develop an R package 'miRspongeR' to implement popular methods in the identification and analysis of miRNA sponge network and module.
In 'spongeMethod' function, We implement seven popular methods (miRHomology [13, 14], pc [15, 16], sppc [17], ppc [11], hermes [18], muTaME [19], and cernia [20]) to identify miRNA sponge interactions. The seven methods should meet a basic condition: the significance of sharing of miRNAs by each RNA-RNA pair (e.g. adjusted p-value < 0.01). Each method has its own merit due to different evaluating indicators. Thus, we present an integrate method to combine predicted miRNA sponge interactions from different methods.
We implement miRHomology method based on the homology of sharing miRNAs.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") head(miRHomologyceRInt)
The pc function considers expression data based on miRHomology method. Significantly positive miRNA sponge interaction pairs are regarded as output.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") head(pcceRInt)
We implement sppc method based on sensitivity correlation (difference between the Pearson and partial correlation coefficients).
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) sppcceRInt <- spongeMethod(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc") head(sppcceRInt)
The hermes method predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on conditional mutual information.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) hermesceRInt <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "hermes") head(hermesceRInt)
Parameter 'num_perm' is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is.
The ppc method is a variant of the hermes method. However, it predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on Partial Pearson Correlation.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) ppcceRInt <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "ppc") head(ppcceRInt)
Parameter 'num_perm' is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is.
We implement the muTaME method based on the logarithm of four scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs and (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs. There is no reason to decide which score has more contribution than the rest. Thus, we calculate a combined score by adding these four scores. To evaluate the strength of each RNA-RNA pair, we further normalize the combined scores to obtain normalized scores with interval [0 1].
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") MREs <- system.file("extdata", "MREs.csv", package="miRspongeR") mres <- read.csv(MREs, header=TRUE, sep=",") muTaMEceRInt <- spongeMethod(miRTarget, mres = mres, method = "muTaME") head(muTaMEceRInt)
We implement the cernia method based on the logarithm of seven scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs, (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs, (5) the density of the hybridization energies related to MREs for all shared miRNAs, (6) the DT-Hybrid recommendation scores and (7) the pairwise Pearson correlation between putative RNA-RNA pair expression data. There is no reason to decide which score has more contribution than the rest. Thus, we calculate a combined score by adding these seven scores. To evaluate the strength of each RNA-RNA pair, we further normalize the combined scores to obtain normalized scores with interval [0 1].
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") MREs <- system.file("extdata", "MREs.csv", package="miRspongeR") mres <- read.csv(MREs, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) cerniaceRInt <- spongeMethod(miRTarget, ExpData, mres, method = "cernia") head(cerniaceRInt)
To obtain a list of high-confidence miRNA sponge interactions, we implement 'integrateMethod' function to integrate results of different methods.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") sppcceRInt <- spongeMethod(miRTarget, ExpData, method = "sppc") Interlist <- list(miRHomologyceRInt[, 1:2], pcceRInt[, 1:2], sppcceRInt[, 1:2]) IntegrateceRInt <- integrateMethod(Interlist, Intersect_num = 2) head(IntegrateceRInt)
Parameter 'Intersect_num' is used to set the least number of methods intersected for integration. That is to say, we only reserve those miRNA sponge interactions predicted by at least 'Intersect_num' methods.
To validate the predicted miRNA sponge interactions, we implement 'spongeValidate' function. The groundtruth of miRNA sponge interactions are from miRSponge (http://www.bio-bigdata.net/miRSponge/) and the experimentally validated miRNA sponge interactions of related literatures.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") Groundtruthcsv <- system.file("extdata", "Groundtruth.csv", package="miRspongeR") Groundtruth <- read.csv(Groundtruthcsv, header=TRUE, sep=",") spongenetwork_validated <- spongeValidate(miRHomologyceRInt[, 1:2], directed = FALSE, Groundtruth) spongenetwork_validated
To further understand the module-level properties of miRNA sponges in cancer, we implement 'netModule' function to identify miRNA sponge modules. Users can choose FN [21], MCL [22], LINKCOMM [23] and MCODE [24] for module identification.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2) spongenetwork_Cluster
We implement 'moduleDEA' function to make disease enrichment analysis of modules. The disease databases used include DO: Disease Ontology database (http://disease-ontology.org/), DGN: DisGeNET database (http://www.disgenet.org/) and NCG: Network of Cancer Genes database (http://ncg.kcl.ac.uk/). Moreover, 'moduleFEA' function is implemented to conduct functional GO, KEGG and Reactome enrichment analysis of modules. The ontology databases used contain GO: Gene Ontology database (http://www.geneontology.org/), KEGG: Kyoto Encyclopedia of Genes and Genomes Pathway Database (http://www.genome.jp/kegg/), and Reactome: Reactome Pathway Database (http://reactome.org/).
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2) sponge_Module_DEA <- moduleDEA(spongenetwork_Cluster) sponge_Module_FEA <- moduleFEA(spongenetwork_Cluster)
To make survival analysis of miRNA sponge modules, we implement 'moduleSurvival' function.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",", stringsAsFactors = TRUE) SurvDatacsv <- system.file("extdata", "SurvData.csv", package="miRspongeR") SurvData <- read.csv(SurvDatacsv, header=TRUE, sep=",") pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") spongenetwork_Cluster <- netModule(pcceRInt[, 1:2], modulesize = 2) sponge_Module_Survival <- moduleSurvival(spongenetwork_Cluster, ExpData, SurvData, devidePercentage=.5) sponge_Module_Survival
Parameter 'devidePercentage' is used to set the percentage of high risk group.
miRspongeR provides several functions to study miRNA sponge (also called ceRNA or miRNA decoy), including popular methods for identifying miRNA sponge interactions, and the integrative method to integrate miRNA sponge interactions from different methods, as well as the functions to validate miRNA sponge interactions, and infer miRNA sponge modules, conduct enrichment analysis of modules, and conduct survival analysis of modules. It could provide a useful tool for the research of miRNA sponges.
[1] Ambros V. microRNAs: tiny regulators with great potential. Cell, 2001, 107:823–6.
[2] Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 2004, 116:281–97.
[3] Du T, Zamore PD. Beginning to understand microRNA function. Cell Research, 2007, 17:661–3.
[4] Esquela-Kerscher A, Slack FJ. Oncomirs—microRNAs with a role in cancer. Nature Reviews Cancer, 2006, 6:259–69.
[5] Lin S, Gregory RI. MicroRNA biogenesis pathways in cancer. Nature Reviews Cancer, 2015, 15:321–33.
[6] Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell, 2011, 146(3):353-8.
[7] Cesana M, Cacchiarelli D, Legnini I, et al. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell, 2011, 147:358–69.
[8] Poliseno L, Salmena L, Zhang J, et al. A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature, 2010, 465:1033–8.
[9] Hansen TB, Jensen TI, Clausen BH, et al. Natural RNA circles function as efficient microRNA sponges. Nature, 2013, 495:384–8.
[10] Memczak S, Jens M, Elefsinioti A, et al. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature, 2013, 495:333–8.
[11] Le TD, Zhang J, Liu L, et al. Computational methods for identifying miRNA sponge interactions. Brief Bioinform., 2017, 18(4):577-590.
[12] Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of ceRNA crosstalk and competition. Nature, 2014, 505:344–52.
[13] Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res., 2014, 42(Database issue):D92-7.
[14] Sarver AL, Subramanian S. Competing endogenous RNA database. Bioinformation, 2012, 8(15):731-3.
[15] Zhou X, Liu J, Wang W, Construction and investigation of breast-cancer-specific ceRNA network based on the mRNA and miRNA expression data. IET Syst Biol., 2014, 8(3):96-103.
[16] Xu J, Li Y, Lu J, et al. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic Acids Res., 2015, 43(17):8169-82.
[17] Paci P, Colombo T, Farina L, Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst Biol., 2014, 8:83.
[18] Sumazin P, Yang X, Chiu HS, et al. An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell, 2011, 147(2):370-81.
[19] Tay Y, Kats L, Salmena L, et al. Coding-independent regulation of the tumor suppressor PTEN by competing endogenous mRNAs. Cell, 2011, 147(2):344-57.
[20] Sardina DS, Alaimo S, Ferro A, et al. A novel computational method for inferring competing endogenous interactions. Brief Bioinform., 2017, 18(6):1071-1081.
[21] Clauset A, Newman ME, Moore C. Finding community structure in very large networks. Phys Rev E Stat Nonlin Soft Matter Phys., 2004, 70(6 Pt 2):066111.
[22] Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res., 2002, 30(7):1575-84.
[23] Kalinka AT, Tomancak P. linkcomm: an R package for the generation, visualization, and analysis of link communities in networks of arbitrary size and type. Bioinformatics, 2011, 27(14):2011-2.
[24] Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 2003, 4:2.
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.