knitr::opts_knit$set(root.dir = ".") knitr::opts_chunk$set(collapse = TRUE, warning = TRUE) BiocStyle::markdown() library("BiocStyle") library("GSEAdv") fl <- system.file("extdata", "Broad.xml", package = "GSEABase") gss <- getBroadSets(fl)
This vignette assumes that the reader is familiar with the infraestructure build in Bioconductor around gene sets, mainly r Biocpkg("GSEABase")
. It provides examples of how to use the package to create and modify different gene sets.
The first step towards simulate a GeneSetCollection
is to modify it. There are two ways do modifying them: adding new relationships, or removing genes and GeneSets
.
We might want to add a gene to a gene set or to add a new gene set. We can do that with:
independence(gss) gss2 <- add(gss, gene = c("ha", "ZNF474", "CCDC100"), pathway = "new") independence(gss2) isolation(gss) gss3 <- add(gss, gene = "ZNF474", pathway = c("chr16q24", "chr5q23")) isolation(gss3)
As shown the independence of genes or the isolation of the gene sets have changed.
isolation(Info) Info2 <- dropRel(Info, gene = "3", pathway = "156580") isolation(Info2)
We can see that now the Info2 object has now one pathway that is not related to other gene sets.
GeneSets
Or we might want to remove a pathway or some genes:
# Remove a pathway drop(gss, pathway = "chr5q23") drop(gss, pathway = 1) # 1 random pathway drop(gss, pathway = c("chr5q23", "chr16q24")) # Remove some genes drop(gss, gene = c("AFG3L1", "ANKRD11")) drop(gss, gene = 5) # 5 random genes # Remove both, genes and pathways drop(gss, pathway = "chr5q23", gene = c("AFG3L1", "ANKRD11")) drop(Info, pathway = "1430728", gene = c("10", "9"))
Dropping by number is aimed to make it possible to bootstrap with different GeneSetCollections
coming from the same original one.
add(Info, c("gene1", "gene2", "gene3"), "chr16q24") add(Info, c("gene1", "gene2", "10"), "chr16q24")
If you want to know how many genes or pathways are needed for such distribution of pathways per genes or genes per pathways you can use combnPPG
and combnGPP
:
combnPPG(Info) combnGPP(Info)
This indicates that with only 3 we could have such number ofpathways and that with 15 pathways we could have those genes.
It might be needed to simulate GeneSetCollections
to see the robustness of an algorithm or other GeneSetCollection
s.
The more information we have, the less we need to guess, but it becomes harder to recreate those conditions.
Some functions return more information than others, and depending on what are the input the information might be redundant.
For this reason, before simulating, first the parameters are estimated: what are the ranges of parameters, then the GeneSetCollection is build.
The information available from a GeneSetCollection has been explained in previous vignettes. It can be used to generate some other GeneSetCollections with the same
As a summary of the methods available and what kind of data they use is provided below:
method | nGenes | nPathways | genesPerPathway| pathwaysPerGene | sizeGenes | sizePathways
------- | ----- | ----- | ----- | ----- | ----- | -----
fromGPP | | (x) | x | | |
fromPPG | (x) | | | x | |
fromGPP_nGenes | x | (x) | x | | |
fromPPG_nPathways | (x) | x | x | | |
fromPPG_GPP | (x) | (x) | x | x | |
fromSizeGenes | (x) | (x) | (x) | (x) | x |
fromSizePathways | (x) | (x) | (x) | (x) | | x
: (#tab:table) The x indicates that it is a required argument of the functions. The parenthesis indicate that it is implicit.
When using more information to the simulations it takes more time (as it is randomly assigned). Also note that they must obey some simple rules: - a gene set cannot be of only one gene - a gene cannot be duplicated in a single gene set - there can't be duplicated gene sets (gene sets with the same genes)
fromGPP(genesPerPathway(Info)) fromPPG(pathwaysPerGene(Info))
fromPPG_nPathways(pathwaysPerGene(Info), nPathways(Info)) fromGPP_nGenes(genesPerPathway(Info), nGenes(Info))
r Biocpkg("GSEAdv", "vignette_describe.html", "*Description*")
To provide insight about the relationship between genes and gene sets.
r Biocpkg("GSEAdv", "vignette_evaluate.html", "*Evaluation*")
Compare and estimate the relationship between genes and gene sets.
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.