omics2pathlist | R Documentation |
Map a set of individual probes from different omics (e.g. SNPs, gene expression probes, CpGs) to pathways such as Gene Ontology (GO) categories and KEGG.
omics2pathlist( data, pathlistDB, featureAnno = NULL, restrictUp = 200, restrictDown = 10, minPathSize = 5 )
data |
The input dataset (either a data.frame or a matrix). Rows are the samples and columns are the probes/genes, except that the first column is the label. For transcriptomic data, the gene ID is the 'entrezID'. |
pathlistDB |
A list of pathways with pathway IDs and their corresponding genes ('entrezID' is used). |
featureAnno |
The annotation data, stored in a data.frame, used for probe mapping. It must have at least two columns named 'ID' and 'entrezID'. If it is NULL, the input probes are taken to come from transcriptomic data. |
restrictUp |
The upper bound on the number of genes in each pathway. The default is 200. |
restrictDown |
The lower bound on the number of genes in each pathway. The default is 10. |
minPathSize |
The minimal required number of probes in each pathway after mapping the input data to pathlistDB. The default is 5. |
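The argument descriptions above can be illustrated with toy inputs. The following is a minimal sketch in base R, not taken from the package: the probe names, pathway IDs and gene IDs are made up, and storing 'entrezID' values as character strings is an assumption.

```r
# Toy inputs in the shapes described above (all names and values are hypothetical).
set.seed(1)

# data: samples in rows; the first column is the label, the rest are probes
data <- data.frame(
  label = c(0, 1, 0, 1),
  cg001 = runif(4),
  cg002 = runif(4),
  cg003 = runif(4)
)

# pathlistDB: named list mapping each pathway ID to its genes (entrezID)
pathlistDB <- list(
  pathA = c("101", "102"),
  pathB = c("102", "103")
)

# featureAnno: needs at least the columns 'ID' (probe) and 'entrezID' (gene)
featureAnno <- data.frame(
  ID = c("cg001", "cg002", "cg003"),
  entrezID = c("101", "102", "103"),
  stringsAsFactors = FALSE
)

# Basic shape checks before any mapping is attempted
stopifnot(colnames(data)[1] == "label")
stopifnot(all(c("ID", "entrezID") %in% colnames(featureAnno)))
stopifnot(all(colnames(data)[-1] %in% featureAnno$ID))
```

Checks of this kind catch a missing 'label' column or an annotation table with the wrong column names before the mapping step is run.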
If gene expression data is the input, then featureAnno is NULL, since the gene IDs are already defined as the column names of data. Since online databases are updated from time to time, it is advised to make sure that the study database (e.g. pathlistDB) is frozen at a particular time so that the results can be reproduced.
The number of genes in each pathway can be restricted for downstream analysis, because pathways that are too small are sparsely distributed, while pathways that are too large are often computationally intensive and likely nonspecific.
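The size restriction described above can be sketched in base R as follows. This is an illustration only, not the package's code: the pathway list is hypothetical, and treating both bounds as inclusive is an assumption.

```r
# Hypothetical pathway list: pathway ID -> genes (entrezID as character)
pathlistDB <- list(
  small  = as.character(1:3),
  medium = as.character(1:50),
  large  = as.character(1:500)
)
restrictUp <- 200    # upper bound on pathway size
restrictDown <- 10   # lower bound on pathway size

# Number of genes per pathway
pathSize <- lengths(pathlistDB)

# Keep pathways whose size lies within [restrictDown, restrictUp]
# (inclusive bounds are assumed here)
keep <- pathSize >= restrictDown & pathSize <= restrictUp
pathlistSub <- pathlistDB[keep]

names(pathlistSub)
```

With these toy sizes (3, 50 and 500 genes), only the pathway named "medium" passes the filter.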
A list of matrices with pathway IDs as the associated list member names. For each matrix, rows are the samples and columns are the probe names, except that the first column is named 'label'.
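To make the shape of the returned value concrete, the sketch below builds a comparable list of matrices in base R. The helper mapToPathways is hypothetical and is not the implementation of omics2pathlist; it only assumes that probes are linked to pathways through the 'ID' and 'entrezID' columns of featureAnno and that pathways with fewer than minPathSize mapped probes are dropped.

```r
# Hypothetical helper, for illustration only (not BioMM's implementation).
mapToPathways <- function(data, pathlistDB, featureAnno, minPathSize = 1) {
  out <- lapply(pathlistDB, function(genes) {
    # Probes annotated to any gene of this pathway
    probes <- featureAnno$ID[featureAnno$entrezID %in% genes]
    # Keep only probes that are present in the data (first column is the label)
    probes <- intersect(probes, colnames(data)[-1])
    if (length(probes) < minPathSize) return(NULL)
    as.matrix(data[, c("label", probes), drop = FALSE])
  })
  # Drop pathways that did not reach minPathSize
  out[!vapply(out, is.null, logical(1))]
}

# Toy inputs (all names and values are made up)
data <- data.frame(
  label = c(0, 1, 0, 1),
  cg001 = c(0.1, 0.2, 0.3, 0.4),
  cg002 = c(0.5, 0.6, 0.7, 0.8),
  cg003 = c(0.9, 0.8, 0.7, 0.6)
)
pathlistDB <- list(
  pathA = c("101", "102"),
  pathB = c("103"),
  pathC = c("999")
)
featureAnno <- data.frame(
  ID = c("cg001", "cg002", "cg003"),
  entrezID = c("101", "102", "103"),
  stringsAsFactors = FALSE
)

res <- mapToPathways(data, pathlistDB, featureAnno, minPathSize = 2)
names(res)            # pathway IDs that survived the filter
colnames(res$pathA)   # 'label' followed by the mapped probe names
```

In this toy run only pathA has at least two mapped probes, so the result is a list of length one whose matrix has four sample rows and the columns 'label', 'cg001' and 'cg002'.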
## Load the package
library(BioMM)
## Load data from DNA methylation
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')
methylData <- readRDS(methylfile)
## Annotation files for mapping CpGs into pathways
pathlistDBfile <- system.file('extdata', 'goDB.rds', package='BioMM')
featureAnnoFile <- system.file('extdata', 'cpgAnno.rds', package='BioMM')
pathlistDB <- readRDS(file=pathlistDBfile)
featureAnno <- readRDS(file=featureAnnoFile)
## To reduce runtime, keep only the first 20 pathways
pathlistDB <- pathlistDB[1:20]
## Mapping CpGs into pathway list
dataList <- omics2pathlist(data=methylData, pathlistDB, featureAnno,
                           restrictUp=100, restrictDown=20, minPathSize=10)
## Number of pathways retained
length(dataList)