View source: R/matchGeneSets.R
Creates a grouping of variables (genes) from gene sets by matching the IDs of the genes with the IDs of the members of the gene sets.
matchGeneSets(GeneIds, GeneSets, minlen = 25, remain = TRUE)
GeneIds | Character vector. Vector of gene IDs. Can be any ID (gene symbol, entrezID, etc.) as long as it matches the IDs used in GeneSets.
GeneSets | Named list of character vectors. Each component of the list represents a named gene set; each vector lists the member genes of that set.
minlen | Integer. Minimum number of members of a gene set.
remain | Boolean. If
About minlen: to avoid overfitting in the grridge function, we recommend not using groups with fewer than 25 members, unless monotone=TRUE is used in the grridge function, in which case 10 members may suffice as a lower bound.
About remain: it is often beneficial to down-weight genes that are not part of any gene set, so we recommend using remain=TRUE.
A list whose components contain the indices of the variables belonging to each of the groups.
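To make the shape of the returned object concrete, the following base-R sketch mimics the matching idea on toy data. It is not the GRridge implementation: the helper name toy_match_gene_sets, the toy gene IDs and sets, and the treatment of the remainder group (assumed here to collect every gene that is not in any retained group) are all assumptions for illustration.

```r
# Illustrative sketch only; NOT the GRridge implementation of matchGeneSets.
# toy_match_gene_sets is a hypothetical helper name.
toy_match_gene_sets <- function(GeneIds, GeneSets, minlen = 25, remain = TRUE) {
  # For each gene set, find the positions in GeneIds whose ID is a set member
  groups <- lapply(GeneSets, function(members) which(GeneIds %in% members))
  # Keep only the groups with at least minlen matched genes
  groups <- groups[lengths(groups) >= minlen]
  if (remain) {
    # Assumption: the remainder group holds every gene not in a retained group
    rest <- setdiff(seq_along(GeneIds), unlist(groups))
    if (length(rest) > 0) groups$remainder <- rest
  }
  groups
}

gene_ids <- c("TP53", "BRCA1", "EGFR", "MYC", "KRAS", "PTEN")
gene_sets <- list(
  setA = c("TP53", "MYC", "KRAS", "XYZ1"),  # XYZ1 does not occur in gene_ids
  setB = c("BRCA1", "TP53"),                # overlaps with setA on TP53
  setC = c("EGFR")                          # too small for minlen = 2
)

# minlen is lowered to 2 only because the toy data are tiny
toy_groups <- toy_match_gene_sets(gene_ids, gene_sets, minlen = 2, remain = TRUE)
str(toy_groups)
```

Each component is an integer vector of positions in the gene ID vector, and a gene may appear in more than one component when gene sets overlap.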
Mark A. van de Wiel
# Load data objects
data(dataWurdinger)
# Transform the data set to the square root scale
dataSqrtWurdinger <- sqrt(datWurdinger_BC)
# Standardize the transformed data (per gene, i.e. per row)
datStdWurdinger <- t(apply(dataSqrtWurdinger,1,function(x){(x-mean(x))/sd(x)}))
# A list of gene names in the primary RNAseq data
genesWurdinger <- as.character(annotationWurdinger$geneSymbol)
# We show an example of a GRridge classification model using overlapping groups,
# i.e. pathway-based grouping. Transcription factor based pathways were extracted from
# the MSigDB (Section C3: motif gene sets; subsection: transcription factor targets;
# file name: "c3.tft.v5.0.symbols.gmt"). The gene sets are based on the
# TRANSFAC version 7.5 database (http://www.gene-regulation.com/).
# Some features may belong to more than one group. The argument minlen=25 sets
# the minimum number of features in a gene set. If remain=TRUE, gene sets with fewer
# than 25 members are grouped to the "remainder" group. "genesWurdinger" is an object
# containing gene names from the mRNA sequencing data set.
# See help(matchGeneSets) for more detailed information.
# Also see the vignette for more detailed examples.
## The "TFsym" object is available at https://github.com/markvdwiel/GRridgeCodata
# gseTF <- matchGeneSets(genesWurdinger,TFsym,minlen=25,remain=TRUE)
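The preprocessing steps in the example depend on the dataWurdinger objects shipped with the package. As a self-contained illustration, the sketch below applies the same square-root transform and per-gene standardization to a small made-up matrix (the toy_* names and values are assumptions, not part of the package) and checks that each row ends up with mean 0 and standard deviation 1.

```r
# Toy stand-in for an expression matrix: genes in rows, samples in columns.
# The values are made up for illustration.
toy_counts <- matrix(c(4, 9, 16, 25,
                       1, 4,  9, 16,
                       0, 1,  4,  9),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Square-root transform, as in the example
toy_sqrt <- sqrt(toy_counts)

# Row-wise standardization, written the same way as in the example
toy_std <- t(apply(toy_sqrt, 1, function(x) (x - mean(x)) / sd(x)))

# An equivalent formulation: scale() standardizes columns, so transpose twice
toy_std2 <- t(scale(t(toy_sqrt)))

# Every gene (row) should now have mean 0 and standard deviation 1
rowMeans(toy_std)
apply(toy_std, 1, sd)
```

Note that a gene with identical values in all samples has a standard deviation of 0, so this standardization would return NaN for that row.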