View source: R/CreatePartition.R
Creates a partition (groups) of variables from nominal (factor) or numeric input
vec: Factor, numeric vector or character vector.
subset: Character vector. Names of variables (features) that correspond to the values in
varnamesdata: Character vector. Names of the variables (features). Only relevant when
grsize: Numeric. Size of the groups. Only relevant when
decreasing: Boolean. If
uniform: Boolean. If
ngroup: Numeric. Number of the groups to create. Only relevant when
mingr: Numeric. Minimum group size. Only relevant when
A convenience function to create partitions of variables from external information that is stored in vec. If vec is a factor, then the levels of the factor define the groups. If vec is a character vector, then varnamesdata needs to be specified (vec is supposed to be a subset of varnamesdata, e.g. a published gene list). In this case a partition of two groups is created: one with those variables of varnamesdata that also appear in vec and one with those that do not appear in vec.

If vec is a numeric vector, then groups contain the variables corresponding to grsize consecutive values of vec. Alternatively, the group size is determined automatically from ngroup. If uniform=FALSE, a group with rank r is of approximate size mingr*(r^f), where f>1 is determined such that the total number of groups equals ngroup. Such unequal group sizes enable the use of fewer groups (and hence faster computations) while still maintaining a good ‘resolution’ for the extreme values in vec.

About decreasing: if smaller values of components of vec mean ‘less relevant’ (e.g. test statistics, absolute regression coefficients) use decreasing=TRUE, else use decreasing=FALSE, e.g. for p-values.

If subset is defined, then varnamesdata should be specified as well. The partition will then only be applied to variables in subset and in varnamesdata.
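The numeric case with equally sized groups can be pictured with a few lines of base R. This is only a sketch of the behaviour described above, not the package's implementation: the helper name sketch_uniform_groups is hypothetical, and placing leftover variables in the last group is an assumption made for illustration.

```r
# Hypothetical helper (not part of the package): sort `vec`, then cut the
# sorted variable indices into groups of `grsize` consecutive values.
sketch_uniform_groups <- function(vec, grsize, decreasing = TRUE) {
  ord <- order(vec, decreasing = decreasing)   # variable indices, sorted by value
  ngr <- max(1, floor(length(ord) / grsize))   # number of groups
  # Group label per sorted position; leftovers join the last group (assumption)
  label <- pmin((seq_along(ord) - 1) %/% grsize + 1, ngr)
  unname(split(ord, label))
}

set.seed(1)                  # reproducible toy p-values
pvals <- rbeta(20, 1, 4)
# p-values: smaller means more relevant, hence decreasing = FALSE
groups <- sketch_uniform_groups(pvals, grsize = 5, decreasing = FALSE)
lengths(groups)              # size of each group
```

With decreasing = FALSE the first group holds the indices of the smallest values in vec, which matches the advice given for p-values.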
A list whose components contain the indices of the variables belonging to each of the groups.
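To make the shape of such a list concrete, the following base R sketch builds index lists for a factor and for a character vector without calling the package. The component names in_vec and not_in_vec, and the toy data, are assumptions for illustration; the names actually returned by CreatePartition may differ.

```r
# Factor input: one component per level, holding the positions of its variables
genetype <- factor(rep(c("Type 1", "Type 2"), times = c(2, 3)))
type_groups <- split(seq_along(genetype), genetype)
str(type_groups)

# Character input: variables in `varnames` that appear in `signature`
# versus those that do not (two groups)
varnames  <- paste("Gene", 1:6)
signature <- c("Gene 2", "Gene 5")
in_sig <- varnames %in% signature
sig_groups <- list(in_vec = which(in_sig), not_in_vec = which(!in_sig))
str(sig_groups)
```

In both cases every variable index appears in exactly one component, which is what makes the result a partition.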
Mark A. van de Wiel
For gene sets (overlapping groups): matchGeneSets.
Further example in real life dataset: grridge.
#SOME EXAMPLES ON SMALL NR OF VARIABLES
#EXAMPLE 1: partition based on known gene signature
genset <- sapply(1:100,function(x) paste("Gene",x))
signature <- sapply(seq(1,100,by=2),function(x) paste("Gene",x))
SignaturePartition <- CreatePartition(signature,varnamesdata=genset)
#EXAMPLE 2: partition based on factor variable
Genetype <- factor(sapply(rep(1:4,25),function(x) paste("Type",x)))
TypePartition <- CreatePartition(Genetype)
#EXAMPLE 3: partition based on continuous variable, e.g. p-value
pvals <- rbeta(100,1,4)
#Creating a partition of 10 equally-sized groups, corresponding to increasing p-values.
PvPartition <- CreatePartition(pvals, decreasing=FALSE,uniform=TRUE,ngroup=10)
#Alternatively, create a partition of 5 unequally-sized groups,
#with minimal size at least 10. Group size
#increases with less relevant p-values.
# Recommended when nr of variables is large.
PvPartition2 <- CreatePartition(pvals, decreasing=FALSE,uniform=FALSE,ngroup=5,mingr=10)
#EXAMPLE 4: partition based on subset of variables,
#e.g. p-values only available for 50 genes.
genset <- sapply(1:100,function(x) paste("Gene",x))
subsetgenes <- sort(sapply(sample(1:100,50),function(x) paste("Gene",x)))
pvals50 <- rbeta(50,1,6)
#Returns the partition for the subset based on the indices of
#the variables in entire genset. Variables not
#present in subset will receive group-penalty = 1 for this partition.
PvPartitionSubset <- CreatePartition(pvals50, varnamesdata = genset, subset = subsetgenes,
                                     decreasing=FALSE, uniform=TRUE, ngroup=5)
#EXAMPLE 5: COMBINING PARTITIONS
#Combines partitions into one list with named components.
#This can be used as input for the grridge() function.
#NOTE: if one aims to use one partition only, then this can be directly used in grridge().
MyPart <- list(signature=SignaturePartition, type = TypePartition,
               pval = PvPartition, pvalsubset=PvPartitionSubset)