tuneCluster.block.spls: Feature Selection Optimization for block (s)PLS method

tuneCluster.block.spls    R Documentation

Description

This function identifies the number of features to keep per component, and thus per cluster, in mixOmics::block.spls by optimizing the silhouette coefficient, which assesses the quality of the clustering.

Usage

tuneCluster.block.spls(
  X,
  Y = NULL,
  indY = NULL,
  ncomp = 2,
  test.list.keepX = NULL,
  test.keepY = NULL,
  ...
)

Arguments

X

list of numeric matrices (or data.frames) with features in columns and samples in rows (the sample order must match across all data sets).

Y

(optional) numeric matrix (or data.frame) with features in columns and samples in rows (same rows as X).

indY

integer, to be supplied if Y is missing; indicates the position of the response matrix in the list X.

ncomp

integer, number of components to include in the model.

test.list.keepX

list of integer vectors with the same length as X. Each entry contains the keepX values to test for the corresponding block of X.

test.keepY

only if Y is provided. Vector of integers containing the keepY values to test for block Y.

...

other parameters to be included in the spls model (see mixOmics::block.spls)
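
As an illustration of how test.list.keepX and test.keepY define the candidate values, the sketch below enumerates every keepX/keepY combination with base R. The block names and values are taken from the Examples section; building the grid with expand.grid is an assumption made for illustration, not necessarily how the function enumerates candidates internally.

```r
# Candidate values per block (same values as in the Examples section)
test.list.keepX <- list(X = c(5, 10, 15, 20), Z = c(2, 4, 6, 8))
test.keepY <- 2:5

# One row per keepX/keepY combination (illustrative only)
grid <- expand.grid(c(test.list.keepX, list(Y = test.keepY)))

nrow(grid)   # number of candidate combinations
head(grid)
```

The grid grows multiplicatively with the number of values per block, so short candidate vectors keep the tuning time manageable.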

Details

For each component and for each keepX/keepY value, a block sPLS model is fitted with these parameters. The clustering is then performed and its silhouette coefficient is calculated.
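
To make the silhouette coefficient concrete, here is a minimal base R sketch of the average silhouette width. The helper silhouette_mean is hypothetical and not part of the package, and both the Euclidean distance and the toy data are assumptions; the distance actually used by the function is not stated on this page.

```r
# Hypothetical helper: average silhouette width from a distance object
# and a vector of cluster labels (base R only).
silhouette_mean <- function(d, cluster) {
  d <- as.matrix(d)
  cluster <- as.character(cluster)
  stopifnot(length(unique(cluster)) >= 2, nrow(d) == length(cluster))
  widths <- vapply(seq_len(nrow(d)), function(i) {
    own <- cluster == cluster[i]
    own[i] <- FALSE
    if (!any(own)) return(0)            # singleton cluster: width 0 by convention
    a <- mean(d[i, own])                # mean distance to own cluster
    others <- setdiff(unique(cluster), cluster[i])
    b <- min(vapply(others, function(k) mean(d[i, cluster == k]), numeric(1)))
    (b - a) / max(a, b)                 # silhouette width of item i
  }, numeric(1))
  mean(widths)
}

# Toy data: two well-separated groups of two points each
pts <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
d <- dist(pts)                          # Euclidean distance (assumption)

sil_good <- silhouette_mean(d, c(1, 1, 2, 2))  # labels match the groups
sil_bad  <- silhouette_mean(d, c(1, 2, 1, 2))  # labels split the groups
```

A value close to 1 indicates compact, well-separated clusters, while a negative value indicates items that sit closer to another cluster than to their own.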

We then calculate "slopes" where keepX/keepY are the coordinates and the silhouette is the intensity. A z-score is assigned to each slope. We then identify the most significant slope which indicates a drop in the silhouette coefficient and thus a deterioration of the clustering.
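
The slope and z-score idea can be sketched in one dimension as follows. This is a simplified illustration with made-up silhouette values for a single block; the exact computation performed by the function (in particular across several blocks at once) may differ, and picking the keepX just before the steepest drop is an assumption.

```r
# Made-up silhouette coefficients for increasing keepX values (illustration only)
keepX <- c(5, 10, 15, 20)
sil   <- c(0.80, 0.78, 0.50, 0.47)

# Slope between consecutive keepX values
slopes <- diff(sil) / diff(keepX)

# Standardise the slopes; the most negative z-score marks the sharpest drop
z <- (slopes - mean(slopes)) / sd(slopes)
drop_idx <- which.min(z)

# Assumed rule: keep the last keepX value before the drop
choice <- keepX[drop_idx]
```

Here the silhouette falls sharply between keepX = 10 and keepX = 15, so the sketch retains keepX = 10.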

Value

silhouette

silhouette coef. computed for every combination of keepX/keepY

ncomp

number of components included in the model

test.keepX

list of tested keepX

test.keepY

list of tested keepY

block

names of blocks

slopes

"slopes" computed from the silhouette coef. for each keepX and keepY, used to determine the best keepX and keepY

choice.keepX

best keepX for each component

choice.keepY

best keepY for each component

See Also

block.spls, getCluster, plotLong

Examples

demo <- suppressWarnings(get_demo_cluster())
X <- list(X = demo$X, Z = demo$Z)
Y <- demo$Y
test.list.keepX <- list("X" = c(5,10,15,20), "Z" = c(2,4,6,8))
test.keepY <- c(2:5)

# tuning
tune.block.spls <- tuneCluster.block.spls(X = X, Y = Y,
                                          test.list.keepX = test.list.keepX,
                                          test.keepY = test.keepY,
                                          mode = "canonical")
keepX <- tune.block.spls$choice.keepX
keepY <- tune.block.spls$choice.keepY

# final model
block.spls.res <- mixOmics::block.spls(X = X, Y = Y, keepX = keepX,
                                       keepY = keepY, ncomp = 2, mode = "canonical")
# get the cluster assignments from the final model
block.spls.cluster <- getCluster(block.spls.res)

abodein/timeOmics_BioC documentation built on April 10, 2024, 10:01 a.m.