Compute the S_Dbw Validity Index, an internal cluster validation measure, from TitanCNA results for use in model selection.
computeSDbwIndex(x, centroid.method = "median",
                 data.type = "LogRatio", S_Dbw.method = "Halkidi",
                 symmetric = TRUE)
x |
Formatted TitanCNA results output from |
centroid.method |
|
data.type |
Compute S_Dbw validity index based on copy number (use ‘ |
symmetric |
|
S_Dbw.method |
Compute S_Dbw validity index using |
The S_Dbw Validity Index is an internal clustering evaluation measure used for model selection (Halkidi et al. 2002). It favours the model that minimizes within-cluster variance (scat) and maximizes density-based cluster separation (measured by the inter-cluster density term Dens, where lower values indicate better separation). The index is the sum of the two terms: S_Dbw(|c_T| x z) = Dens(|c_T| x z) + scat(|c_T| x z).
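To make the two terms concrete, the following is a minimal R sketch of the index for one-dimensional data, written from the general definition in Halkidi et al. (2002). The helper name `sdbw_1d`, the use of cluster means as centroids, and the toy data are assumptions made for illustration; this is not the code used inside computeSDbwIndex.

```r
# Hypothetical helper (not part of TitanCNA): S_Dbw for 1-D data `x`
# with cluster labels `cl`, following the general Halkidi et al. definition.
sdbw_1d <- function(x, cl) {
  labs <- sort(unique(cl))
  k <- length(labs)
  stopifnot(k >= 2)
  # population variance (divide by n) used for clusters and the full data set
  pvar <- function(v) mean((v - mean(v))^2)
  cvar <- sapply(labs, function(l) pvar(x[cl == l]))
  centers <- sapply(labs, function(l) mean(x[cl == l]))
  # scat: average within-cluster variance relative to the overall variance
  scat <- mean(cvar) / pvar(x)
  # neighbourhood radius: average cluster standard deviation
  stdev <- sqrt(sum(cvar)) / k
  # number of points from the selected clusters within `stdev` of point u
  dens <- function(u, members) sum(abs(x[members] - u) <= stdev)
  total <- 0
  for (i in seq_len(k)) for (j in seq_len(k)) {
    if (i == j) next
    pair <- cl %in% labs[c(i, j)]
    mid <- (centers[i] + centers[j]) / 2
    denom <- max(dens(centers[i], pair), dens(centers[j], pair))
    if (denom > 0) total <- total + dens(mid, pair) / denom
  }
  dens.bw <- total / (k * (k - 1))
  list(dens.bw = dens.bw, scat = scat, S_DbwIndex = dens.bw + scat)
}

# Toy data: three well-separated groups
x <- c(-1.0, -0.9, -1.1, 0, 0.1, -0.1, 1.0, 1.1, 0.9)
good <- sdbw_1d(x, rep(1:3, each = 3))   # labels match the groups
bad  <- sdbw_1d(x, rep(1:3, times = 3))  # labels cut across the groups
```

In this toy case the labelling that matches the groups yields the smaller index, which is the behaviour the index relies on for model selection.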
In the context of TitanCNA, if data.type = ‘LogRatio’, the S_Dbw internal data consist of copy number log ratios, and the joint states of copy number (c_T, for all c_T in {0 : max.copy.number}) and clonal cluster (z) make up the clusters in the internal evaluation. If data.type = ‘AllelicRatio’, the S_Dbw internal data consist of the allelic ratios. If data.type = ‘Both’, the S_Dbw values for ‘LogRatio’ and ‘AllelicRatio’ are added together, which helps account for both data types when choosing the optimal solution. The optimal TitanCNA run is chosen as the run with the minimum S_Dbw.
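Selecting the run with the minimum S_Dbw can be sketched in a few lines of base R. The index values below are invented purely for illustration, and the column names are assumptions rather than anything produced by TitanCNA.

```r
# Hypothetical S_Dbw values for four TitanCNA runs (1 to 4 clonal clusters).
# These numbers are made up for illustration only.
sdbw <- data.frame(
  numClusters  = 1:4,
  LogRatio     = c(0.62, 0.41, 0.47, 0.58),
  AllelicRatio = c(0.71, 0.39, 0.36, 0.66)
)

# Combining both data types amounts to adding the two indices per run
sdbw$Both <- sdbw$LogRatio + sdbw$AllelicRatio

# The preferred run is the one with the smallest combined index
best <- sdbw$numClusters[which.min(sdbw$Both)]
```

Note that the run preferred by the combined index need not be the one preferred by either data type alone; here the allelic ratios on their own would favour a different run.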
Note that for S_Dbw.method, the Tong method has an incorrect formulation of the scat(c) function. The function should be a weighted sum, but that is not the formulation shown in the publication. computeSDbwIndex uses (ni/N) instead of (N-ni)/N in the scat formula, where ni is the number of data points in cluster i and N is the total number of data points.
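The difference between the two weightings can be checked numerically. The sketch below uses invented cluster sizes and variances; the variable names are assumptions, and the weighted average at the end only illustrates what a weighted sum looks like, not the exact expression used by computeSDbwIndex.

```r
# Hypothetical cluster sizes and within-cluster variances (illustration only)
n_i <- c(50, 30, 20)
cluster_var <- c(0.02, 0.05, 0.10)
N <- sum(n_i)

w_used      <- n_i / N        # weights described above for computeSDbwIndex
w_published <- (N - n_i) / N  # weights as printed in the publication

# A proper set of weights sums to one; the published form sums to
# (number of clusters - 1) instead.
sum_used      <- sum(w_used)
sum_published <- sum(w_published)

# Size-weighted average of the cluster variances: larger clusters count more
weighted_var <- sum(w_used * cluster_var)
```

With ni/N, larger clusters contribute more to the scatter term, whereas (N-ni)/N would give the largest weight to the smallest cluster.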
A list with components:
dens.bw |
density component of S_Dbw index |
scat |
scatter component of S_Dbw index |
S_DbwIndex |
Sum of dens.bw and scat. |
Gavin Ha <gavinha@gmail.com>
Halkidi, M., Batistakis, Y., and Vazirgiannis, M. (2002). Clustering validity checking methods: part ii. SIGMOD Rec., 31(3):19–27.
Tong, J. and Tan, H. (2009). Clustering validity based on the improved S_Dbw* index. Journal of Electronics (China), 26(2):258–264.
Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)
outputModelParameters, loadAlleleCounts
library(TitanCNA)
data(EMresults)

#### COMPUTE OPTIMAL STATE PATH USING VITERBI ####
#options(cores=1)
optimalPath <- viterbiClonalCN(data, convergeParams)

#### FORMAT RESULTS ####
results <- outputTitanResults(data, convergeParams, optimalPath,
                              filename = NULL, posteriorProbs = FALSE,
                              correctResults = TRUE,
                              proportionThreshold = 0.05,
                              proportionThresholdClonal = 0.05)
results <- results$corrResults ## use corrected results

#### COMPUTE S_Dbw Validity Index FOR MODEL SELECTION ####
s_dbw <- computeSDbwIndex(results, data.type = "LogRatio",
                          centroid.method = "median", S_Dbw.method = "Tong")