Description Usage Arguments Details Value Author(s) References See Also Examples
Function to run the Expectation Maximization Algorithm for inference of model parameters: cellular prevalence, normal proportion, tumour ploidy. This is a key function in the TitanCNA package and is the most computationally intense. This function makes calls to a C subroutine that allows the algorithm to be run more efficiently.
1 2 3 4 5 6 |
data |
|
params |
|
txnExpLen |
Influences prior probability of genotype transitions in the HMM. Smaller value have lower tendency to change state; however, too small and it produces underflow problems. |
txnZstrength |
Influences prior probability of clonal cluster transitions in the HMM. Smaller value have lower tendency to change clonal cluster state. |
pseudoCounts |
Small, machine precision values to add to probabilities to avoid underflow. For example, |
maxiter |
Maximum number of expectation-maximization iterations allowed. In practice, for TitanCNA, it will usually not exceed 20. |
maxiterUpdate |
Maximum number of coordinate descent iterations during the M-step (of EM algorithm) when parameters are estimated. |
normalEstimateMethod |
Specifies how to handle normal proportion estimation. Using |
estimateS |
Logical indicating whether to account for clonality and estimate subclonal events. See Details. |
estimatePloidy |
Logical indicating whether to estimate and account for tumour ploidy. |
useOutlierState |
Logical indicating whether an additional outlier state should be used. In practice, this is usually not necessary. |
likChangeThreshold |
EM convergence criteria - stop EM when complete log likelihood changes less than the proportion specified by this argument. |
verbose |
Set to FALSE to suppress program messages. |
This function is implemented with the "foreach"
package and therefore supports parallelization. See "doMC"
or "doMPI"
for some parallelization packages.
The forwards-backwards algorithm is used for the E-step in the EM algorithm. This is done using a call to a C subroutine for each chromosome. The maximization step uses maximum a posteriori (MAP) for estimation of parameters.
If the sample has absolutely no normal contamination, then assign nParams$n_0 <- 0
and use argument normalEstimateMethod="fixed"
.
estimateS
should always be set to TRUE
. If no subclonality is expected, then use loadDefaultParameters(numberClonalClusters=1)
. Using estimateS=FALSE
and loadDefaultParameters(numberClonalClusters=0)
is gives more or less the same results.
list
with components for results returned from the EM algorithm, including converged parameters, posterior marginal responsibilities, log likelihood, and original parameter settings.
n |
Converged estimate for normal contamination parameter. |
s |
Converged estimate(s) for cellular prevalence parameter(s). This value is defined as the proportion of tumour sample that does not contain the aberrant genotype. This will contrast what is output in |
var |
Converged estimates for variance parameter of the Gaussian mixtures used to model the log ratio data. |
phi |
Converged estimate for tumour ploidy parameter. |
piG |
Converged estimate for initial genotype state distribution. |
piZ |
Converged estimate for initial clonal cluster state distribution. |
muR |
Mean of binomial mixtures computed as a function of |
muC |
Mean of Gaussian mixtures computed as a function of |
loglik |
Posterior Log-likelihood that includes data likelihood and the priors. |
rhoG |
Posterior marginal probabilities for the genotype states computed during the E-step. Only the final iteration is returned as a |
rhoZ |
Posterior marginal probabilities for the clonal cluster states computed during the E-step. Only the final iteration is returned as a |
genotypeParams |
Original genotype parameters. See |
ploidyParams |
Original tumour ploidy parameters. See |
normalParams |
Original normal contamination parameters. See |
clonalParams |
Original subclonal parameters. See |
txnExpLen |
Original genotype transition expected length. See |
txnZstrength |
Original clonal cluster transition expected length. See |
useOutlierState |
Original setting indicating usage of outlier state. See |
Gavin Ha <gavinha@gmail.com>
Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)
"foreach"
, "doMC"
, "doMPI"
, loadAlleleCounts
, loadDefaultParameters
, viterbiClonalCN
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | message('Running TITAN ...')
#### LOAD DATA ####
infile <- system.file("extdata", "test_alleleCounts_chr2.txt",
package = "TitanCNA")
data <- loadAlleleCounts(infile)
#### LOAD PARAMETERS ####
message('titan: Loading default parameters')
numClusters <- 2
params <- loadDefaultParameters(copyNumber = 5,
numberClonalClusters = numClusters, skew = 0.1)
#### READ COPY NUMBER FROM HMMCOPY FILE ####
message('titan: Correcting GC content and mappability biases...')
tumWig <- system.file("extdata", "test_tum_chr2.wig", package = "TitanCNA")
normWig <- system.file("extdata", "test_norm_chr2.wig", package = "TitanCNA")
gc <- system.file("extdata", "gc_chr2.wig", package = "TitanCNA")
map <- system.file("extdata", "map_chr2.wig", package = "TitanCNA")
cnData <- correctReadDepth(tumWig, normWig, gc, map)
logR <- getPositionOverlap(data$chr, data$posn, cnData)
data$logR <- log(2^logR) #transform to natural log
#### FILTER DATA FOR DEPTH, MAPPABILITY, NA, etc ####
data <- filterData(data, 1:24, minDepth = 10, maxDepth = 200, map = NULL)
#### EM (FWD-BACK) TO TRAIN PARAMETERS ####
#### Can use parallelization packages ####
K <- length(params$genotypeParams$alphaKHyper)
params$genotypeParams$alphaKHyper <- rep(500, K)
params$ploidyParams$phi_0 <- 1.5
convergeParams <- runEMclonalCN(data, params,
maxiter = 3, maxiterUpdate = 500,
txnExpLen = 1e15, txnZstrength = 5e5,
useOutlierState = FALSE,
normalEstimateMethod = "map",
estimateS = TRUE, estimatePloidy = TRUE)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.