normalize_scope_foreach: Normalization of read depth with latent factors using...

View source: R/normalize_scope_foreach.R

normalize_scope_foreachR Documentation

Normalization of read depth with latent factors using Expectation-Maximization algorithm under the case-control setting in parallel

Description

Fit a Poisson generalized linear model to normalize the raw read depth data from single-cell DNA sequencing, with latent factors under the case-control setting. Model GC content bias using an expectation-maximization algorithm, which accounts for the different copy number states.

Usage

normalize_scope_foreach(Y_qc, gc_qc, K, norm_index, T,
    ploidyInt, beta0, minCountQC = 20, nCores = NULL)

Arguments

Y_qc

read depth matrix after quality control

gc_qc

vector of GC content for each bin after quality control

K

Number of latent Poisson factors

norm_index

indices of normal/diploid cells

T

a vector of integers indicating number of CNV groups. Use BIC to select optimal number of CNV groups. If T = 1, assume all reads are from normal regions so that EM algorithm is not implemented. Otherwise, we assume there is always a CNV group of heterozygous deletion and a group of null region. The rest groups are representative of different duplication states.

ploidyInt

a vector of initialized ploidy return from initialize_ploidy. Users are also allowed to provide prior-knowledge ploidies as the input and to manually tune a few cells that have poor fitting

beta0

a vector of initialized bin-specific biases returned from CODEX2 without latent factors

minCountQC

the minimum read coverage required for normalization and EM fitting. Defalut is 20

nCores

number of cores to use. If NULL, number of cores is detected. Default is NULL.

Value

A list with components

Yhat

A list of normalized read depth matrix with EM

alpha.hat

A list of absolute copy number matrix

fGC.hat

A list of EM estimated GC content bias matrix

beta.hat

A list of EM estimated bin-specific bias vector

g.hat

A list of estimated Poisson latent factor

h.hat

A list of estimated Poisson latent factor

AIC

AIC for model selection

BIC

BIC for model selection

RSS

RSS for model selection

K

Number of latent Poisson factors

Author(s)

Rujin Wang rujin@email.unc.edu

Examples

Gini <- get_gini(Y_sim)

# first-pass CODEX2 run with no latent factors
normObj.sim <- normalize_codex2_ns_noK(Y_qc = Y_sim,
                                        gc_qc = ref_sim$gc,
                                        norm_index = which(Gini<=0.12))
Yhat.noK.sim <- normObj.sim$Yhat
beta.hat.noK.sim <- normObj.sim$beta.hat
fGC.hat.noK.sim <- normObj.sim$fGC.hat
N.sim <- normObj.sim$N

# Ploidy initialization
ploidy.sim <- initialize_ploidy(Y = Y_sim,
                            Yhat = Yhat.noK.sim,
                            ref = ref_sim)
ploidy.sim

# Specify nCores = 2 only for checking examples
normObj.scope.sim <- normalize_scope_foreach(Y_qc = Y_sim, 
                        gc_qc = ref_sim$gc,
                        K = 1, ploidyInt = ploidy.sim,
                        norm_index = which(Gini<=0.12), T = 1:5,
                        beta0 = beta.hat.noK.sim, nCores = 2)
Yhat.sim <- normObj.scope.sim$Yhat[[which.max(normObj.scope.sim$BIC)]]
fGC.hat.sim <- normObj.scope.sim$fGC.hat[[which.max(normObj.scope.sim$BIC)]]


rujinwang/SCOPE documentation built on Jan. 1, 2023, 5:40 a.m.