DeMixT_GS: Estimates the proportions of mixed samples for each mixing...
In wwylab/DeMixT: Cell type-specific deconvolution of heterogeneous tumor samples with two or three components using expression data from RNAseq or microarray platforms

DeMixT_GS

R Documentation

Estimates the proportions of mixed samples for each mixing component using profile likelihood gene selection

Description

This function estimates the proportions of each mixing component in all mixed samples using a profile-likelihood-based gene selection method, which selects the most identifiable genes as the reference gene set to improve model fit. The method first calculates the Hessian matrix of the parameter space and then derives the confidence interval of the profile likelihood for each gene. The length of the confidence interval is then used as a metric to rank genes by identifiability. As a result, this gene selection approach can improve the estimation of tumor-specific transcript proportions.
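The ranking step described above can be illustrated with a small base R sketch. The confidence interval bounds, gene names, and `n_selected` are hypothetical, and the sketch assumes that a shorter interval indicates a more identifiable gene. It is not the package's internal code.

```r
# Hypothetical confidence interval bounds for five genes
# (illustrative values, not DeMixT output).
ci_lower <- c(geneA = 0.10, geneB = 0.25, geneC = 0.05, geneD = 0.30, geneE = 0.20)
ci_upper <- c(geneA = 0.50, geneB = 0.35, geneC = 0.95, geneD = 0.45, geneE = 0.80)

# Assumption: a shorter interval means a more identifiable gene.
ci_length <- ci_upper - ci_lower

# Rank genes from shortest to longest interval and keep the top n.
n_selected <- 2
ranked_genes <- names(sort(ci_length))
selected_genes <- head(ranked_genes, n_selected)
print(selected_genes)
```

In this toy input, the two genes with the narrowest intervals are kept as the reference set.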

Usage

DeMixT_GS(
  data.Y,
  data.N1,
  data.N2 = NULL,
  niter = 10,
  nbin = 50,
  if.filter = TRUE,
  filter.sd = 0.5,
  ngene.Profile.selected = NA,
  ngene.selected.for.pi = NA,
  mean.diff.in.CM = 0.25,
  nspikein = NULL,
  tol = 10^(-5),
  pi01 = NULL,
  pi02 = NULL,
  nthread = parallel::detectCores() - 1
)

Arguments

`data.Y`	A SummarizedExperiment object of expression data from mixed tumor samples. It is a `G` by `My` matrix where `G` is the number of genes and `My` is the number of mixed samples. Samples with the same tissue type should be placed together in columns.
`data.N1`	A SummarizedExperiment object of expression data from reference component 1 (e.g., normal). It is a `G` by `M1` matrix where `G` is the number of genes and `M1` is the number of samples for component 1.
`data.N2`	A SummarizedExperiment object of expression data from additional reference samples. It is a `G` by `M2` matrix where `G` is the number of genes and `M2` is the number of samples for component 2. Component 2 is needed only for running a three-component model.
`niter`	The maximum number of iterations used in the algorithm of iterated conditional modes. A larger value better guarantees the convergence in estimation but increases the running time. The default is 10.
`nbin`	The number of bins used in numerical integration for computing complete likelihood. A larger value increases accuracy in estimation but increases the running time, especially in a three-component deconvolution problem. The default is 50.
`if.filter`	The logical flag indicating whether a predetermined filter rule is used to select genes for proportion estimation. The default is TRUE.
`filter.sd`	The cut-off for the standard deviation of the lognormal distribution. Genes whose log-transformed standard deviation is smaller than the cut-off will be selected into the model. The default is 0.5.
`ngene.Profile.selected`	The number of genes used for proportion estimation ranked by profile likelihood. The default is `min(1500,0.1*G)`, where `G` is the number of genes.
`ngene.selected.for.pi`	The percentage or the number of genes used for proportion estimation. The difference between the expression levels from mixed tumor samples and the known component(s) is evaluated, and the most differentially expressed (DE) genes are selected. It is enabled when if.filter = TRUE. The default is `min(1500, 0.3*G)`, where `G` is the number of genes. Users can also try using more genes, ranging from `0.3*G` to `0.5*G`, and evaluate the outcome.
`mean.diff.in.CM`	Threshold of expression difference for selecting genes in the component merging strategy. We merge three-component to two-component by selecting genes with similar expressions for the two known components. Genes with the mean differences less than the threshold will be selected for component merging. It is used in the three-component setting, and is enabled when if.filter = TRUE. The default is 0.25.
`nspikein`	The number of spike-ins in the normal reference used for proportion estimation. The default value is `min(200, 0.3*My)`, where `My` is the number of mixed samples. If it is set to 0, proportion estimation is performed without any spike-in normal reference.
`tol`	The convergence criterion. The default is 10^(-5).
`pi01`	Initial proportion for the first known component. The default is `NULL`, in which case pi01 is generated randomly from a uniform distribution.
`pi02`	Initial proportion for the second known component. pi02 is needed only for running a three-component model. The default is `NULL`, in which case pi02 is generated randomly from a uniform distribution.
`nthread`	The number of threads used for deconvolution when OpenMP is available in the system. The default is the number of detected cores minus one (`parallel::detectCores() - 1`). In our no-OpenMP version, it is set to 1.
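The size-dependent defaults and the standard deviation filter described in the arguments can be worked through in base R. This is a sketch under stated assumptions: `G`, `My`, and the toy expression matrix are made-up values, and the use of `log2` is an assumption, since the log base is not stated here.

```r
# Hypothetical problem size (assumed values for illustration).
G  <- 20000  # number of genes
My <- 100    # number of mixed samples

# Defaults as stated in the argument descriptions.
ngene_profile_default <- min(1500, 0.1 * G)  # ngene.Profile.selected
ngene_pi_default      <- min(1500, 0.3 * G)  # ngene.selected.for.pi
nspikein_default      <- min(200, 0.3 * My)  # nspikein

# Toy expression matrix: three genes (rows) by four samples (columns).
expr <- rbind(
  g1 = c(100, 110, 90, 105),
  g2 = c(10, 200, 50, 1000),
  g3 = c(50, 52, 49, 51)
)

# Standard deviation of log-transformed expression per gene
# (log2 is an assumption made for this sketch).
log_sd <- apply(log2(expr), 1, sd)

# Keep genes whose log-scale standard deviation is below the cut-off.
filter_sd <- 0.5
keep <- log_sd < filter_sd
print(names(keep)[keep])
```

With these toy values, the highly variable gene `g2` is dropped and the two stable genes are retained.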

Value

`pi`	A matrix of estimated proportions. The first and second rows correspond to the proportion estimates for the known components and the unknown component, respectively, for two- or three-component settings, and each column corresponds to one sample.
`pi.iter`	Estimated proportions in each iteration. It is a `niter * My * p` array, where `p` is the number of components. This is enabled only when output.more.info = TRUE.
`gene.name`	The names of genes used in estimating the proportions. If no gene names are provided in the original data set, the genes will be automatically indexed.
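To make the layout of `pi` concrete, the sketch below builds a mock matrix for a two-component setting and reads proportions from it. The values, row labels, and sample names are hypothetical, and the row order follows the description above as an assumption. It does not call the package.

```r
# Mock 2 x 3 matrix standing in for the `pi` element of a result
# (hypothetical values; row names are illustrative labels only).
pi_mock <- matrix(
  c(0.30, 0.70,
    0.55, 0.45,
    0.10, 0.90),
  nrow = 2,
  dimnames = list(c("known", "unknown"), c("S1", "S2", "S3"))
)

# Each column describes one sample; in this mock the two
# proportions of a sample sum to one.
col_totals <- colSums(pi_mock)

# Samples in which the unknown component makes up the majority.
majority_unknown <- colnames(pi_mock)[pi_mock["unknown", ] > 0.5]
print(majority_unknown)
```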

Note

A Hessian matrix file will be created in the working directory, and the corresponding Hessian matrix, with a name encoded from the mixed tumor sample data, will be saved under this file. If a user reruns this function with the same dataset, this Hessian matrix will be loaded in place of rerunning the profile likelihood method, which reduces running time.
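The compute-once, reload-later behavior described in this note follows a common caching pattern, sketched below in base R. The helper `cached_compute`, the file name, the RDS format, and the toy matrix are all hypothetical. The package's actual file naming and storage format are not shown here.

```r
# Hypothetical cache helper: compute once, then reuse the saved result.
cached_compute <- function(cache_file, compute_fn) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))  # reuse the stored result
  }
  result <- compute_fn()
  dir.create(dirname(cache_file), showWarnings = FALSE, recursive = TRUE)
  saveRDS(result, cache_file)
  result
}

# Use a temporary directory so the demo leaves the working directory alone.
cache_file <- file.path(tempdir(), "hessian_cache_demo", "toy_dataset.rds")
unlink(cache_file)  # start from a clean state

# Stand-in for an expensive computation; counts how often it runs.
n_calls <- 0
toy_hessian <- function() {
  n_calls <<- n_calls + 1
  diag(2)
}

first  <- cached_compute(cache_file, toy_hessian)  # computes and saves
second <- cached_compute(cache_file, toy_hessian)  # loads from disk
print(n_calls)
```

The second call returns the stored matrix without invoking the computation again, which is the source of the time saving on reruns.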

Author(s)

Shaolong Cao, Zeya Wang, Wenyi Wang

References

Gene Selection and Identifiability Analysis of RNA Deconvolution Models using Profile Likelihood. Manuscript in preparation.

Examples


# Example 1: estimate proportions for simulated two-component data 
# with spike-in normal reference
  data(test.data.2comp)
# res.GS = DeMixT_GS(data.Y = test.data.2comp$data.Y, 
#                    data.N1 = test.data.2comp$data.N1,
#                    niter = 10, nbin = 50, nspikein = 50,
#                    if.filter = TRUE, ngene.Profile.selected = 150,
#                    mean.diff.in.CM = 0.25, ngene.selected.for.pi = 150,
#                    tol = 10^(-5))
#
# Example 2: estimate proportions for simulated two-component data 
# without spike-in normal reference
# data(test.data.2comp)
# res.GS = DeMixT_GS(data.Y = test.data.2comp$data.Y, 
#                    data.N1 = test.data.2comp$data.N1,
#                    niter = 10, nbin = 50, nspikein = 0,
#                    if.filter = TRUE, ngene.Profile.selected = 150,
#                    mean.diff.in.CM = 0.25, ngene.selected.for.pi = 150,
#                    tol = 10^(-5))
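Because the inputs are expected to be SummarizedExperiment objects, the following sketch shows one way to wrap a plain gene-by-sample matrix. The toy values and the assay name `counts` are assumptions, and the wrapping step runs only if the Bioconductor package SummarizedExperiment is installed.

```r
# Toy matrix: 4 genes (rows) by 3 mixed samples (columns).
toy_counts <- matrix(
  c(10, 20, 30, 40,
    12, 18, 33, 41,
    9, 25, 28, 39),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Wrap the matrix only when the Bioconductor package is available.
# The assay name "counts" is an assumption made for this sketch.
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  toy_se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(counts = toy_counts)
  )
  print(dim(toy_se))
}
```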

wwylab/DeMixT documentation built on July 17, 2024, 9:14 p.m.
