est_power_distribution: est_power_distribution
In RnaSeqSampleSize: RnaSeqSampleSize

View source: R/distribution.R

A function to estimate the power for differential expression analysis of RNA-seq data.

est_power_distribution(
  n,
  f = 0.1,
  m = 10000,
  m1 = 100,
  w = 1,
  rho = 2,
  repNumber = 100,
  dispersionDigits = 1,
  distributionObject,
  libSize,
  minAveCount = 5,
  maxAveCount = 2000,
  selectedGenes,
  pathway,
  species = "hsa",
  storeProcess = FALSE,
  countFilterInRawDistribution = TRUE,
  selectedGeneFilterByCount = FALSE,
  removedGene0Power = TRUE
)

`n`	Number of samples.
`f`	FDR level.
`m`	Total number of genes for testing.
`m1`	Expected number of prognostic genes.
`w`	Ratio of normalization factors between two groups.
`rho`	Minimum fold change for prognostic genes between two groups.
`repNumber`	Number of genes used in estimation of read counts and dispersion distribution.
`dispersionDigits`	Digits of dispersion.
`distributionObject`	A DGEList object generated by the est_count_dispersion function. The RnaSeqSampleSizeData package contains 13 datasets from TCGA; to use one of them, set distributionObject to any one of "TCGA_BLCA","TCGA_BRCA","TCGA_CESC","TCGA_COAD","TCGA_HNSC","TCGA_KIRC","TCGA_LGG","TCGA_LUAD","TCGA_LUSC","TCGA_PRAD","TCGA_READ","TCGA_THCA","TCGA_UCEC".
`libSize`	Numeric vector giving the total count for each sample. If not specified, the library sizes in distributionObject will be used.
`minAveCount`	Minimal average read count for each gene. Genes with smaller read counts will not be used.
`maxAveCount`	Maximal average read count for each gene. Genes with larger read counts will be taken as maxAveCount.
`selectedGenes`	Optional. Names of the genes of interest. Only the read counts and dispersion distribution for these genes will be used in power estimation.
`pathway`	Optional. ID of the KEGG pathway of interest. Only the read counts and dispersion distribution for genes in this pathway will be used in power estimation.
`species`	Optional. Species of the KEGG pathway of interest.
`storeProcess`	Logical. Whether to store the power and n from the sample size or power estimation process.
`countFilterInRawDistribution`	Logical. Whether the count filter is applied to the raw count distribution. If not, the count filter is applied to the libSize-scaled count distribution.
`selectedGeneFilterByCount`	Logical. Whether the count filter is applied to the selected genes when the selectedGenes parameter is used.
`removedGene0Power`	Logical. When selectedGenes or pathway is used, some genes may have a read count less than minAveCount and will be removed by the count filter. This parameter indicates whether they are counted as having 0 power in power estimation. If not, they are excluded from power estimation.
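
The count filter described by minAveCount and maxAveCount can be pictured with a few lines of base R. This is only a sketch of the behaviour as the argument descriptions word it, not the package's internal code: the gene names and counts are made up, and treating a count exactly equal to minAveCount as kept is an assumption.

```r
# Hypothetical average read counts per gene (illustrative values only)
ave_count <- c(geneA = 2, geneB = 50, geneC = 5000)

min_ave_count <- 5     # mirrors the minAveCount default
max_ave_count <- 2000  # mirrors the maxAveCount default

# Genes below the minimum are dropped (assumption: ">=" keeps the boundary)
kept <- ave_count[ave_count >= min_ave_count]

# Genes above the maximum are kept but their count is capped at the maximum
kept <- pmin(kept, max_ave_count)

print(kept)
```

In this toy input geneA is dropped, geneB is unchanged, and geneC is capped at 2000.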

Average power, or a list including the count, distribution, and power for each gene.
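
To see how removedGene0Power can change an average power, the arithmetic can be sketched in base R. The per-gene powers and the number of filtered genes below are invented for illustration, and the averaging shown is an assumption about what "used as 0 power" means, not the package's actual implementation.

```r
# Hypothetical per-gene power for genes that passed the count filter
power_kept <- c(0.75, 0.5)

# Hypothetical number of selected genes removed by the count filter
n_removed <- 2

# removedGene0Power = TRUE: removed genes enter the average with power 0
avg_with_zero <- mean(c(power_kept, rep(0, n_removed)))

# removedGene0Power = FALSE: removed genes are left out of the average
avg_without <- mean(power_kept)

print(c(with_zero = avg_with_zero, without = avg_without))
```

Counting the removed genes as 0 power lowers the average, so the choice matters most when many selected genes fall below minAveCount.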

library(RnaSeqSampleSize)

# Note: repNumber is set very small (2) here to make the example code run faster.
# We suggest setting repNumber to at least 100 in real analyses.
est_power_distribution(n = 65, f = 0.01, rho = 2,
  distributionObject = "TCGA_READ", repNumber = 2)

# Power estimation based on selected genes of interest.
# storeProcess = TRUE returns the details for all selected genes.
selectedGenes <- names(TCGA_READ$pseudo.counts.mean)[c(1, 3, 5, 7, 9, 12:30)]
powerDistribution <- est_power_distribution(n = 65, f = 0.01, rho = 2,
  distributionObject = "TCGA_READ", selectedGenes = selectedGenes,
  minAveCount = 1, storeProcess = TRUE, repNumber = 2)
str(powerDistribution)
mean(powerDistribution$power)

# Power estimation based on genes in a KEGG pathway of interest
powerDistribution <- est_power_distribution(n = 65, f = 0.01, rho = 2,
  distributionObject = "TCGA_READ", pathway = "00010",
  minAveCount = 1, storeProcess = TRUE, repNumber = 2)
mean(powerDistribution$power)