View source: R/TCGA_Download_Preprocess.R
TCGA_Preprocess_GeneExpression | R Documentation |
Pre-processes gene expression data from TCGA.
TCGA_Preprocess_GeneExpression(
  CancerSite,
  MAdirectories,
  mode = "Regular",
  doBatchCorrection = FALSE,
  batch.correction.method = "Seurat",
  MissingValueThresholdGene = 0.3,
  MissingValueThresholdSample = 0.1,
  cores = 1
)
CancerSite |
character string indicating the TCGA cancer code. |
MAdirectories |
character vector of directories containing the downloaded data. It can be the object returned by the TCGA_Download_GeneExpression function. |
mode |
character string indicating the type of genes in the gene expression data. Must be one of 'Regular', 'Enhancer', 'miRNA' or 'lncRNA', and should match the mode passed to the TCGA_Download_GeneExpression function. Default: 'Regular'. |
doBatchCorrection |
logical indicating whether to perform batch effect correction. Default: FALSE. |
batch.correction.method |
character string indicating the batch correction method. Must be either 'Seurat' or 'Combat'. Default: 'Seurat'. Seurat is much faster than Combat. |
MissingValueThresholdGene |
threshold for missing values per gene. Genes with a proportion of NAs greater than this threshold are removed. Default: 0.3. |
MissingValueThresholdSample |
threshold for missing values per sample. Samples with a proportion of NAs greater than this threshold are removed. Default: 0.1. |
cores |
integer indicating the number of cores to use when performing batch correction with Combat. Default: 1.
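The effect of the two missing-value thresholds can be illustrated with a small base-R sketch. This is not the package's internal code: the toy matrix, the variable names, and the order of the two steps (genes first, then samples) are assumptions made for illustration only.

```r
# Toy expression matrix: 4 genes (rows) x 5 samples (columns).
expr <- matrix(
  c(1.2, NA,  0.8, 1.1, 0.9,
    NA,  NA,  0.5, 0.7, 0.6,
    2.0, 2.1, 1.9, 2.2, 2.3,
    0.3, 0.4, 0.2, 0.1, 0.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:5))
)

# Hypothetical names standing in for MissingValueThresholdGene and
# MissingValueThresholdSample.
gene_threshold <- 0.3
sample_threshold <- 0.1

# Proportion of NAs per gene; drop genes whose proportion exceeds the threshold.
gene_na <- rowMeans(is.na(expr))
expr <- expr[gene_na <= gene_threshold, , drop = FALSE]

# Proportion of NAs per sample (computed on the remaining genes);
# drop samples whose proportion exceeds the threshold.
sample_na <- colMeans(is.na(expr))
expr <- expr[, sample_na <= sample_threshold, drop = FALSE]

print(dim(expr))
```

In this toy case gene2 (2 of 5 values missing) and sample2 (1 of the 3 remaining values missing) are removed, leaving a matrix without NAs.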
Pre-processing includes removing samples and genes with too many NAs, imputing the remaining NAs, and, if requested, performing batch correction. If the row names of the gene expression data are Ensembl ENSG or ENST identifiers, the function converts them to human gene symbols (HGNC).
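The identifier conversion can be pictured with the following base-R sketch. It only illustrates the idea of replacing Ensembl row names with gene symbols. The lookup table `id_map`, the version suffixes, the placeholder ID `ENSG00000000000`, and the choice to drop unmapped rows are all assumptions, and the function itself may handle these cases differently.

```r
# Hypothetical lookup table from Ensembl gene IDs to HGNC symbols.
id_map <- c(
  ENSG00000141510 = "TP53",
  ENSG00000012048 = "BRCA1"
)

# Toy expression matrix with versioned Ensembl IDs as row names.
# The version suffixes and the third ID are illustrative placeholders.
expr <- matrix(
  c(5.1, 4.8,
    3.2, 3.5,
    1.0, 1.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("ENSG00000141510.17", "ENSG00000012048.23", "ENSG00000000000.1"),
    c("sample1", "sample2")
  )
)

# Strip the version suffix (".17", ".23", ...) before looking up symbols.
ensembl_ids <- sub("\\.[0-9]+$", "", rownames(expr))
symbols <- unname(id_map[ensembl_ids])

# Assumption: rows without a known symbol are dropped.
keep <- !is.na(symbols)
expr <- expr[keep, , drop = FALSE]
rownames(expr) <- symbols[keep]

print(rownames(expr))
```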
pre-processed gene expression data matrix.
# Example #1: Preprocessing gene expression for Regular mode
GEdirectories <- TCGA_Download_GeneExpression(CancerSite = 'OV',
                                              TargetDirectory = tempdir())
GEProcessedData <- TCGA_Preprocess_GeneExpression(CancerSite = 'OV',
                                                  MAdirectories = GEdirectories)
# Example #2: Preprocessing gene expression for miRNA mode
GEdirectories <- TCGA_Download_GeneExpression(CancerSite = 'OV',
                                              TargetDirectory = tempdir(),
                                              mode = 'miRNA')
GEProcessedData <- TCGA_Preprocess_GeneExpression(CancerSite = 'OV',
                                                  MAdirectories = GEdirectories,
                                                  mode = 'miRNA')
# Example #3: Preprocessing gene expression for lncRNA mode
GEdirectories <- TCGA_Download_GeneExpression(CancerSite = 'OV',
                                              TargetDirectory = tempdir(),
                                              mode = 'lncRNA')
GEProcessedData <- TCGA_Preprocess_GeneExpression(CancerSite = 'OV',
                                                  MAdirectories = GEdirectories,
                                                  mode = 'lncRNA')