xCell2Train: xCell2Train function

xCell2Train R Documentation

xCell2Train function

Description

This function generates a custom xCell2 reference object for cell type enrichment analysis.

Usage

xCell2Train(
  ref,
  mix = NULL,
  labels = NULL,
  refType,
  lineageFile = NULL,
  numThreads = 1,
  useOntology = TRUE,
  returnSignatures = FALSE,
  returnAnalysis = FALSE,
  useSpillover = TRUE,
  spilloverAlpha = 0.5,
  minPbCells = 30,
  minPbSamples = 10,
  minScGenes = 10000,
  topSpillValue = 0.5
)

Arguments

ref

A reference gene expression matrix (with genes in rows and samples/cells in columns), a SummarizedExperiment object, or a SingleCellExperiment object containing the expression data and sample metadata. If a SummarizedExperiment or SingleCellExperiment object is provided, the expression matrix should be stored in the "counts" slot of the 'assays' component, and the sample metadata (equivalent to the "labels" parameter) should be stored in 'colData'.

mix

A bulk mixture of gene expression data (genes in rows, samples in columns) (optional).
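Because both ref and mix are expected to have genes in rows, it can be useful to check how many gene identifiers the two matrices share before training. The following is a minimal base-R sketch; the toy matrices and the helper name `sharedGenes` are hypothetical and not part of xCell2.

```r
# Hypothetical helper (not part of xCell2): genes present in both matrices.
sharedGenes <- function(ref, mix) {
  stopifnot(!is.null(rownames(ref)), !is.null(rownames(mix)))
  intersect(rownames(ref), rownames(mix))
}

# Toy data: genes in rows, samples in columns.
toy_ref <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("CD3E", "CD19", "NKG7", "LYZ"), paste0("ref", 1:3))
)
toy_mix <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("CD19", "LYZ", "GAPDH"), paste0("mix", 1:2))
)

common <- sharedGenes(toy_ref, toy_mix)

# Fraction of mixture genes that also appear in the reference.
overlap_fraction <- length(common) / nrow(toy_mix)
```

A low overlap fraction usually points to mismatched gene identifiers (for example, symbols in one matrix and Ensembl IDs in the other).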

labels

A data frame in which each row corresponds to a sample in ref. The data frame must have four columns:
"ont": the cell type ontology ID as a character (e.g., "CL:0000545", or NA if there is no ontology).
"label": the cell type name as a character (e.g., "T-helper 1 cell").
"sample": the sample/cell name, matching the corresponding column name in ref.
"dataset": the sample's source dataset or subject (can be the same for all samples if no such information is available).
This parameter is not needed if ref is a SummarizedExperiment or SingleCellExperiment object, as the labels should already be included in 'colData'.
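As an illustration of the four required columns, the sketch below builds a small labels data frame for a toy reference and checks it against the reference's column names. The toy data and the helper `checkLabels` are hypothetical; xCell2 may perform its own, different validation.

```r
# Toy reference: 3 genes x 4 samples (genes in rows, samples in columns).
toy_ref <- matrix(
  c(5, 0, 2, 4, 1, 3, 0, 7, 1, 1, 6, 0),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3", "s4"))
)

# One row per reference sample, with the four required columns.
toy_labels <- data.frame(
  ont = c("CL:0000236", "CL:0000236", "CL:0000576", NA), # NA: no ontology ID
  label = c("B cells", "B cells", "Monocytes", "Unknown"),
  sample = colnames(toy_ref),
  dataset = "toy",
  stringsAsFactors = FALSE
)

# Hypothetical pre-flight check (not part of xCell2).
checkLabels <- function(ref, labels) {
  required <- c("ont", "label", "sample", "dataset")
  missing_cols <- setdiff(required, colnames(labels))
  if (length(missing_cols) > 0) {
    stop("labels is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(labels$sample) > 0) {
    stop("labels$sample contains duplicated names")
  }
  if (!setequal(labels$sample, colnames(ref))) {
    stop("labels$sample must match the column names of ref")
  }
  invisible(TRUE)
}

labels_ok <- checkLabels(toy_ref, toy_labels)
```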

refType

The reference gene expression data type: "rnaseq" for bulk RNA-Seq, "array" for microarray, or "sc" for scRNA-Seq.

lineageFile

Path to the cell type lineage file generated with the 'xCell2GetLineage' function and reviewed manually (optional).

numThreads

Number of threads for parallel processing (default: 1).
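One common way to pick a thread count is to leave one core free for the rest of the system. The sketch below uses the `parallel` package that ships with R; the variable name `num_threads` is only illustrative, and the "all cores minus one" rule is a convention, not an xCell2 recommendation.

```r
# detectCores() can return NA on some platforms, so fall back to 1 thread.
n_cores <- parallel::detectCores()
num_threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)
```

The resulting value could then be passed as `numThreads = num_threads`.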

useOntology

A Boolean for considering cell type dependencies by using ontological integration (default: TRUE).

returnSignatures

A Boolean to return just the signatures (default: FALSE).

returnAnalysis

A Boolean to return the xCell2Analysis results instead of the reference object (default: FALSE).

useSpillover

A Boolean to use spillover correction in xCell2Analysis (returnAnalysis must be TRUE) (default: TRUE).

spilloverAlpha

A numeric for spillover alpha value in xCell2Analysis (returnAnalysis must be TRUE) (default: 0.5).

minPbCells

For scRNA-Seq reference only - minimum number of cells in the pseudo-bulk (optional, default: 30).

minPbSamples

For scRNA-Seq reference only - minimum number of pseudo-bulk samples (optional, default: 10).
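To make the two thresholds above more concrete, here is a minimal base-R sketch of one way pseudo-bulk samples can be built by randomly sampling cells of a single cell type. It is an illustration under stated assumptions, not xCell2's implementation: the helper `makePseudoBulk`, the choice to sum counts across the sampled cells, and sampling with replacement when too few cells are available are all assumptions.

```r
# Hypothetical helper (not xCell2's code): build pseudo-bulk samples for one
# cell type by summing the counts of randomly sampled cells.
makePseudoBulk <- function(counts, n_samples = 10, n_cells = 30) {
  stopifnot(is.matrix(counts), ncol(counts) >= 1)
  pb <- vapply(seq_len(n_samples), function(i) {
    # Assumption: sample with replacement only if fewer cells than requested.
    idx <- sample(ncol(counts), n_cells, replace = ncol(counts) < n_cells)
    rowSums(counts[, idx, drop = FALSE])
  }, numeric(nrow(counts)))
  # Always return a genes x pseudo-bulk-samples matrix.
  matrix(
    pb,
    nrow = nrow(counts),
    dimnames = list(rownames(counts), paste0("pb", seq_len(n_samples)))
  )
}

# Toy single-cell counts: 5 genes x 50 cells of one cell type.
set.seed(123)
toy_counts <- matrix(
  rpois(5 * 50, lambda = 2),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:50))
)

pb <- makePseudoBulk(toy_counts, n_samples = 10, n_cells = 30)
dim(pb) # genes x pseudo-bulk samples
```

In this sketch, `n_cells` plays the role of minPbCells and `n_samples` the role of minPbSamples. Because the sampling is random, setting a seed beforehand makes the result reproducible.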

minScGenes

For scRNA-Seq reference only - minimum number of genes for pseudo-bulk samples (default: 10000).

topSpillValue

Maximum spillover compensation correction value (default: 0.5).

Value

An S4 object containing cell types' signatures, linear transformation parameters, spillover matrix and dependencies.

Examples

# For a detailed example, see the xCell2 vignette.

# Extract reference matrix
data(dice_demo_ref, package = "xCell2")
dice_ref <- as.matrix(SummarizedExperiment::assay(dice_demo_ref, "logcounts"))
colnames(dice_ref) <- make.unique(colnames(dice_ref)) # Make sample names unique

# Extract reference metadata
dice_labels <- as.data.frame(SummarizedExperiment::colData(dice_demo_ref))

# Prepare labels data frame
dice_labels$ont <- NA
dice_labels$sample <- colnames(dice_ref)
dice_labels$dataset <- "DICE"

# Assign cell type ontology (optional but recommended)
dice_labels$ont[dice_labels$label == "B cells"] <- "CL:0000236"
dice_labels$ont[dice_labels$label == "Monocytes"] <- "CL:0000576"
dice_labels$ont[dice_labels$label == "NK cells"] <- "CL:0000623"
dice_labels$ont[dice_labels$label == "T cells, CD8+"] <- "CL:0000625"
dice_labels$ont[dice_labels$label == "T cells, CD4+"] <- "CL:0000624"
dice_labels$ont[dice_labels$label == "T cells, CD4+, memory"] <- "CL:0000897"

# Reproducibility (optional): set a seed before running `xCell2Train`, because generating
# pseudo-bulk samples from an scRNA-Seq reference relies on random sampling of cells.
set.seed(123)

# Generate custom xCell2 reference object
DICE.xCell2Ref <- xCell2::xCell2Train(ref = dice_ref, labels = dice_labels, refType = "rnaseq")


AlmogAngel/xCell2 documentation built on Oct. 14, 2024, 4:51 a.m.