CytoDx is a method that predicts clinical outcomes from single-cell data without requiring cell gating. It first predicts the association between each cell and the outcome using a linear statistical model (Figure 1). The cell-level predictions are then averaged within each sample to form the sample-level predictor. A second model is then used to make predictions at the sample level (Figure 1). Compared to traditional gating-based methods, CytoDx has multiple advantages.
In section 2, we demonstrate how to install CytoDx.
In section 3, we show an example where CytoDx is used to diagnose acute myeloid leukemia (AML). In addition to performing diagnosis, we also show that CytoDx can be used to identify the cell subsets that are associated with AML.
knitr::include_graphics("tssm_intro.jpg")
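The two-stage scheme described in the introduction can be sketched in base R. This is only an illustration under stated assumptions, not the CytoDx implementation: the data are simulated, the marker names are made up, and a continuous outcome with ordinary least squares is used in both stages to keep the example short.

```r
set.seed(1)

# Simulated data (hypothetical): 8 samples with 200 cells each.
n_samples <- 8
n_cells <- 200
sample_id <- rep(seq_len(n_samples), each = n_cells)
sample_y <- seq(-1, 1, length.out = n_samples)  # made-up continuous outcome
cell_y <- sample_y[sample_id]                   # each cell inherits its sample's outcome

# marker1 shifts with the outcome; marker2 is pure noise.
cells <- data.frame(
  marker1 = rnorm(n_samples * n_cells, mean = cell_y),
  marker2 = rnorm(n_samples * n_cells)
)

# Stage 1: cell-level linear model relating each cell to the outcome.
cell_fit <- lm(cell_y ~ marker1 + marker2, data = cells)
cell_score <- fitted(cell_fit)

# Average the cell-level predictions within each sample.
sample_score <- as.numeric(tapply(cell_score, sample_id, mean))

# Stage 2: sample-level model on the averaged score.
sample_fit <- lm(sample_y ~ sample_score)
sample_pred <- fitted(sample_fit)
```

No gating is involved at any point: every cell contributes to its sample's score through the cell-level model.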
You can install the stable version of CytoDx from Bioconductor:
BiocManager::install("CytoDx")
You can also install the latest development version of CytoDx from GitHub:
devtools::install_github("hzc363/CytoDx")
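Both commands assume that the helper packages (BiocManager and devtools) are already installed. A minimal sketch of a guard for that is shown here; `ensure_installed` is a hypothetical helper name, not part of CytoDx, and the install calls are left as comments because they need network access.

```r
# Hypothetical helper: install a CRAN package only if it is missing,
# then report whether it can be loaded.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (commented out because it downloads packages):
# ensure_installed("BiocManager")
# BiocManager::install("CytoDx")
# ensure_installed("devtools")
# devtools::install_github("hzc363/CytoDx")
```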
In this example, we build a CytoDx model to diagnose acute myeloid leukemia (AML) using flow cytometry data. We train the model using data from 5 AML patients and 5 controls, and evaluate its performance on a separate test dataset.
The CytoDx R package contains the FCS files and the ground truth labels (AML or normal) needed for this example. We first load the ground truth.
library(CytoDx)
# Find data in the CytoDx package
path <- system.file("extdata", package = "CytoDx")
# Read the ground truth
fcs_info <- read.csv(file.path(path, "fcs_info.csv"))
# Print out the ground truth
knitr::kable(fcs_info)
We then read the cytometry data for training samples using the fcs2DF function.
# Find the training data
train_info <- subset(fcs_info, fcs_info$dataset == "train")
# Specify the path to the cytometry files
fn <- file.path(path, train_info$fcsName)
# Read cytometry files using the fcs2DF function
train_data <- fcs2DF(fcsFiles = fn,
                     y = train_info$Label,
                     assay = "FCM",
                     b = 1/150,
                     excludeTransformParameters =
                       c("FSC-A", "FSC-W", "FSC-H", "Time"))
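The text does not explain the `b = 1/150` argument. Assuming it is the scale factor of an arcsinh transformation, f(x) = asinh(b * x), applied to every channel not listed in `excludeTransformParameters` (please confirm against `?fcs2DF`), the effect on raw intensities would look like the sketch below. The intensity values are made up for illustration.

```r
# Hypothetical raw fluorescence intensities spanning several orders of magnitude.
raw <- c(-50, 0, 150, 1500, 15000, 150000)

# Assumed role of b: scale factor inside an arcsinh transformation.
b <- 1 / 150
transformed <- asinh(b * raw)

# The transformation is roughly linear near zero and logarithmic for large
# values, and it stays defined for zero and negative intensities.
print(round(transformed, 2))
```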
CytoDx is flexible with respect to data transformations. For example, it can be applied to rank-transformed data to reduce batch effects. Here, we transform the original data to rank data.
# Perform rank transformation
x_train <- pRank(x = train_data[, 1:7], xSample = train_data$xSample)
# Convert the data frame into a matrix. Here we include the 2-way interactions.
x_train <- model.matrix(~.*., x_train)
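To see why a within-sample rank transformation can reduce batch effects, and what `model.matrix(~.*., ...)` produces, consider this small base R sketch. It assumes a plain percentile rank (rank divided by the number of cells in the sample); the exact definition used by `pRank` is not shown in this document. The marker names and values are hypothetical.

```r
# Hypothetical cell-level data: two markers measured in two samples,
# where sample "B" carries a batch shift of +100 on both markers.
cells <- data.frame(
  CD3 = c(1, 5, 3, 101, 105, 103),
  CD4 = c(2, 8, 4, 102, 108, 104)
)
sample_id <- c("A", "A", "A", "B", "B", "B")

# Percentile rank within each sample (assumed definition).
percentile_rank <- function(x, group) {
  ave(x, group, FUN = function(v) rank(v) / length(v))
}
ranked <- as.data.frame(lapply(cells, percentile_rank, group = sample_id))

# After ranking, the batch shift is gone: both samples have identical values.
print(ranked)

# Main effects plus all 2-way interactions.
design <- model.matrix(~ . * ., ranked)
print(colnames(design))
```

Because ranks depend only on the ordering of cells within a sample, any monotone shift or rescaling that affects a whole sample leaves the ranked data unchanged.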
We use the training data to build a predictive model.
# Build the predictive model using the CytoDx.fit function
fit <- CytoDx.fit(x = x_train,
                  y = (train_data$y == "aml"),
                  xSample = train_data$xSample,
                  family = "binomial",
                  reg = FALSE)
We first load and rank transform the test data.
# Find the testing data
test_info <- subset(fcs_info, fcs_info$dataset == "test")
# Specify the path to the cytometry files
fn <- file.path(path, test_info$fcsName)
# Read cytometry files using the fcs2DF function
test_data <- fcs2DF(fcsFiles = fn,
                    y = NULL,
                    assay = "FCM",
                    b = 1/150,
                    excludeTransformParameters =
                      c("FSC-A", "FSC-W", "FSC-H", "Time"))
# Perform rank transformation
x_test <- pRank(x = test_data[, 1:7], xSample = test_data$xSample)
# Convert the data frame into a matrix. Here we include the 2-way interactions.
x_test <- model.matrix(~.*., x_test)
We use the fitted CytoDx model to predict AML.
# Predict AML using the CytoDx.pred function
pred <- CytoDx.pred(fit, xNew = x_test, xSampleNew = test_data$xSample)
We plot the predictions. In this example, CytoDx classifies the test samples as AML or normal perfectly.
# Combine prediction and truth
result <- data.frame("Truth" = test_info$Label,
                     "Prob" = pred$xNew.Pred.sample$y.Pred.s0)
# Plot the prediction
stripchart(result$Prob ~ result$Truth,
           jitter = 0.1,
           vertical = TRUE,
           method = "jitter",
           pch = 20,
           xlab = "Truth", ylab = "Predicted Prob of AML")
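Besides plotting, the agreement between predictions and truth can be summarized numerically. The sketch below uses a made-up table with the same two columns as `result` (the values are illustrative, not output from CytoDx) and an arbitrary cutoff of 0.5 on the predicted probability.

```r
# Hypothetical results table with columns Truth and Prob.
result_demo <- data.frame(
  Truth = c("aml", "aml", "normal", "normal", "normal"),
  Prob  = c(0.92, 0.71, 0.10, 0.35, 0.04)
)

# Call a sample AML when the predicted probability exceeds 0.5
# (an arbitrary cutoff chosen for illustration).
result_demo$Call <- ifelse(result_demo$Prob > 0.5, "aml", "normal")

# Confusion table and overall accuracy.
confusion <- table(Truth = result_demo$Truth, Call = result_demo$Call)
accuracy <- mean(result_demo$Truth == result_demo$Call)
print(confusion)
print(accuracy)
```

With only a handful of test samples, a perfect score carries wide uncertainty, so the table is best read alongside the strip chart rather than as a performance estimate.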
We use a decision tree to find cell subsets that are associated with AML. In this step, the original cytometry data should be used rather than the ranked data.
# Use a decision tree to find the cell subsets that are associated with AML.
TG <- treeGate(P = fit$train.Data.cell$y.Pred.s0,
               x = train_data[, 1:7])
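The idea behind this step can be illustrated with a generic regression tree. The sketch below is not the `treeGate` implementation, whose internals are not shown in this document. It fits a shallow tree with the rpart package (a recommended package shipped with most R installations) to simulated cell-level scores. The marker names, thresholds, and scores are all hypothetical.

```r
library(rpart)

set.seed(2)

# Hypothetical cells: the cell-level score is high only when
# markerA is high and markerB is low.
n <- 1000
cells <- data.frame(markerA = runif(n), markerB = runif(n))
score <- ifelse(cells$markerA > 0.7 & cells$markerB < 0.4, 1, 0) +
  rnorm(n, sd = 0.1)

# Regress the cell-level score on the untransformed markers with a shallow tree.
tree <- rpart(score ~ markerA + markerB, data = cells,
              control = rpart.control(maxdepth = 2))

# Each leaf is a cell subset defined by marker thresholds;
# summarize the mean score of the cells in each leaf.
leaf_means <- tapply(score, tree$where, mean)
print(tree)
print(leaf_means)
```

The leaf with the highest mean score corresponds to a gate-like region in marker space. Fitting the tree on untransformed marker values keeps the split thresholds on the original measurement scale, which is easier to interpret than thresholds on ranks.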
sessionInfo()