View source: R/designSampleSizeClassificationPlots.R
Illustrates the mean classification accuracy and protein importance under different sample sizes through a predictive accuracy plot and a protein importance plot.
data |
A list of outputs from function |
optimal_threshold |
The maximal cutoff for deciding the optimal sample size. Default is 0.0001. A large cutoff can lead to a smaller optimal sample size, whereas a small cutoff produces a larger optimal sample size. |
num_important_proteins_show |
The number of proteins to show in the protein importance plot. |
protein_importance_plot |
TRUE (default) draws the protein importance plot. |
predictive_accuracy_plot |
TRUE (default) draws the predictive accuracy plot. |
save.pdf |
Logical. Determines whether the plots are saved as a PDF. The PDF is saved in the current working directory; the name of the created file is displayed on the console and logged for easier access. |
... |
Arguments that can be passed to ggplot2::theme to alter the visuals. |
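The effect of optimal_threshold can be illustrated with a small base R sketch. It assumes, purely for illustration, that the optimal sample size is the smallest one after which the gain in mean accuracy per additional sample falls below the cutoff; the package's actual rule may differ. The helper pick_optimal_size and the accuracy values are hypothetical and not part of the package.

```r
# Hypothetical helper (NOT the package's implementation): return the smallest
# sample size after which the per-sample gain in mean accuracy is < threshold.
pick_optimal_size <- function(sample_sizes, mean_accuracy, threshold = 0.0001) {
  stopifnot(length(sample_sizes) == length(mean_accuracy),
            length(sample_sizes) >= 2)
  ord <- order(sample_sizes)
  n <- sample_sizes[ord]
  acc <- mean_accuracy[ord]
  # Accuracy gained per additional sample between consecutive sample sizes
  gain <- diff(acc) / diff(n)
  below <- which(gain < threshold)
  # If the gain never drops below the cutoff, fall back to the largest size
  if (length(below) == 0) return(n[length(n)])
  n[below[1]]
}

sizes <- c(10, 25, 50, 100)
acc <- c(0.60, 0.70, 0.75, 0.76)  # made-up mean accuracies

small_cutoff <- pick_optimal_size(sizes, acc, threshold = 0.0001)
large_cutoff <- pick_optimal_size(sizes, acc, threshold = 0.005)
print(c(small_cutoff = small_cutoff, large_cutoff = large_cutoff))
```

Under this assumed rule the larger cutoff stops earlier on the accuracy curve, which matches the documented behaviour that a large cutoff leads to a smaller optimal sample size.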
This function visualizes the results of sample size calculation for classification.
The mean predictive accuracy and mean protein importance under each sample size
are taken from the input 'data', which is the output of the function
designSampleSizeClassification.
To illustrate the mean predictive accuracy and protein importance under different sample sizes, it generates two types of plots, which can be saved as PDF files:
(1) The predictive accuracy plot. The X-axis represents the sample sizes and the Y-axis represents the mean predictive accuracy. The reported sample size per condition can be used to design future experiments.
(2) The protein importance plot, which includes multiple subplots. The number of subplots equals the number of sample sizes in 'list_samples_per_group'. Each subplot shows the top 'num_important_proteins_show' most important proteins under one sample size. The Y-axis of each subplot is the protein name and the X-axis is the mean protein importance under that sample size.
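The data behind the protein importance subplots can be sketched in base R: one group per sample size, each keeping only the top-ranked proteins. The data frame, its column names, and the helper top_proteins are hypothetical and only illustrate the selection described above; they are not the package's internal structures.

```r
# Made-up importance scores; protein names and values are illustrative only.
importance <- data.frame(
  sample_size = rep(c(10, 25), each = 4),
  protein = rep(c("P1", "P2", "P3", "P4"), times = 2),
  mean_importance = c(0.9, 0.2, 0.5, 0.7,
                      0.3, 0.8, 0.6, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the `n_show` most important proteins
# within each sample size (one list element per future subplot).
top_proteins <- function(df, n_show) {
  per_size <- split(df, df$sample_size)
  lapply(per_size, function(d) {
    d <- d[order(d$mean_importance, decreasing = TRUE), ]
    head(d, n_show)
  })
}

top2 <- top_proteins(importance, n_show = 2)
print(names(top2))          # one entry per sample size
print(top2[["10"]]$protein) # top proteins at 10 samples per group
```

Each list element corresponds to one subplot, with the protein names going on the Y-axis and mean_importance on the X-axis.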
The predictive accuracy plot shows the mean predictive accuracy under different sample sizes. The X-axis represents the sample sizes and the Y-axis represents the mean predictive accuracy.
The protein importance plot includes multiple subplots. The number of subplots equals the number of sample sizes in 'list_samples_per_group'. Each subplot shows the top 'num_important_proteins_show' most important proteins under one sample size. The Y-axis of each subplot is the protein name and the X-axis is the mean protein importance under that sample size.
A numeric value: the estimated optimal sample size per group for the input dataset in the classification problem.
Ting Huang, Meena Choi, Sumedh Sankhe, Olga Vitek.
data(OV_SRM_train)
data(OV_SRM_train_annotation)

# simulate different sample sizes
# 1) 10 biological replicates per group
# 2) 25 biological replicates per group
# 3) 50 biological replicates per group
# 4) 100 biological replicates per group
list_samples_per_group <- c(10, 25, 50, 100)

# save the simulation results under each sample size
multiple_sample_sizes <- list()

for (i in seq_along(list_samples_per_group)) {
    # run simulation for each sample size
    simulated_datasets <- simulateDataset(data = OV_SRM_train,
                                          annotation = OV_SRM_train_annotation,
                                          log2Trans = FALSE,
                                          num_simulations = 10, # simulate 10 times
                                          samples_per_group = list_samples_per_group[i],
                                          protein_rank = "mean",
                                          protein_select = "high",
                                          protein_quantile_cutoff = 0.0,
                                          expected_FC = "data",
                                          list_diff_proteins = NULL,
                                          simulate_valid = FALSE,
                                          valid_samples_per_group = 50)

    # run classification performance estimation for each sample size
    res <- designSampleSizeClassification(simulations = simulated_datasets,
                                          parallel = TRUE)

    # save results
    multiple_sample_sizes[[i]] <- res
}

## make the plots and save them to disk
designSampleSizeClassificationPlots(data = multiple_sample_sizes, save.pdf = TRUE)
## print the accuracy plot in the Plots pane
designSampleSizeClassificationPlots(data = multiple_sample_sizes, predictive_accuracy_plot = TRUE)