View source: R/sim_functions.R
Generate the ECDF of the test statistic under the null distribution. The simulation uses the average rates of clonal exclusivity and samples from the real data, for each patient, the number of trees in which a pair occurs and the number in which it is clonally exclusive.
generate_ecdf_test_stat(avg_rates_m, list_of_num_trees_all_pats,
  list_of_clon_excl_all_pats, num_pat_pair_max, num_pairs_sim,
  beta_distortion = 1000)
avg_rates_m |
The average rates of clonal exclusivity for all patients in the cohort, each averaged over several trees from the collection of tree inferences. |
list_of_num_trees_all_pats |
A named list with one entry per patient. Each entry
is a vector that gives, for each pair in that patient, the number of
trees in which the pair was mutated.
The patient ordering in the list has to be the
same as in |
list_of_clon_excl_all_pats |
A named list with one entry per patient. Each entry
is a vector that gives, for each pair in that patient, the number of trees
in which the pair was clonally exclusive. The patient ordering in the list has to
be the same as in |
num_pat_pair_max |
The maximum number of patients a pair is mutated in. |
num_pairs_sim |
The number of simulated gene/pathway pairs to be generated, i.e. the number of times the test statistic is computed. A large number is recommended, e.g. 100000. |
beta_distortion |
The value |
This function takes the computed average rates of clonal exclusivity from
the data (m1, ..., mN), which are specific to each
patient and averaged over several trees from the collection of tree
inferences. It also takes, for each patient, the histogram of the values of
how often a pair was clonally exclusive out of the number of trees it was
mutated in. It then simulates the test statistic under the
null for each number of patients a pair is mutated in, i.e. 2, 3, ...,
'num_pat_pair_max'. Afterwards, it generates the empirical cumulative
distribution function (ECDF) using the ecdf function of the stats
package and returns the list of ECDFs for the
number of patients n = 2, 3, ..., N. This step is necessary for each new
data set before the clonal exclusivity test can be
done. In the clonal exclusivity test, the observed test statistics are
compared to the ECDF.
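The final step described above, turning simulated test statistics into one ECDF per number of patients, can be sketched in a few lines of base R. This is a minimal illustration and not the package's implementation: the helper `simulate_null_stats()` is hypothetical and only draws uniform random numbers as a placeholder for the real simulation of the test statistic under the null.

```r
# Hypothetical stand-in for the null simulation (NOT the package's code).
# It returns `num_pairs_sim` placeholder values for pairs mutated in `n` patients.
simulate_null_stats <- function(n, num_pairs_sim) {
  stats::runif(num_pairs_sim)
}

set.seed(1)
num_pat_pair_max <- 3
num_pairs_sim <- 1000

# One list slot per number of patients; slot 1 is left NULL to mirror
# the documented return value, since the simulation starts at n = 2.
ecdf_list_sketch <- vector("list", num_pat_pair_max)
for (n in 2:num_pat_pair_max) {
  sim_stats <- simulate_null_stats(n, num_pairs_sim)
  ecdf_list_sketch[[n]] <- stats::ecdf(sim_stats)
}

# Each non-NULL entry is a step function that can be evaluated or plotted.
ecdf_list_sketch[[2]](0.5)
```

The only part taken from the documentation is the use of `stats::ecdf` and the list layout; everything about how the statistics are simulated is assumed away here.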
The return value is a list of ECDFs. The first list entry is set to NULL for technical reasons, since the simulation starts at n = 2 patients.
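As a rough illustration of how such a list might be used, the sketch below builds a toy list with the same shape (first entry NULL, second entry an ECDF) and evaluates the ECDF at some observed values. The toy ECDF and the values in `observed_stats` are made up for illustration. How the clonal exclusivity test converts this comparison into a p-value is not described on this page, so the sketch stops at the evaluation.

```r
# Toy ECDF built from placeholder values; in practice this would be an
# entry of the list returned by generate_ecdf_test_stat().
set.seed(42)
toy_ecdf <- stats::ecdf(stats::rnorm(1000))
toy_list <- list(NULL, toy_ecdf)

# Hypothetical observed test statistics for pairs mutated in 2 patients.
observed_stats <- c(-1.5, 0, 2.1)

# Fraction of the placeholder null values that are <= each observed value.
frac_below <- toy_list[[2]](observed_stats)
frac_below
```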
Ariane L. Moore
# Toy clone table: two patients, three genes, two trees per patient
clone_tbl <- dplyr::tibble("file_name" =
    rep(c(rep(c("fn1", "fn2"), each=3)), 2),
  "patient_id"=rep(c(rep(c("pat1", "pat2"), each=3)), 2),
  "altered_entity"=c(rep(c("geneA", "geneB", "geneC"), 4)),
  "clone1"=c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0),
  "clone2"=c(1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1),
  "tree_id"=c(rep(5, 6), rep(10, 6)))
clone_tbl_pat1 <- dplyr::filter(clone_tbl, patient_id == "pat1")
clone_tbl_pat2 <- dplyr::filter(clone_tbl, patient_id == "pat2")

# Per-patient rates of clonal exclusivity, averaged over trees
rates_exmpl_1 <- compute_rates_clon_excl(clone_tbl_pat1)
rates_exmpl_2 <- compute_rates_clon_excl(clone_tbl_pat2)
avg_rates_m <- apply(cbind(rates_exmpl_1, rates_exmpl_2), 2, mean)
names(avg_rates_m) <- c(names(rates_exmpl_1)[1], names(rates_exmpl_2)[1])

# Per-patient histogram values: number of trees and clonal exclusivity counts
values_clon_excl_num_trees_pat1 <- get_hist_clon_excl(clone_tbl_pat1)
values_clon_excl_num_trees_pat2 <- get_hist_clon_excl(clone_tbl_pat2)
list_of_num_trees_all_pats <-
  list(pat1=values_clon_excl_num_trees_pat1[[1]],
    pat2=values_clon_excl_num_trees_pat2[[1]])
list_of_clon_excl_all_pats <-
  list(pat1=values_clon_excl_num_trees_pat1[[2]],
    pat2=values_clon_excl_num_trees_pat2[[2]])

# Small values keep the example fast
num_pat_pair_max <- 2
num_pairs_sim <- 10
ecdf_list <- generate_ecdf_test_stat(avg_rates_m,
  list_of_num_trees_all_pats, list_of_clon_excl_all_pats,
  num_pat_pair_max, num_pairs_sim)

# ECDF for pairs mutated in two patients
plot(ecdf_list[[2]])
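The example above uses `num_pairs_sim <- 10` to keep it fast, whereas the argument description recommends a large value such as 100000. The following sketch, which uses placeholder random numbers rather than real test statistics, illustrates one reason why this matters: an ECDF built from k values can only change in steps of size 1/k, so a small k gives a coarse null distribution.

```r
set.seed(7)
# Placeholder values standing in for simulated test statistics.
coarse_ecdf <- stats::ecdf(stats::runif(10))
fine_ecdf <- stats::ecdf(stats::runif(1000))

# Jump points of each step function (one per distinct simulated value).
coarse_knots <- stats::knots(coarse_ecdf)
fine_knots <- stats::knots(fine_ecdf)

# Smallest nonzero value each ECDF can return: 1 / (number of values),
# provided the smallest simulated value is not duplicated.
coarse_min_step <- min(coarse_ecdf(coarse_knots))
fine_min_step <- min(fine_ecdf(fine_knots))
c(coarse = coarse_min_step, fine = fine_min_step)
```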