View source: R/Identify_VCF_file.R
Identifies the cancer cell line(s) contained in a VCF file based on the pattern (start and length) of all contained mutations/variations.
identify_vcf_file(
    vcf_file,
    output_file,
    ref_gen,
    minimum_matching_mutations,
    mutational_weight_inclusion_threshold,
    write_xls,
    output_bed_file,
    top_hits_per_library,
    manual_identifier,
    verbose,
    p_value,
    confidence_score,
    n_threads,
    write_results
)
vcf_file
    Input VCF file. Only one sample column is allowed.
output_file
    Path of the output file. If blank, it is autogenerated as the name of the input file plus the '_uniquorn_ident.tab' suffix.
ref_gen
    Reference genome version. All training sets are associated with a reference genome version. Default: GRCH37.
minimum_matching_mutations
    The minimum number of mutations that must match between the query and a training sample for a positive prediction.
mutational_weight_inclusion_threshold
    Include only mutations with a weight of at least x. Range: 0.0 to 1.0. 1 = unique to one CL; ~0 = found in many CL samples.
write_xls
    Additionally write the identification results as an xls file for easier reading.
output_bed_file
    Whether BED files for IGV visualization should be created for the cancer cell lines (CCLs) that pass the threshold.
top_hits_per_library
    Limit the number of significant similarities reported per library to n hits (default: 3). Particularly useful when heterogeneous query and reference CCLs are compared.
manual_identifier
    Vector of CL name(s) whose BED files should be created, regardless of whether they pass the detection threshold.
verbose
    Print additional information.
p_value
    Required p-value for identification. Note that if you set the confidence score, it overrides the p-value.
confidence_score
    Cutoff for a positive prediction, between 0 and 100. Calculated by transforming the p-value as -1 * log(p-value). Note that if you set the confidence score, it overrides the p-value.
n_threads
    Number of threads to use.
write_results
    Write the identification results to file.
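The relationship between p_value and confidence_score can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helper name p_value_to_confidence is hypothetical, and it assumes the natural logarithm (the default of R's log()), since the documentation does not state the base. How the score is limited to the 0 to 100 range is also not shown here.

```r
# Hypothetical helper, not part of Uniquorn: applies the documented
# transformation -1 * log(p-value).
# Assumption: natural logarithm, which is the default base of R's log().
p_value_to_confidence <- function(p_value) {
  stopifnot(is.numeric(p_value), all(p_value > 0), all(p_value <= 1))
  -1 * log(p_value)
}

p_values <- c(0.05, 1e-5, 1e-20)
confidence <- p_value_to_confidence(p_values)

# Smaller p-values map to larger confidence scores.
print(data.frame(p_value = p_values, confidence_score = confidence))
```

Under this assumption, a stricter (smaller) p-value cutoff corresponds to a larger confidence_score cutoff, so only one of the two needs to be set.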
identify_vcf_file parses the VCF file and predicts the identity of the sample.
Returns an R table with statistics of the identification result.
HT29_vcf_file <- system.file("extdata/HT29.vcf", package = "Uniquorn")
identification <- identify_vcf_file(
    vcf_file = HT29_vcf_file,
    verbose = FALSE,
    write_results = FALSE
)
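A variation of the example with stricter settings might look like the sketch below. The argument names are taken from the usage section, but the values chosen for minimum_matching_mutations and top_hits_per_library are illustrative assumptions, not recommendations. The call is guarded so that the snippet does nothing if Uniquorn is not installed.

```r
# Argument names come from the documented signature; the values are
# illustrative assumptions only.
HT29_vcf_file <- system.file("extdata/HT29.vcf", package = "Uniquorn")

call_args <- list(
  vcf_file = HT29_vcf_file,
  ref_gen = "GRCH37",               # documented default reference genome
  minimum_matching_mutations = 10,  # assumed value: require at least 10 matches
  top_hits_per_library = 1,         # assumed value: report only the best hit
  verbose = FALSE,
  write_results = FALSE             # keep the result in memory only
)

# Run the identification only when the package is available.
if (requireNamespace("Uniquorn", quietly = TRUE)) {
  identification <- do.call(Uniquorn::identify_vcf_file, call_args)
  print(head(identification))
}
```

Collecting the arguments in a list first makes it easy to reuse the same settings across several VCF files by changing only vcf_file.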