View source: R/Identify_VCF_file.R
identify_vcf_file | R Documentation |
Identifies the cancer cell line contained in a VCF file based on the pattern (start and length) of all contained mutations/variations.
identify_vcf_file( vcf_file, output_file, ref_gen, minimum_matching_mutations, mutational_weight_inclusion_threshold, write_xls, output_bed_file, top_hits_per_library, manual_identifier, verbose, p_value, confidence_score, n_threads, write_results )
vcf_file |
Input vcf file. Only one sample column allowed. |
output_file |
Path of the output file. If blank, autogenerated as name of input file plus '_uniquorn_ident.tab' suffix. |
ref_gen |
Reference genome version. All training sets are associated with a reference genome version. Default: GRCH37 |
minimum_matching_mutations |
The minimum number of mutations that have to match between the query and a training sample for a positive prediction |
mutational_weight_inclusion_threshold |
Include only mutations with a weight of at least x. Range: 0.0 to 1.0. 1 = unique to one CL; ~0 = found in many CL samples. |
write_xls |
Additionally write the identification results as an xls file for easier reading |
output_bed_file |
Whether BED files for IGV visualization should be created for the cancer cell lines that pass the threshold |
top_hits_per_library |
Limit the number of significant similarities per library to n hits (default 3). Particularly useful when heterogeneous query and reference CCLs are being compared. |
manual_identifier |
Manually enter a vector of CL name(s) whose BED files should be created, regardless of whether they pass the detection threshold |
verbose |
Print additional information |
p_value |
Required p-value for identification. Note that if you set the confidence score, the confidence score overrides the p-value |
confidence_score |
Cutoff for positive prediction between 0 and 100. Calculated by transforming the p-value by -1 * log(p-value). Note that if you set the confidence score, the confidence score overrides the p-value |
n_threads |
Number of threads to be used |
write_results |
Write identification results to file |
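The relation between p_value and confidence_score described above can be sketched in base R. This is only an illustration of the stated formula, -1 * log(p-value); the helper names p_value_to_confidence and confidence_to_p_value are hypothetical and not part of Uniquorn, and the sketch assumes that log means the natural logarithm, as in R's log().

```r
# Hypothetical helpers illustrating the documented transformation
# confidence_score = -1 * log(p_value).
# Assumption: log() is the natural logarithm (base R default).

p_value_to_confidence <- function(p_value) {
  # p-values must lie in (0, 1] for the transformation to be defined
  stopifnot(is.numeric(p_value), all(p_value > 0), all(p_value <= 1))
  -1 * log(p_value)
}

confidence_to_p_value <- function(confidence_score) {
  # Inverse of the transformation above
  stopifnot(is.numeric(confidence_score), all(confidence_score >= 0))
  exp(-1 * confidence_score)
}

# Smaller p-values map to larger confidence scores
scores <- p_value_to_confidence(c(0.05, 0.001, 1e-10))
print(round(scores, 2))
```

Under this assumption, a p-value of 0.05 corresponds to a confidence score of roughly 3, so a confidence_score cutoff well above that is stricter than the usual 0.05 threshold.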
identify_vcf_file parses the vcf file and predicts the identity of the sample.
An R table with statistics on the identification result.
HT29_vcf_file = system.file("extdata/HT29.vcf", package = "Uniquorn")
identification = identify_vcf_file(vcf_file = HT29_vcf_file, verbose = FALSE, write_results = FALSE)
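As a variation on the example, the call can also name some of the documented arguments explicitly. The following is a minimal sketch, not a recommended configuration: the values for ref_gen and top_hits_per_library simply repeat the defaults stated in the argument descriptions, and the call is guarded so that nothing runs when the Uniquorn package or its example file is not available.

```r
# Sketch: same example file, with some documented arguments spelled out.
# The argument values repeat the defaults named in the documentation;
# they are illustrative, not tuned settings.
HT29_vcf_file <- system.file("extdata/HT29.vcf", package = "Uniquorn")

# Only run when Uniquorn is installed and the example file was found
if (requireNamespace("Uniquorn", quietly = TRUE) && nzchar(HT29_vcf_file)) {
  identification <- Uniquorn::identify_vcf_file(
    vcf_file = HT29_vcf_file,
    ref_gen = "GRCH37",          # documented default reference genome
    top_hits_per_library = 3,    # documented default number of hits
    verbose = FALSE,
    write_results = FALSE        # keep the result in memory only
  )
  # Inspect the first rows of the returned table
  print(head(identification))
}
```

Setting write_results = FALSE keeps the result in the R session only, which is convenient for interactive inspection.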