Measure background noises and perform Fisher's Exact tests to detect SNPs.
exactSNP(
# basic input/output options
readFile,
isBAM = FALSE,
refGenomeFile,
SNPAnnotationFile = NULL,
outputFile = paste0(readFile, ".exactSNP.VCF"),
# fine tuning parameters
qvalueCutoff = 12,
minAllelicFraction = 0,
minAllelicBases = 1,
minReads = 1,
maxReads = 1000000,
minBaseQuality = 13,
nTrimmedBases = 3,
nthreads = 1)
readFile |
a character string giving the name of a file including read mapping results. This function takes as input a SAM file by default. If a BAM file is provided, the isBAM argument should be set to TRUE. |
isBAM |
logical indicating if the file provided via readFile is a BAM file. |
refGenomeFile |
a character string giving the name of a file that includes reference sequences (FASTA format). |
SNPAnnotationFile |
a character string giving the name of a VCF-format file that includes annotated SNPs (the file can be uncompressed or gzip compressed). Such annotation can be downloaded from public databases such as dbSNP. Incorporating known SNPs into SNP calling has been found to be helpful. Note, however, that the annotated SNPs may or may not be called for the sample being analyzed. |
outputFile |
a character string giving the name of the output file to be generated by this function. The output file includes all the reported SNPs (in VCF format). It includes discovered indels as well. |
qvalueCutoff |
a numeric value giving the q-value cutoff for SNP calling at sequencing depth of 50X. |
minAllelicFraction |
a numeric value giving the minimum fraction of allelic bases out of all read bases included at a chromosomal location required for SNP calling. Its value must be between 0 and 1. |
minAllelicBases |
an integer giving the minimum number of allelic (mis-matched) bases a SNP must have at a chromosomal location. |
minReads |
an integer giving the minimum number of mapped reads a SNP-containing location must have (i.e. the minimum coverage). |
maxReads |
an integer specifying the maximum depth a SNP-containing location is allowed to have. |
minBaseQuality |
a numeric value giving the minimum base quality score (Phred score) read bases should satisfy before being used for SNP calling. |
nTrimmedBases |
a numeric value giving the number of bases trimmed off from each end of the read. |
nthreads |
a numeric value giving the number of threads/CPUs used. |
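As a rough illustration of how the arguments above fit together, the following sketch builds a call for a BAM input. The file names are hypothetical placeholders, the chosen thresholds are arbitrary, and the call assumes that exactSNP is provided by the Rsubread package; it only runs if that package and the input files are actually present.

```r
# Hypothetical file names; replace with your own alignment and reference files.
args <- list(
  readFile           = "sample.bam",
  isBAM              = TRUE,                   # input is BAM rather than the default SAM
  refGenomeFile      = "genome.fa",            # reference sequences in FASTA format
  outputFile         = "sample.exactSNP.VCF",
  minReads           = 10,                     # arbitrary choice: at least 10 mapped reads
  minAllelicFraction = 0.1,                    # arbitrary choice: at least 10% allelic bases
  nthreads           = 4
)

inputs_exist <- file.exists(args$readFile) && file.exists(args$refGenomeFile)

# Only run when the package (assumed: Rsubread) and the input files are available.
if (requireNamespace("Rsubread", quietly = TRUE) && inputs_exist) {
  do.call(Rsubread::exactSNP, args)
}
```

Arguments left out of the list keep the defaults shown in the usage section.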
This function takes as input a SAM/BAM format file, measures local background noise for each chromosomal location and then performs Fisher's exact tests to find statistically significant SNPs.
This function implements a novel algorithm for discovering SNPs. This algorithm is comparable with or better than existing SNP callers, but it is faster and more efficient. It can be used to call SNPs for individual samples (i.e. no control samples are required). Details of the algorithm are described in a manuscript that is currently in preparation.
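The general idea of testing a candidate site against background noise can be sketched with base R's fisher.test. This is only an illustration of a Fisher's exact test on a 2x2 table of mismatched versus matched bases; the counts are made up, and it is not the exact procedure used inside exactSNP, whose details are not given here.

```r
# Hypothetical counts, for illustration only (not taken from real data).
site_mismatch <- 12    # mismatched bases at the candidate position
site_match    <- 38    # matched bases at the candidate position
bg_mismatch   <- 15    # mismatched bases in the surrounding region (background noise)
bg_match      <- 1985  # matched bases in the surrounding region

counts <- matrix(
  c(site_mismatch, site_match,
    bg_mismatch,   bg_match),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("candidate", "background"), c("mismatch", "match"))
)

# One-sided test: is the mismatch rate at the candidate site
# higher than the background mismatch rate?
ft <- fisher.test(counts, alternative = "greater")
ft$p.value
```

A small p-value indicates that the mismatches at the candidate position are unlikely to be explained by the background error rate alone.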
No value is produced, but a VCF format file is written to the current working directory. This file contains detailed information for discovered SNPs including chromosomal locations, reference bases, alternative bases, read coverages, allele frequencies and p values.
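Because the result is a file rather than an R object, it has to be read back in for downstream work. The sketch below reads a VCF with base R, assuming the standard eight fixed VCF columns. To stay self-contained it first writes a tiny made-up VCF to a temporary file; the records and the INFO content are hypothetical and do not reflect the exact fields exactSNP writes.

```r
# Write a tiny hypothetical VCF so the example is self-contained.
vcf_path <- tempfile(fileext = ".VCF")
writeLines(c(
  "##fileformat=VCFv4.0",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t10177\t.\tA\tC\t.\t.\tDP=50",
  "chr2\t20345\t.\tG\tT\t.\t.\tDP=32"
), vcf_path)

# Header lines start with '#', so they are skipped as comments.
# Columns are read as character (except POS) so that bases such as
# "T" or "F" are never converted to logical values.
snps <- read.table(
  vcf_path,
  sep = "\t",
  comment.char = "#",
  quote = "",
  colClasses = c("character", "integer", rep("character", 6)),
  col.names = c("CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO")
)

# Number of reported variants per chromosome.
table(snps$CHROM)
```

For a real run, replace vcf_path with the path given in outputFile.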
Yang Liao and Wei Shi