View source: R/CorrectGapsAndNs.R
CorrectGapsAndNs | R Documentation |
Corrects positions in a DNAStringSet or AAStringSet of aligned haplotypes, replacing gaps and Ns (indeterminates) with the nucleotide or amino acid from the corresponding position in the reference sequence.
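The replacement rule described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `correct_sketch` is a hypothetical helper that works on plain character strings rather than DNAStringSet or AAStringSet objects, and it treats only "-" and "N" as positions to correct, so it covers the nucleotide case.

```r
# Hypothetical helper for illustration only; not the package's code.
# seqs: character vector of aligned sequences; ref: single reference string.
correct_sketch <- function(seqs, ref) {
  ref_chars <- strsplit(ref, "")[[1]]
  vapply(seqs, function(s) {
    chars <- strsplit(s, "")[[1]]
    # Every aligned sequence must have the same length as the reference.
    stopifnot(length(chars) == length(ref_chars))
    # Positions holding a gap or an indeterminate.
    fix <- chars %in% c("-", "N")
    # Copy the reference symbol into those positions only.
    chars[fix] <- ref_chars[fix]
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

ref <- "ACGTACGT"
aligned <- c("AC-TACGT", "ACGTNCGA", "N-GTACGT")
corrected <- correct_sketch(aligned, ref)
print(corrected)
```

In this toy input the first and third sequences differ from the reference only at gap or N positions, so after correction both equal the reference.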
CorrectGapsAndNs(hseqs, ref.seq)
hseqs |
DNAStringSet or AAStringSet object with the alignment to correct. |
ref.seq |
Character vector with the reference sequence of the alignment. |
DNAStringSet or AAStringSet object with the sequences corrected. Duplicate
haplotypes may arise as a consequence of this operation.
See Recollapse.
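To show why duplicates matter, the base-R sketch below merges identical corrected sequences by summing their read counts. The sequences and counts are made up for illustration, and the code only conveys the general idea of collapsing duplicates; it does not reproduce the interface or behaviour of Recollapse, which is not shown on this page.

```r
# Made-up corrected haplotypes and their read counts (illustrative only).
cseqs <- c("ACGTACGT", "ACGTACGA", "ACGTACGT")
nr <- c(120, 30, 15)

# Group the counts by sequence and sum within each group.
merged <- vapply(split(nr, cseqs), sum, numeric(1))

# Order the unique haplotypes by decreasing abundance.
merged <- sort(merged, decreasing = TRUE)
print(merged)
```

Here the two copies of "ACGTACGT" are merged into one haplotype with 135 reads, leaving two unique sequences.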
Mercedes Guerrero-Murillo and Josep Gregori
Gregori J, Esteban JI, Cubero M, Garcia-Cehic D, Perales C, Casillas R, Alvarez-Tejado M, Rodríguez-Frías F, Guardia J, Domingo E, Quer J. Ultra-deep pyrosequencing (UDPS) data treatment to study amplicon HCV minor variants. PLoS One. 2013 Dec 31;8(12):e83361. doi: 10.1371/journal.pone.0083361. eCollection 2013. PubMed PMID: 24391758; PubMed Central PMCID: PMC3877031.
Ramírez C, Gregori J, Buti M, Tabernero D, Camós S, Casillas R, Quer J, Esteban R, Homs M, Rodriguez-Frías F. A comparative study of ultra-deep pyrosequencing and cloning to quantitatively analyze the viral quasispecies using hepatitis B virus infection as a model. Antiviral Res. 2013 May;98(2):273-83. doi: 10.1016/j.antiviral.2013.03.007. Epub 2013 Mar 20. PubMed PMID: 23523552.
Recollapse
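# Load the required packages. CorrectGapsAndNs and GetRandomSeq are assumed
# here to come from QSutils; DNAStringSet comes from Biostrings.
library(Biostrings)
library(QSutils)
# Create a random reference sequence of length 50.
ref.seq <- GetRandomSeq(50)
ref.seq
# Create an alignment of 12 sequences with gaps and Ns,
# drawing each symbol with the probabilities in p.
symb <- c(".", "-", "N")
nseqs <- 12
p <- c(0.9, 0.06, 0.04)
hseqs <- matrix(sample(symb, 50 * nseqs, replace = TRUE, prob = p), ncol = 50)
# Paste each row of the matrix into a single string.
hseqs <- apply(hseqs, 1, paste, collapse = "")
hseqs
hseqs <- DNAStringSet(hseqs)
# Apply the function and visualize the result next to the reference.
cseqs <- CorrectGapsAndNs(hseqs, as.character(ref.seq))
c(ref.seq, as.character(cseqs))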