Description Usage Arguments Value See Also Examples
The distance between AA sequences is defined as (1 - score/max(score)) times the median length of the input sequences. The distance between nucleotide sequences is defined as edit_distance/max(edit_distance) times the median length of the input sequences.
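The nucleotide scaling can be illustrated with a minimal sketch in base R. It uses `utils::adist()` as a stand-in for the edit distance, which is an assumption made for illustration and not necessarily how the package computes it internally. The toy sequences are hypothetical.

```r
# Hypothetical toy nucleotide sequences (not from the package's demo data)
seqs <- c("ACGTACGT", "ACGTTCGT", "ACGAACGA", "TTGTACGT")

# Pairwise Levenshtein (edit) distances; base R stand-in for illustration
edit_distance <- utils::adist(seqs)

# Rescale: divide by the largest observed distance, then multiply by the
# median sequence length, as described above
scaled <- edit_distance / max(edit_distance) * median(nchar(seqs))

print(scaled)
```

After scaling, the largest pairwise distance equals the median sequence length, so the values stay on a scale comparable to a number of edited positions.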
fine_cluster_seqs(
  seqs,
  type = "AA",
  big_memory_brute = FALSE,
  method = "levenshtein",
  substitution_matrix = "BLOSUM100",
  cluster_fun = "none",
  cluster_method = "complete"
)
seqs | character vector, DNAStringSet or AAStringSet
type | character either
big_memory_brute | attempt to cluster more than 4000 sequences? Clustering is quadratic, so this will take a long time and might exhaust memory
method | one of 'substitutionMatrix' or 'levenshtein'
substitution_matrix | a character vector naming a substitution matrix available in Biostrings, or a substitution matrix itself
cluster_fun |
cluster_method | character passed to
list
hclust(), Biostrings::stringDist()
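# Path to the demo FASTA file shipped with CellaRepertorium
fasta_path = system.file('extdata', 'demo.fasta', package='CellaRepertorium')
# Read the amino-acid sequences and keep the first 100
aaseq = Biostrings::readAAStringSet(fasta_path)[1:100]
# Compute pairwise distances and cluster them with hclust
cls = fine_cluster_seqs(aaseq, cluster_fun = 'hclust')
# Plot the resulting clustering
plot(cls$cluster)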
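To show what hierarchical clustering of an edit-distance matrix with complete linkage looks like in general, here is a self-contained sketch in base R. It does not call `fine_cluster_seqs()`. The sequences are hypothetical, and `utils::adist()` is assumed as a stand-in for the package's distance computation.

```r
# Hypothetical toy amino-acid sequences, chosen to form two obvious groups
seqs <- c("CASSLGQGYEQYF", "CASSLGQGYEQFF", "CASSPGTGNTEAFF", "CASSPGTGNTEVFF")

# Pairwise Levenshtein distances, converted to a 'dist' object for hclust()
d <- stats::as.dist(utils::adist(seqs))

# Complete-linkage hierarchical clustering, mirroring cluster_method = "complete"
hc <- stats::hclust(d, method = "complete")

# Cut the tree into two groups and show the assignment per sequence
groups <- stats::cutree(hc, k = 2)
print(groups)
```

Sequences that differ by a single residue end up in the same group, while the two families are separated. `plot(hc)` would draw the corresponding dendrogram.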