assign_sintax | R Documentation |
Please cite vsearch if you use this function to assign taxonomy.
assign_sintax(
physeq = NULL,
seq2search = NULL,
ref_fasta = NULL,
behavior = "return_matrix",
vsearchpath = "vsearch",
clean_pq = TRUE,
nproc = 1,
suffix = "",
taxo_rank = c("K", "P", "C", "O", "F", "G", "S"),
min_boostrap = 0.5,
keep_temporary_files = FALSE,
verbose = TRUE,
temporary_fasta_file = "temp.fasta",
cmd_args = "--sintax_random"
)
physeq |
(required): a phyloseq-class object. |
seq2search |
A DNAStringSet object of sequences to search for. |
ref_fasta |
(required) A link to a database in vsearch format. The reference database must contain taxonomic information in the header of each sequence, in the form of a string starting with ";tax=" and followed by a comma-separated list of up to nine taxonomic identifiers. Each taxonomic identifier must start with an indication of the rank by one of the letters d (domain), k (kingdom), p (phylum), c (class), o (order), f (family), g (genus), s (species), or t (strain). The letter is followed by a colon (:) and the name of that rank. Commas and semicolons are not allowed in the name of the rank. Non-ASCII characters should be avoided in the names. Example: >X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,g:Escherichia/Shigella,s:Escherichia_coli,t:str._K-12_substr._MG1655 |
behavior |
Either "return_matrix" (default), "return_cmd", or "add_to_phyloseq".
|
vsearchpath |
(default: "vsearch") path to vsearch |
clean_pq |
(logical, default TRUE) If set to TRUE, empty samples and empty ASVs are discarded before clustering. |
nproc |
(default: 1) Number of CPUs/processors to use. |
suffix |
(character) The suffix used to name the new columns. If set to "" (the default), the taxo_rank names are used without a suffix. |
taxo_rank |
A character vector with the names of the taxonomic ranks present in ref_fasta. |
min_boostrap |
(numeric in [0, 1], default 0.5) Minimum bootstrap value to inform taxonomy. For each rank whose bootstrap value is below min_boostrap, the taxonomy information is set to NA. |
keep_temporary_files |
(logical, default: FALSE) Whether to keep temporary files.
|
verbose |
(logical). If TRUE, print additional information. |
temporary_fasta_file |
The name of a temporary_fasta_file (default "temp.fasta") |
cmd_args |
Other arguments to be passed on to the vsearch sintax command. By default cmd_args is equal to "--sintax_random" as recommended by Torognes. |
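The header format described for ref_fasta can be checked with a few lines of base R. The sketch below is only an illustration: `parse_sintax_header()` is a hypothetical helper, not a function of MiscMetabar or vsearch, and it assumes the header follows the ";tax=" convention exactly as described above.

```r
# Hypothetical helper (not part of MiscMetabar): split a SINTAX-style header
# into a named character vector mapping rank letter -> taxon name.
parse_sintax_header <- function(header) {
  # Keep only what follows ";tax=" and drop a trailing ";" if present.
  tax <- sub("^.*;tax=", "", header)
  tax <- sub(";$", "", tax)
  fields <- strsplit(tax, ",", fixed = TRUE)[[1]]
  # Each field looks like "<letter>:<name>"; split on the first colon only.
  ranks <- sub(":.*$", "", fields)
  taxa <- sub("^[^:]*:", "", fields)
  # Fail early if a field does not start with one of the allowed rank letters.
  stopifnot(all(ranks %in% c("d", "k", "p", "c", "o", "f", "g", "s", "t")))
  setNames(taxa, ranks)
}

header <- paste0(
  ">X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,",
  "c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,",
  "g:Escherichia/Shigella,s:Escherichia_coli,t:str._K-12_substr._MG1655"
)
parsed <- parse_sintax_header(header)
print(parsed[["g"]])
```

A check of this kind on a handful of headers may help catch a malformed reference file before running the assignment.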
This function is mainly a wrapper of the work of others. Please cite vsearch.
See the behavior parameter.
Adrien Taudière
library(MiscMetabar)
library(dplyr)
library(ggplot2)

# Ask for the vsearch command (behavior = "return_cmd")
assign_sintax(data_fungi_mini,
  ref_fasta = system.file("extdata", "mini_UNITE_fungi.fasta.gz", package = "MiscMetabar"),
  behavior = "return_cmd"
)

# Get a phyloseq object back (behavior = "add_to_phyloseq")
data_fungi_mini_new <- assign_sintax(data_fungi_mini,
  ref_fasta = system.file("extdata", "mini_UNITE_fungi.fasta.gz", package = "MiscMetabar"),
  behavior = "add_to_phyloseq"
)

# Default behavior ("return_matrix")
assignation_results <- assign_sintax(data_fungi_mini,
  ref_fasta = system.file("extdata", "mini_UNITE_fungi.fasta.gz", package = "MiscMetabar")
)

# Join taxon names and bootstrap values in long format, then plot
# the bootstrap value of each assigned taxon, colored by rank.
left_join(
  tidyr::pivot_longer(assignation_results$taxo_value, -taxa_names),
  tidyr::pivot_longer(assignation_results$taxo_boostrap, -taxa_names),
  by = join_by(taxa_names, name),
  suffix = c("rank", "bootstrap") # yields columns valuerank and valuebootstrap
) |>
  mutate(name = factor(name, levels = c("K", "P", "C", "O", "F", "G", "S"))) |>
  # mutate(valuerank = forcats::fct_reorder(valuerank, as.integer(name), .desc = TRUE)) |>
  ggplot(aes(valuebootstrap,
    valuerank,
    fill = name
  )) +
  geom_jitter(alpha = 0.8, aes(color = name)) +
  geom_boxplot(alpha = 0.3)
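
The rule described for min_boostrap (taxonomy set to NA wherever the bootstrap value falls below the threshold) can be illustrated on toy data. This is a minimal sketch, not the package's implementation: the two data frames below are invented stand-ins, assumed to have the same layout as the taxo_value and taxo_boostrap elements used in the example above (one taxa_names column plus one column per rank).

```r
# Toy data (invented for illustration), one row per ASV, one column per rank.
taxo_value <- data.frame(
  taxa_names = c("ASV1", "ASV2"),
  K = c("Fungi", "Fungi"),
  G = c("Russula", "Mycena"),
  stringsAsFactors = FALSE
)
taxo_boostrap <- data.frame(
  taxa_names = c("ASV1", "ASV2"),
  K = c(1.00, 0.98),
  G = c(0.85, 0.32)
)

min_boostrap <- 0.5
ranks <- c("K", "G")

# For each rank, blank out the taxon names whose bootstrap is below the threshold.
filtered <- taxo_value
for (r in ranks) {
  filtered[[r]][taxo_boostrap[[r]] < min_boostrap] <- NA
}
print(filtered)
```

The same loop, applied to the real matrices, would allow testing a stricter threshold without rerunning vsearch, provided the function was first called with a low min_boostrap.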