bdc_clean_names: Clean and parse scientific names

Description

This function applies a series of name-checking routines to clean and parse scientific names, i.e., to unify their writing style. It removes 1) family names of animals or plants prepended to species names; 2) qualifiers denoting the uncertain or provisional status of a taxonomic identification (e.g., confer, species, affinis); and 3) infraspecific terms, for example, variety (var.), subspecies (subsp.), forma (f.), and their spelling variations. It also 4) standardizes names, i.e., capitalizes only the first letter of the genus name and removes extra whitespace; and 5) parses names, i.e., separates author, date, and annotations from the taxon name.
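The first four steps can be pictured with a few base R regular expressions. The sketch below is only an illustration and is not the bdc implementation: the helper name `sketch_clean_name()`, the family-name suffixes, and the short term lists are all hypothetical, and it assumes names without authorship (step 5, parsing, is left to gnparser).

```r
# Illustrative sketch only: NOT the bdc implementation.
# The helper name and the short term lists are hypothetical.
sketch_clean_name <- function(x) {
  # 1) Drop a leading family name written in capitals, e.g. "LEGUMINOSAE".
  x <- sub("^[A-Z]+(ACEAE|IDAE|OSAE)\\s+", "", x, perl = TRUE)
  # 2) Drop uncertainty qualifiers such as "aff.", "cf.", "sp.".
  x <- gsub("\\s+(aff|cf|sp)\\.?(?=\\s|$)", "", x, perl = TRUE)
  # 3) Drop infraspecific rank markers such as "var.", "subsp.", "f.".
  x <- gsub("\\s+(var|subsp|f)\\.?(?=\\s|$)", "", x, perl = TRUE)
  # 4) Squeeze whitespace and capitalize only the first letter.
  #    Lowercasing the rest assumes the name carries no authorship.
  x <- trimws(gsub("\\s+", " ", x, perl = TRUE))
  paste0(toupper(substring(x, 1, 1)), tolower(substring(x, 2)))
}

sketch_clean_name("LEGUMINOSAE Senna aff. organensis")
# "Senna organensis"
sketch_clean_name("  poa   annua var.  supina ")
# "Poa annua supina"
```

Real-world names are far messier than these patterns allow for, which is why the function relies on curated term lists and on gnparser rather than on a handful of regular expressions.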

Usage

bdc_clean_names(sci_names, save_outputs = FALSE)

Arguments

sci_names

character vector. Scientific names to be cleaned and parsed.

save_outputs

logical. Should the outputs be saved? Default = FALSE.

Details

This function depends on the gnparser software, which is not installed automatically. Please follow the tutorial at https://brunobrr.github.io/bdc/articles/help/installing_gnparser.html to install gnparser.

Terms denoting the uncertain or provisional status of taxonomic identification, as well as infraspecific terms, were obtained from Sigovini et al. (2016; doi: 10.1111/2041-210X.12594). More details about the name-parsing process can be found in the gnparser documentation.
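Before calling the function, it can help to check whether a gnparser executable is visible from R. The snippet below is a minimal sketch using base R's `Sys.which()`. It assumes the executable is named `gnparser` and is found through the system PATH; where bdc itself looks for gnparser is not stated here, so treat the installation tutorial as authoritative.

```r
# Assumption: the executable is called "gnparser" and is looked up on the PATH.
# Sys.which() returns an empty string when the executable is not found.
gnparser_path <- unname(Sys.which("gnparser"))
gnparser_found <- nzchar(gnparser_path)

if (gnparser_found) {
  message("gnparser found at: ", gnparser_path)
} else {
  message("gnparser not found on the PATH; see the installation tutorial.")
}
```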

Value

A five-column data.frame containing:

  • scientificName: original names supplied

  • .uncer_terms: indicates the presence of taxonomic uncertainty terms

  • .infraesp_names: indicates the presence of infraspecific terms

  • name_clean: scientific names resulting from the cleaning and parsing processes

  • quality: an index of the quality of the parsing process. It ranges from 0 to 4: a value of 1 indicates that no problem was detected, a value of 4 indicates that serious problems were detected, and a value of 0 indicates that no interpretable name was found and the string was not parsed.

If save_outputs = TRUE, a data.frame containing the results of all name-cleaning tests and of the name-parsing process is saved to "Output/Check/02_parse_names.csv".
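One way to use the quality index is to separate names that parsed cleanly from those that need manual review. The sketch below runs on a mock data.frame, so it does not need gnparser. All values are made up for illustration, and the two flag columns are left out because their coding is not specified here.

```r
# Mock result with made-up values; a real call returns all five columns.
parsed <- data.frame(
  scientificName = c("LEGUMINOSAE Senna aff. organensis", "Poa annua L.", "???"),
  name_clean     = c("Senna organensis", "Poa annua", NA),
  quality        = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Count the names in each quality class.
table(parsed$quality)

# Keep names parsed without problems; set the rest aside for review.
ok           <- parsed[parsed$quality == 1, ]
needs_review <- parsed[parsed$quality != 1, ]
```

Whether names with intermediate quality values are acceptable depends on the data set, so the `quality == 1` threshold is a conservative choice rather than a recommendation.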

See Also

Other taxonomy: bdc_filter_out_names(), bdc_query_names_taxadb()

Examples

## Not run: 
library(bdc)

# Names with authorship; the last one also has a prepended family name and "aff."
scientificName <- c(
  "Fridericia bahiensis (Schauer ex. DC.) L.G.Lohmann",
  "Peltophorum dubium (Spreng.) Taub. (Griseb.) Barneby",
  "Gymnanthes edwalliana (Pax & K.Hoffm.) Laurenio-Melo & M.F.Sales",
  "LEGUMINOSAE Senna aff. organensis (Glaz. ex Harms) H.S.Irwin & Barneby"
)

bdc_clean_names(scientificName, save_outputs = FALSE)

## End(Not run)


brunobrr/bdc documentation built on Nov. 21, 2024, 4:18 a.m.