map_species: Standardise species names


map_species    R Documentation

Standardise species names

Description

Search the gprofiler database for species that match the input text string, then translate each match to a standardised species ID.

Usage

map_species(
  species = NULL,
  search_cols = c("display_name", "id", "scientific_name", "taxonomy_id"),
  output_format = c("scientific_name", "id", "display_name", "taxonomy_id", "version",
    "scientific_name_formatted"),
  method = c("homologene", "gprofiler", "babelgene"),
  remove_subspecies = TRUE,
  remove_subspecies_exceptions = c("Canis lupus familiaris"),
  use_local = TRUE,
  verbose = TRUE
)

Arguments

species

Species query (e.g. "human", "homo sapiens", "hsapiens", or 9606). If given a list, each item is queried in turn. Set to NULL to return all species.

search_cols

Columns of the species metadata to search for the species substring.
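The substring search over search_cols can be pictured with a minimal base R sketch. It is only an illustration under stated assumptions: the table species_meta is a two-row toy stand-in for the real metadata, and find_species is a hypothetical helper, not part of the package. The actual matching rules inside map_species may differ.

```r
# Toy metadata table (an assumption for illustration; not the real gprofiler table).
species_meta <- data.frame(
  display_name    = c("Human", "Mouse"),
  id              = c("hsapiens", "mmusculus"),
  scientific_name = c("Homo sapiens", "Mus musculus"),
  taxonomy_id     = c("9606", "10090"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where ANY of the searched columns
# contains the query as a case-insensitive substring.
find_species <- function(query, meta,
                         search_cols = c("display_name", "id",
                                         "scientific_name", "taxonomy_id")) {
  query <- tolower(as.character(query))
  per_col <- lapply(search_cols, function(col) {
    grepl(query, tolower(as.character(meta[[col]])), fixed = TRUE)
  })
  hit <- Reduce(`|`, per_col)  # combine the per-column matches row by row
  meta[hit, , drop = FALSE]
}

# Text and numeric queries resolve to the same row of the toy table.
find_species("human", species_meta)$scientific_name
find_species(9606, species_meta)$scientific_name
```

Restricting search_cols narrows where a match may occur; for instance, searching only "taxonomy_id" in this sketch would make the query "human" return no rows.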

output_format

Which metadata column to return as the species ID.

method

R package to use for gene mapping:

  • "gprofiler" : Slower but more species and genes.

  • "homologene" : Faster but fewer species and genes.

  • "babelgene" : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.

remove_subspecies

Only keep the first two taxonomic levels, e.g. "Canis lupus familiaris" -> "Canis lupus"

remove_subspecies_exceptions

Selected species to ignore when remove_subspecies=TRUE, e.g. "Canis lupus familiaris" -> "Canis lupus familiaris"
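The interaction between remove_subspecies and remove_subspecies_exceptions can be sketched in base R as follows. This is a hedged illustration of the behaviour described above, not the package's implementation; trim_subspecies is a hypothetical helper name, and it assumes names are whitespace-separated words.

```r
# Hypothetical helper: keep only the first two words of each species name,
# unless the full name is listed in `exceptions`.
trim_subspecies <- function(names,
                            exceptions = c("Canis lupus familiaris")) {
  vapply(names, function(nm) {
    if (nm %in% exceptions) {
      return(nm)  # exception: leave the name untouched
    }
    words <- strsplit(trimws(nm), "\\s+")[[1]]
    paste(words[seq_len(min(2, length(words)))], collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# With no exceptions, the subspecies level is dropped.
trim_subspecies("Canis lupus familiaris", exceptions = character(0))

# With the default exception, the full name is kept.
trim_subspecies("Canis lupus familiaris")
```

Names that already have two levels, such as "Homo sapiens", pass through unchanged in this sketch.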

use_local

If TRUE (default), map_species uses a locally stored version of the species metadata table instead of pulling directly from the gprofiler API. The local version may not be fully up to date, but should suffice for most use cases.

verbose

Print messages.

Value

Species ID in the format specified by output_format.

Examples

ids <- map_species(species = c(
    "human", 9606, "mus musculus",
    "fly", "C elegans"
))

neurogenomics/orthogene documentation built on Jan. 30, 2024, 4:44 a.m.