pkg <- read.dcf("DESCRIPTION", fields = "Package")[1]
title <- read.dcf("DESCRIPTION", fields = "Title")[1]
description <- read.dcf("DESCRIPTION", fields = "Description")[1]
URL <- read.dcf("DESCRIPTION", fields = "URL")[1]
owner <- tolower(strsplit(URL, "/")[[1]][4])
r description
In brief, orthogene lets you easily:
convert_orthologs between any two species.
map_species names onto standard taxonomic ontologies.
report_orthologs between any two species.
map_genes onto standard ontologies.
aggregate_mapped_genes in a matrix.
all_genes from any species.
infer_species from gene names.
create_background gene lists based on one, two, or more species.
get_silhouettes of each species from phylopic.
prepare_tree with evolutionary divergence times across >147,000 species.
If you use orthogene, please cite:
r citation(pkg)$textVersion
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# orthogene is only available on Bioconductor>=3.14
if (BiocManager::version() < "3.14")
    BiocManager::install(update = TRUE, ask = FALSE)
BiocManager::install("orthogene")
orthogene can also be installed via a Docker or Singularity container with RStudio pre-installed. Further instructions are provided here.
library(orthogene)
data("exp_mouse")
# Setting to "homologene" for the purposes of quick demonstration.
# We generally recommend using method="gprofiler" (default).
method <- "homologene"
For most functions, orthogene lets users choose between different methods, each with complementary strengths and weaknesses: "gprofiler", "homologene", and "babelgene".
In general, we recommend you use "gprofiler" when possible, as it tends to be more comprehensive.
While "babelgene" covers fewer species, it queries a wide variety of orthology databases and can return a column "support_n" that tells you how many databases support each ortholog gene mapping. This can be helpful when you need a semi-quantitative measure of mapping quality.
It's also worth noting that for smaller gene sets, the speed difference between these methods becomes negligible.
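To illustrate how a per-mapping support count could serve as a rough quality filter, here is a minimal base R sketch. The table, its gene names, its counts, and the `min_support` threshold are all made up for illustration; only the `support_n` column name comes from the text above, and real output may be laid out differently.

```r
# Toy mapping table (hypothetical genes and counts, not real orthogene output).
mappings <- data.frame(
  input_gene    = c("GeneA", "GeneB", "GeneC"),
  ortholog_gene = c("GENEA", "GENEB", "GENEC"),
  support_n     = c(11L, 2L, 6L),  # number of databases supporting each mapping
  stringsAsFactors = FALSE
)

# Hypothetical threshold: keep mappings supported by at least 3 databases.
min_support <- 3L
well_supported <- mappings[mappings$support_n >= min_support, , drop = FALSE]

print(well_supported)
```

The right threshold depends on the analysis: a stricter cutoff trades coverage for confidence in each mapping.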
pros_cons <- data.frame(
    gprofiler = c(
        "Reference organisms" = "700+",
        "Gene mappings" = "More comprehensive",
        "Updates" = "Frequent",
        "Orthology databases" = paste("Ensembl", "HomoloGene", "WormBase", sep = ", "),
        "Data location" = "Remote",
        "Internet connection" = "Required",
        "Speed" = "Slower"
    ),
    homologene = c(
        "# reference organisms" = "20+",
        "Gene mappings" = "Less comprehensive",
        "Updates" = "Less frequent",
        "Orthology databases" = "HomoloGene",
        "Data location" = "Local",
        "Internet connection" = "Not required",
        "Speed" = "Faster"
    ),
    babelgene = c(
        "# reference organisms" = "19 (but cannot convert between pairs of non-human species)",
        "Gene mappings" = "More comprehensive",
        "Updates" = "Less frequent",
        "Orthology databases" = "HGNC Comparison of Orthology Predictions (HCOP), which includes predictions from eggNOG, Ensembl Compara, HGNC, HomoloGene, Inparanoid, NCBI Gene Orthology, OMA, OrthoDB, OrthoMCL, Panther, PhylomeDB, TreeFam and ZFIN",
        "Data location" = "Local",
        "Internet connection" = "Not required",
        "Speed" = "Medium"
    )
)
knitr::kable(pros_cons)
convert_orthologs is very flexible with what users can supply as gene_df, and can take a data.frame/data.table/tibble, (sparse) matrix, or list/vector containing genes.
Genes, transcripts, proteins, SNPs, or genomic ranges will be recognised in most formats (HGNC, Ensembl, RefSeq, UniProt, etc.) and can even be a mixture of different formats.
All genes will be mapped to gene symbols, unless specified otherwise with the ... arguments (see ?orthogene::convert_orthologs or here for details).
A key feature of convert_orthologs is that it handles the issue of genes with many-to-many mappings across species. These mappings can arise from evolutionary divergence, and the function of such genes tends to be less conserved and less translatable. Users can address this using different strategies via non121_strategy=.
gene_df <- orthogene::convert_orthologs(gene_df = exp_mouse,
                                        gene_input = "rownames",
                                        gene_output = "rownames",
                                        input_species = "mouse",
                                        output_species = "human",
                                        non121_strategy = "drop_both_species",
                                        method = method)
knitr::kable(as.matrix(head(gene_df)))
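For intuition about what dropping non-1:1 mappings in both species means, the following base R sketch applies that idea to a toy mapping table. It is an illustration only, not orthogene's implementation: the gene names are invented and `drop_non121()` is a hypothetical helper written for this example.

```r
# Toy ortholog map (made-up gene names, not real orthogene output).
# GeneB maps to two output genes (one-to-many);
# GeneC and GeneD both map to GENECD (many-to-one).
ortholog_map <- data.frame(
  input_gene    = c("GeneA", "GeneB",  "GeneB",  "GeneC",  "GeneD"),
  ortholog_gene = c("GENEA", "GENEB1", "GENEB2", "GENECD", "GENECD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows whose input gene AND ortholog gene
# each appear exactly once in the map, i.e. strict 1:1 mappings.
drop_non121 <- function(map) {
  dup_input  <- map$input_gene %in% map$input_gene[duplicated(map$input_gene)]
  dup_output <- map$ortholog_gene %in% map$ortholog_gene[duplicated(map$ortholog_gene)]
  map[!(dup_input | dup_output), , drop = FALSE]
}

one2one <- drop_non121(ortholog_map)
print(one2one)
```

Only the GeneA/GENEA pair survives here, because every other gene takes part in a one-to-many or many-to-one relationship. Other strategies would keep some of those rows instead of discarding them.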
convert_orthologs is just one of the many useful functions in orthogene. Please see the documentation website for the full vignette.
utils::sessionInfo()
gprofiler2: orthogene uses this package. gprofiler2::gorth() pulls from many orthology mapping databases.
homologene: orthogene uses this package. Provides API access to NCBI HomoloGene database.
babelgene: orthogene uses this package. babelgene::orthologs() pulls from many orthology mapping databases.
annotationTools: For interspecies microarray data.
orthology: R package for ortholog mapping (deprecated?).
hpgltools::load_biomart_orthologs(): Helper function to get orthologs from biomart.
JustOrthologs: Ortholog inference from multi-species genomic sequences.
orthologr: Ortholog inference from multi-species genomic sequences.
OrthoFinder: Gene duplication event inference from multi-species genomics.
HomoloGene: NCBI database that the R package homologene pulls from.
gProfiler: Web server for functional enrichment analysis and conversions of gene lists.
OrtholoGene: Compiled list of gene orthology resources.
UK Dementia Research Institute
Department of Brain Sciences
Faculty of Medicine
Imperial College London
GitHub
DockerHub