In y1zhou/brendaDb: The BRENDA Enzyme Database

knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)

Overview

brendaDb aims to make importing and analyzing data from the BRENDA database easier. Its main functions let you:

Read the text file downloaded from BRENDA into an R tibble
Retrieve information for specific enzymes
Query enzymes using their synonyms, gene symbols, etc.
Query enzyme information for specific BioCyc pathways

For bug reports or feature requests, please go to the GitHub repository.

Installation

brendaDb is a Bioconductor package and can be installed through BiocManager::install().

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("brendaDb", dependencies=TRUE)

Alternatively, install the development version from GitHub.

if(!requireNamespace("brendaDb")) {
  devtools::install_github("y1zhou/brendaDb")
}

After the package is installed, it can be loaded into the R workspace with:

library(brendaDb)

Getting Started

Downloading the BRENDA Text File

Download the BRENDA database as a text file here. Alternatively, download the file in R (file updated 2019-04-24):

brenda.filepath <- DownloadBrenda()
#> Please read the license agreement in the link below.
#>
#> https://www.brenda-enzymes.org/download_brenda_without_registration.php
#>
#> Found zip file in cache.
#> Extracting zip file...

The function downloads the file to a local cache directory. Now the text file can be loaded into R as a tibble:

df <- ReadBrenda(brenda.filepath)
#> Reading BRENDA text file...
#> Converting text into a list. This might take a while...
#> Converting list to tibble and removing duplicated entries...
#> If you're going to use this data again, consider saving this table using data.table::fwrite().

As suggested in the function output, you may save the df object to a text file using data.table::fwrite() or to an R object using save() with an explicit file argument, and load the table back using data.table::fread() or load() (fwrite() and fread() require the R package data.table to be installed). Both methods should be much faster than reading the raw text file again using ReadBrenda().
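The save-and-reload round trip can be sketched as follows. This is a minimal illustration, not output from the package: df_demo is a small made-up stand-in for the tibble returned by ReadBrenda(), and its column names are assumptions chosen only for the example.

```r
# Toy stand-in for the table returned by ReadBrenda().
# The column names and values are illustrative assumptions, not real BRENDA data.
df_demo <- data.frame(
  ID = c("1.1.1.1", "6.3.5.8"),
  field = c("PROTEIN", "PROTEIN"),
  description = c("example text A", "example text B"),
  stringsAsFactors = FALSE
)

# Binary round trip with base R; save() needs an explicit file argument.
rdata_path <- tempfile(fileext = ".RData")
save(df_demo, file = rdata_path)

# Load into a separate environment so the original object is not overwritten.
restored <- new.env()
load(rdata_path, envir = restored)
roundtrip_ok <- identical(df_demo, restored$df_demo)

# Optional text round trip, run only if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  tsv_path <- tempfile(fileext = ".tsv")
  data.table::fwrite(df_demo, tsv_path, sep = "\t")
  df_text <- data.table::fread(tsv_path, sep = "\t", data.table = FALSE)
}
```

With the real table, the same calls apply to df; only the file paths would change from temporary files to a location you want to keep.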

Making Queries

Since BRENDA is a database for enzymes, all final queries are based on EC numbers.

Query for Multiple Enzymes

If you already have a list of EC numbers in mind, you may call QueryBrenda directly:

brenda_txt <- system.file("extdata", "brenda_download_test.txt",
                          package = "brendaDb")
df <- ReadBrenda(brenda_txt)
res <- QueryBrenda(df, EC = c("1.1.1.1", "6.3.5.8"), n.core = 2)

res

res[["1.1.1.1"]]

Query Specific Fields

You can also query for certain fields to reduce the size of the returned object.

ShowFields(df)

res <- QueryBrenda(df, EC = "1.1.1.1", fields = c("PROTEIN", "SUBSTRATE_PRODUCT"))
res[["1.1.1.1"]][["interactions"]][["substrate.product"]]

Note that most fields contain a fieldInfo column and a commentary column. The fieldInfo column holds what BRENDA extracted from the literature, and the commentary column usually gives some context from the original paper. In the commentary, the IDs enclosed in # symbols correspond to the proteinIDs, and <> enclose the corresponding refIDs. For further information, please see the README file from BRENDA.
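To make the commentary markers concrete, here is a small base R sketch that pulls protein IDs and reference IDs out of a commentary string. The example string is made up for illustration, extract_ids() is a hypothetical helper that is not part of brendaDb, and the assumption that several IDs may share one marker as a comma-separated list should be checked against the BRENDA README.

```r
# Made-up commentary string in the style described above (not real BRENDA data).
commentary <- "#1# activity at pH 7.5 <12>; #3,5# mutant enzyme <4,7>"

# Hypothetical helper: find every match of `pattern`, strip the delimiters,
# and split comma-separated IDs into individual values.
extract_ids <- function(text, pattern) {
  hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  ids <- gsub("[#<>]", "", hits)
  unique(unlist(strsplit(ids, ",")))
}

protein_ids <- extract_ids(commentary, "#[0-9,]+#")  # IDs between # symbols
ref_ids <- extract_ids(commentary, "<[0-9,]+>")      # IDs between < and >
```

For this string, protein_ids holds "1", "3" and "5", and ref_ids holds "12", "4" and "7".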

Query Specific Organisms

Note the difference in row numbers in the following example and in the one where we queried for all organisms.

res <- QueryBrenda(df, EC = "1.1.1.1", organisms = "Homo sapiens")
res$`1.1.1.1`

Extract Information in Query Results

To transform the brenda.entries structure into a table, use the helper function ExtractField().

res <- QueryBrenda(df, EC = c("1.1.1.1", "6.3.5.8"), n.core = 2)
ExtractField(res, field = "parameters$ph.optimum")

As shown above, the returned table consists of three parts: the EC number, organism-related information (organism, protein ID, UniProt ID, and commentary on the organism), and extracted field information (description, commentary, etc.).

Foreign ID Retrieval

Querying Synonyms

Often we have a list of gene symbols or enzyme names instead of EC numbers. In this case, a helper function can be used to find the corresponding EC numbers:

ID2Enzyme(brenda = df, ids = c("ADH4", "CD38", "pyruvate dehydrogenase"))

The EC column can then be handpicked and used in QueryBrenda().

BioCyc Pathways

Often we are interested in the enzymes involved in a specific BioCyc pathway. As BioCyc now requires login credentials for using their web service, users are recommended to use the metabolike package for more advanced queries.

Additional Information

By default, QueryBrenda() uses all available cores, but limiting n.core often gives better performance because it reduces the overhead of parallelization. The following results were produced on a machine with 40 cores (2 Intel Xeon CPU E5-2640 v4 @ 3.4GHz) and 256 GB of RAM:

EC.numbers <- head(unique(df$ID), 100)
system.time(QueryBrenda(df, EC = EC.numbers, n.core = 0))  # default
#  user  system elapsed
# 4.528   7.856  34.567
system.time(QueryBrenda(df, EC = EC.numbers, n.core = 1))
#  user  system elapsed
# 22.080   0.360  22.438
system.time(QueryBrenda(df, EC = EC.numbers, n.core = 2))
#  user  system elapsed
# 0.552   0.400  13.597
system.time(QueryBrenda(df, EC = EC.numbers, n.core = 4))
#  user  system elapsed
# 0.688   0.832   9.517
system.time(QueryBrenda(df, EC = EC.numbers, n.core = 8))
#  user  system elapsed
# 1.112   1.476  10.000
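To compare the settings at a glance, the elapsed times reported above can be turned into speedups relative to a single core. This is only a small arithmetic sketch in base R; the numbers are copied from the timings above, and the variable names are chosen for illustration.

```r
# Elapsed times in seconds, copied from the system.time() results above,
# keyed by the n.core value that was used.
elapsed <- c("0 (default)" = 34.567, "1" = 22.438, "2" = 13.597,
             "4" = 9.517, "8" = 10.000)

# Speedup relative to the single-core run (values below 1 are slower).
speedup <- elapsed[["1"]] / elapsed

# Setting with the shortest elapsed time.
fastest <- names(elapsed)[which.min(elapsed)]

print(round(speedup, 2))
print(fastest)
```

On this machine, n.core = 4 was the fastest setting, roughly 2.4 times faster than a single core, while the default of using all cores was slower than a single core. Timings on other hardware will differ, so it is worth repeating the measurement locally.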

sessionInfo()

y1zhou/brendaDb documentation built on Dec. 12, 2022, 3:43 a.m.
