classification: Decoding and Encoding Phylogenetic Classification Annotations

classification    R Documentation

Decoding and Encoding Phylogenetic Classification Annotations

Description

Functions to represent, decode and encode phylogenetic classification annotations used in FASTA files by RDP and the Greengenes project.

Usage

decode_Greengenes(annotation)

GenClass16S(
  Kingdom = NA,
  Phylum = NA,
  Class = NA,
  Order = NA,
  Family = NA,
  Genus = NA,
  Species = NA,
  Otu = NA,
  Org_name = NA,
  Id = NA
)

encode_Greengenes(classification)

decode_RDP(annotation)

encode_RDP(classification)

Arguments

annotation

Annotation from a FASTA file containing the classification information.

Kingdom

Name of the kingdom to which the organism belongs.

Phylum

Name of the phylum to which the organism belongs.

Class

Name of the class to which the organism belongs.

Order

Name of the order to which the organism belongs.

Family

Name of the family to which the organism belongs.

Genus

Name of the genus to which the organism belongs.

Species

Name of the species to which the organism belongs.

Otu

Name of the OTU (operational taxonomic unit) to which the organism belongs.

Org_name

Name of the organism.

Id

ID of the sequence.

classification

A data.frame created with GenClass16S() with the classification information.

Value

GenClass16S(), decode_Greengenes() and decode_RDP() return a data.frame. encode_Greengenes() and encode_RDP() return the corresponding annotation as a character string.
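The round trip described above can be sketched as follows. This is a minimal illustration, assuming the rRDP package and its dependencies are installed; the taxon names and the sequence ID are made up for the example, and the exact layout of the encoded strings is not shown because it depends on the package.

```r
# Sketch only: the guard skips the example if rRDP cannot be loaded.
if (requireNamespace("rRDP", quietly = TRUE)) {
  library("rRDP")

  # Build a classification for a single, made-up sequence.
  # Ranks that are not supplied keep their default value of NA.
  cls <- GenClass16S(
    Kingdom = "Bacteria",
    Phylum  = "Proteobacteria",
    Class   = "Gammaproteobacteria",
    Order   = "Enterobacterales",
    Family  = "Enterobacteriaceae",
    Genus   = "Escherichia",
    Id      = "seq_0001"  # hypothetical sequence ID
  )

  # The classification is an ordinary data.frame with one column per rank.
  print(is.data.frame(cls))

  # Encode the same classification in either annotation format.
  print(encode_Greengenes(cls))
  print(encode_RDP(cls))
}
```

Because the classification is a plain data.frame, individual ranks can be inspected or modified with the usual data.frame indexing before encoding.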

Examples


### load rRDP; readRNAStringSet() is provided by Biostrings
library("rRDP")
library("Biostrings")

seq <- readRNAStringSet(system.file("examples/RNA_example.fasta",
    package = "rRDP"
))

### the FASTA annotation is read in as the sequence names. This data uses the
### Greengenes annotation format
names(seq)

classification <- decode_Greengenes(names(seq))
classification

### look at the Genus of all sequences
classification[, "Genus"]

### to train the RDP classifier, the annotations need to be in RDP format
annotation <- encode_RDP(classification)
names(seq) <- annotation
seq

### now we can train the classifier
customRDP <- trainRDP(seq)
customRDP

### clean up
removeRDP(customRDP)
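
To make the idea of decoding an annotation more concrete, the following base-R sketch splits a Greengenes-style lineage string into one column per rank. It assumes the rank prefixes commonly seen in Greengenes taxonomy strings (k__, p__, c__, o__, f__, g__, s__) separated by semicolons. The helper parse_lineage() and the annotation string are hypothetical, and this is not the implementation used by decode_Greengenes().

```r
# Hypothetical helper: split a Greengenes-style lineage into named ranks.
# Assumes "prefix__name" fields separated by ";" (an assumption about the
# format, not taken from the rRDP sources).
parse_lineage <- function(lineage) {
  ranks <- c(k = "Kingdom", p = "Phylum", c = "Class", o = "Order",
             f = "Family", g = "Genus", s = "Species")

  # Split on ";" and strip surrounding whitespace from each field.
  fields <- trimws(strsplit(lineage, ";", fixed = TRUE)[[1]])

  # Start with NA for every rank, mirroring the NA defaults of GenClass16S().
  out <- setNames(rep(NA_character_, length(ranks)), ranks)

  for (field in fields) {
    parts <- strsplit(field, "__", fixed = TRUE)[[1]]
    # Keep only well-formed fields with a known prefix and a non-empty name;
    # an empty rank such as "g__" is left as NA.
    if (length(parts) == 2 && parts[1] %in% names(ranks) && nzchar(parts[2])) {
      out[ranks[[parts[1]]]] <- parts[2]
    }
  }

  as.data.frame(as.list(out), stringsAsFactors = FALSE)
}

# Made-up annotation for illustration only.
lineage_example <- "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__; g__; s__"
result <- parse_lineage(lineage_example)

result[, "Order"]          # "Lactobacillales"
is.na(result[, "Family"])  # TRUE, because the family field is empty
```

Returning a one-row data.frame keeps the result in the same shape as the classification objects used above, so unresolved ranks show up as NA rather than as empty strings.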

mhahsler/rRDP documentation built on April 29, 2024, 9:11 a.m.