To assess the quality of a set of (predicted) genes for a genome, evidence must first be mapped to that genome.
Next, each gene must be categorized based on how strong the evidence is for or against that gene. Class Assessment
furnishes objects that store the information needed to assess a set of genes for a genome, and it also provides
functions for viewing and visualizing assessment information. Specifically, class Assessment objects use proteomic hits
and evolutionarily conserved start and stop codons as evidence to determine the correctness of each gene in a given set.
DataMap Objects
Objects of class Assessment and subclass DataMap are used to store the mapping of proteomics and evolutionary
conservation evidence to the genome of interest (the central genome). They are generated by the function
MapAssessmentData, and they have a list structure containing the following elements:
StrainID
Equal to strainID if it was specified; otherwise ""
Species
Equal to speciesName if it was specified; otherwise ""
GenomeLength
Length of the central genome
StopsByFrame
Where the stops are in each frame, used to bound open reading frames in downstream functions
N-TermProteomics
Logical describing whether or not the proteomics hits are from N-terminal proteomics
FwdProtHits
Proteomic hit information that maps to the three forward frames of the central genome
RevProtHits
Proteomic hit information that maps to the three reverse frames of the central genome
FwdCoverage
Coverage of the forward strand of the central genome
FwdConStarts
Start codon conservation of the forward strand of the central genome
FwdConStops
Stop codon conservation of the forward strand of the central genome
RevCoverage
Coverage of the reverse strand of the central genome
RevConStarts
Start codon conservation of the reverse strand of the central genome
RevConStops
Stop codon conservation of the reverse strand of the central genome
NumRelatedGenomes
Final number of related genomes that were mapped to the central genome
HasProteomics
Logical describing whether or not proteomics evidence has been mapped to the central genome
HasConservation
Logical describing whether or not evolutionary conservation evidence has been mapped to the central genome
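The list structure described above can be explored with ordinary list accessors. The sketch below builds a small mock object by hand rather than calling MapAssessmentData, so it runs without the package; the values are placeholders, only a few elements are included, and the ordering of the class vector is an assumption.

```r
# Hand-built mock of a DataMap-like list (NOT output from MapAssessmentData).
# All values are placeholders chosen only for illustration.
mock_map <- structure(
  list(
    StrainID = "",
    Species = "",
    GenomeLength = 5000L,
    NumRelatedGenomes = 0L,
    HasProteomics = TRUE,
    HasConservation = FALSE
  ),
  class = c("Assessment", "DataMap")  # assumed ordering of the class vector
)

# Elements are reached with the usual list accessors.
genome_length <- mock_map$GenomeLength
has_prot <- mock_map[["HasProteomics"]]

# Element names containing "-" (such as N-TermProteomics) need [[ ]] or
# backticks, e.g. obj[["N-TermProteomics"]].

# Report which kinds of evidence the mock object claims to hold.
evidence <- c(proteomics = mock_map$HasProteomics,
              conservation = mock_map$HasConservation)
available <- names(evidence)[evidence]
```

Checking HasProteomics and HasConservation first is a cheap way to tell which of the hit, coverage, and conservation elements are worth inspecting.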
Results Objects
Objects of class Assessment and subclass Results are used to store how correct a set of genes is for a given genome.
The function AssessGenes generates a Results object using a DataMap object and information on a set of genes
for the genome corresponding to that DataMap object. Results objects have a list structure containing the following
elements:
StrainID
Equal to the strainID of the corresponding DataMap object
Species
Equal to the speciesName of the corresponding DataMap object
GenomeLength
Length of the genome
GeneLeftPos
Left positions of the given set of genes (in forward strand terms)
GeneRightPos
Right positions of the given set of genes (in forward strand terms)
GeneStrand
Strand information of the given set of genes ("+" or "-")
GeneSource
The source of the given set of genes
NumGenes
Number of genes given
N_CS-_PE+_ORFs
Data for open reading frames with no gene start but with proteomics evidence
N_CS<_PE+_ORFs
Data for open reading frames with no gene start but with proteomics evidence and at least one valid evolutionarily conserved start
CategoryAssignments
A character vector that stores the category assignment for each of the given genes in the same order as the gene information (please see below for a list of all possible categories, their descriptions, and their character string codes)
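Because the gene position, strand, and category elements are parallel vectors in the same gene order, they can be combined into one table for inspection. The following is a minimal sketch on a hand-built mock object (not output from AssessGenes); the positions and categories are invented, and the length calculation assumes that both positions are inclusive.

```r
# Hand-built mock of a Results-like list (NOT output from AssessGenes).
mock_results <- structure(
  list(
    GeneLeftPos = c(101L, 950L, 2400L),
    GeneRightPos = c(700L, 1800L, 3100L),
    GeneStrand = c("+", "-", "+"),
    NumGenes = 3L,
    CategoryAssignments = c("Y CS+ PE+", "Y CS- PE-", "Y CS+ PE+")
  ),
  class = c("Assessment", "Results")  # assumed ordering of the class vector
)

# Combine the parallel per-gene vectors into one data frame.
genes <- data.frame(
  left = mock_results$GeneLeftPos,
  right = mock_results$GeneRightPos,
  strand = mock_results$GeneStrand,
  category = mock_results$CategoryAssignments,
  stringsAsFactors = FALSE
)

# Gene length in nucleotides, assuming inclusive left and right positions.
genes$length <- genes$right - genes$left + 1L

# Count how many genes fall into each category.
category_counts <- table(genes$category)
```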
Gene Categories
The CategoryAssignments vector in Results objects describes how the proteomics evidence and the evolutionarily
conserved start/stop codon evidence support or disprove the corresponding set of genes. In the vector, each gene is assigned
a character string code with the following format: "Y CS[_] PE[_]". The first part, the "Y", signifies that the ORF
contains a predicted gene. The second part, the "CS[_]", describes how the conserved start(s) line up with the given gene
start. The third part, the "PE[_]", describes how the proteomics hits line up with the given gene start.
Y CS+ PE+
There is a good conserved start aligned with the gene start with protein evidence downstream.
Y CS+ PE-
There is a good conserved start aligned with the gene start without protein evidence downstream.
Y CS- PE+
There is no good conserved start aligned with the predicted start, and there is protein evidence downstream of the gene start.
Y CS- PE-
There is no good conserved start aligned with the predicted start, and there is no protein evidence downstream of the gene start.
Y CS! PE-
There are either multiple good conserved stops in the middle of the gene, or the most downstream, good conserved stop is followed by a good conserved start. There is no protein evidence downstream of the gene start.
Y CS! PE+
The most downstream, good conserved stop is followed by a good conserved start, and there is protein evidence downstream of the gene start.
Y CS< PE!
The protein evidence disagrees with/is upstream of the gene start, and there is a good conserved start upstream of the protein evidence.
Y CS- PE!
The protein evidence disagrees with/is upstream of the gene start, and there is no good conserved start upstream of the protein evidence.
Y CS> PE+
The best conserved starts are downstream of the predicted start, and there is protein evidence downstream of the gene start.
Y CS> PE-
The best conserved starts are downstream of the predicted start, and there is no protein evidence downstream of the gene start.
Y CS< PE+
At least one of the best conserved starts is upstream of the predicted start, and there is protein evidence downstream of the gene start.
Y CS< PE-
At least one of the best conserved starts is upstream of the predicted start, and there is no protein evidence downstream of the gene start.
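Since every code listed above follows the "Y CS[_] PE[_]" pattern, the conserved-start and proteomics symbols can be separated with a regular expression and then cross-tabulated. This is a small sketch, not part of the package: parse_category is a hypothetical helper, and the accepted symbols are taken only from the codes listed above.

```r
# Example codes taken from the category list above.
codes <- c("Y CS+ PE+", "Y CS- PE!", "Y CS< PE-", "Y CS! PE+")

# Hypothetical helper: split each code into its CS and PE symbols.
# Codes that do not match the expected pattern yield NA in both columns.
parse_category <- function(x) {
  pattern <- "^Y CS([-+!<>]) PE([-+!])$"
  ok <- grepl(pattern, x)
  data.frame(
    code = x,
    cs = ifelse(ok, sub(pattern, "\\1", x), NA_character_),
    pe = ifelse(ok, sub(pattern, "\\2", x), NA_character_),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_category(codes)

# Cross-tabulate conserved-start symbols against proteomics symbols.
cs_by_pe <- table(CS = parsed$cs, PE = parsed$pe)
```

Keeping the two symbols in separate columns makes it easy to ask questions such as how many genes have any conflicting proteomics evidence (PE equal to "!") regardless of their conserved-start status.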
S3 Methods
as.matrix.Assessment (only works with objects of class Results)
print.Assessment
plot.Assessment
mosaicplot.Assessment (only works with objects of class Results)
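The methods above are registered on the shared Assessment class, yet some accept only the Results subclass. One common way to get that behavior in S3 is to dispatch on the parent class and check the subclass with inherits. The sketch below illustrates the pattern with a hypothetical generic, category_matrix, and mock objects; it is not the package's implementation of as.matrix.Assessment or mosaicplot.Assessment.

```r
# Hypothetical generic and method, defined here only for illustration.
category_matrix <- function(x, ...) UseMethod("category_matrix")

category_matrix.Assessment <- function(x, ...) {
  # Dispatch happens on "Assessment"; the subclass is checked explicitly.
  if (!inherits(x, "Results")) {
    stop("this method only works with objects of subclass 'Results'")
  }
  counts <- table(x$CategoryAssignments)
  matrix(as.integer(counts), ncol = 1L,
         dimnames = list(names(counts), "Count"))
}

# Mock objects (NOT produced by AssessGenes or MapAssessmentData).
res <- structure(
  list(CategoryAssignments = c("Y CS+ PE+", "Y CS+ PE+", "Y CS- PE-")),
  class = c("Assessment", "Results")
)
dm <- structure(list(), class = c("Assessment", "DataMap"))

m <- category_matrix(res)

# Calling the method on a DataMap-like object raises an error.
failed <- inherits(try(category_matrix(dm), silent = TRUE), "try-error")
```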