knitr::opts_chunk$set(fig.width = 8, fig.height = 4, cache = TRUE)
library(phylosmith)
data(soil_column)
Examples in this vignette use the GlobalPatterns dataset from phyloseq.
library(phyloseq)
data(GlobalPatterns)
Merges samples within a phyloseq-class object which match on the given criteria (treatment). Any sample_data factors that do not match will be set to NA. otu_table counts will be reassigned as the mean of all the samples that are merged together.
Use this with caution as replicate samples may be crucial to the experimental design and should be proven statistically to be similar enough to combine for downstream analysis.
Usage
conglomerate_samples(phyloseq_obj, treatment, subset = NULL)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
Examples
phyloseq::sample_sums(GlobalPatterns)
conglomerated <- conglomerate_samples(GlobalPatterns, treatment = 'SampleType')
phyloseq::sample_sums(conglomerated)
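To make the averaging concrete, here is a minimal base-R sketch of what merging on a treatment does to the counts. It does not call phylosmith; the toy matrix `otu` and the `group` vector are invented for illustration, and only the "mean of the merged samples" rule is taken from the description above.

```r
# Toy OTU table: taxa in rows, samples in columns (hypothetical values).
otu <- matrix(c(10, 20, 0,
                 4,  6, 8),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("OTU1", "OTU2"), c("S1", "S2", "S3")))

# Hypothetical treatment label for each sample.
group <- c(S1 = "soil", S2 = "soil", S3 = "water")

# For each treatment level, average the counts of the samples in that level.
merged <- sapply(split(colnames(otu), group),
                 function(ids) rowMeans(otu[, ids, drop = FALSE]))
merged
```

Each column of `merged` stands for one conglomerated sample, so the per-sample totals change from sums over replicates to sums of means.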
A rewrite of phyloseq::tax_glom(). This version runs faster because it is implemented with data.table.
Usage
conglomerate_taxa(phyloseq_obj, classification, hierarchical = TRUE)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
classification | Column name as a string in the tax_table for the factor to conglomerate by.
hierarchical | Whether the order of factors in the tax_table represents a decreasing hierarchy (TRUE) or the factors are independent (FALSE). If FALSE, will only return the factor given by classification.
Examples
conglomerate_taxa(GlobalPatterns, classification = 'Phylum', hierarchical = TRUE)
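The idea behind agglomerating taxa can be sketched in base R with `rowsum()`, which adds up rows that share a group label. This is only an illustration on an invented count matrix, assuming counts are summed within each classification level as phyloseq::tax_glom() does; it is not the data.table code used inside conglomerate_taxa.

```r
# Toy counts: taxa in rows, samples in columns (hypothetical values).
counts <- matrix(c(5, 1,
                   3, 2,
                   7, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2")))

# Hypothetical Phylum assignment for each taxon, in row order.
phylum <- c("Proteobacteria", "Proteobacteria", "Firmicutes")

# Sum the counts of all taxa that share a Phylum.
by_phylum <- rowsum(counts, group = phylum)
by_phylum
```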
Converts the otu_table, tax_table, and sam_data to a 2-dimensional data.table.
Usage
melt_phyloseq(phyloseq_obj)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
Examples
melt_phyloseq(GlobalPatterns)
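For readers unfamiliar with "melting", the following base-R sketch shows the shape of the result: one row per taxon and sample pair, with the taxonomy and sample metadata joined on. All objects and column names here (`counts`, `taxa`, `samples`, `Abundance`) are invented for illustration and are not necessarily the names that melt_phyloseq returns.

```r
# Toy counts with named dimensions so the long table gets readable columns.
counts <- matrix(c(5, 1,
                   3, 2),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(OTU = c("OTU1", "OTU2"),
                                 Sample = c("S1", "S2")))

# Hypothetical taxonomy and sample metadata.
taxa <- data.frame(OTU = c("OTU1", "OTU2"),
                   Phylum = c("Firmicutes", "Proteobacteria"))
samples <- data.frame(Sample = c("S1", "S2"),
                      SampleType = c("Soil", "Ocean"))

# Wide matrix to long table: one row per OTU/Sample combination.
long <- as.data.frame(as.table(counts),
                      responseName = "Abundance",
                      stringsAsFactors = FALSE)

# Attach taxonomy and sample metadata to every row.
long <- merge(long, taxa, by = "OTU")
long <- merge(long, samples, by = "Sample")
long
```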
Combines multiple columns from the sample-data into a single column. Doing this can make it easier to subset and look at the data on multiple factors.
Usage
merge_treatments(phyloseq_obj, ...)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object. It must contain sample_data() with information about each sample.
treatment | Column name as a string, or a vector of column names, in the sample_data.
Examples
merge_treatments(GlobalPatterns, c('Final_Barcode', 'Barcode_truncated_plus_T'))
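Conceptually, combining columns amounts to pasting their values together row by row. The sketch below does this on an invented metadata table in base R; the new column name and the space separator are assumptions made for the example, not necessarily what merge_treatments produces.

```r
# Hypothetical sample metadata with two factors.
meta <- data.frame(Treatment = c("A", "A", "B"),
                   Day = c("0", "7", "7"),
                   stringsAsFactors = FALSE)

# Combine the two columns into one (separator chosen for illustration).
meta$Treatment_Day <- paste(meta$Treatment, meta$Day, sep = " ")
meta
```

The combined column can then be used wherever a single treatment column is expected, for example to subset on one Treatment and Day combination.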
Arranges the phyloseq object so that the samples are listed in a given order, or sorted on metadata. This is most useful for visual inspection of the metadata, and for having the samples presented in the correct order in ggplot2 figures.
Usage
set_sample_order(phyloseq_obj, treatment)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
Examples
phyloseq::sample_names(GlobalPatterns)
ordered_obj <- set_sample_order(GlobalPatterns, "SampleType")
phyloseq::sample_names(ordered_obj)
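Sorting samples on metadata boils down to computing a row order from the metadata column. A minimal base-R sketch, using an invented metadata table rather than a phyloseq object:

```r
# Hypothetical sample metadata; row names are the sample names.
meta <- data.frame(SampleType = c("Soil", "Feces", "Ocean", "Feces"),
                   row.names = c("S1", "S2", "S3", "S4"),
                   stringsAsFactors = FALSE)

# Sample names rearranged so that samples of the same type sit together.
new_order <- rownames(meta)[order(meta$SampleType)]
new_order
```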
Sets the order of the levels of a factor in the sample-data. Primarily useful for controlling the order in which ggplot2 will display samples.
Usage
set_treatment_levels(phyloseq_obj, treatment, order)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
order | The order of factors in the treatment column as a vector of strings. If set to "numeric", the levels are put in ascending numerical order.
Examples
levels(soil_column@sam_data$Day)
ordered_days <- set_treatment_levels(soil_column, 'Day', 'numeric')
levels(ordered_days@sam_data$Day)
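The reason a "numeric" ordering is needed is that factor levels made from numbers stored as text sort alphabetically by default, so "10" lands before "2". The base-R sketch below reproduces the problem and one way to fix it on an invented vector; it illustrates the idea only and is not the package's own code.

```r
# Days stored as text, as often happens in sample metadata.
day <- factor(c("10", "2", "1", "2"))
default_levels <- levels(day)   # alphabetical: "10" sorts before "2"

# Re-level using the numeric value of each level, ascending.
numeric_levels <- as.character(sort(as.numeric(levels(day))))
day_ordered <- factor(day, levels = numeric_levels)
levels(day_ordered)
```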
Creates a new phyloseq object containing the defined taxa. Taxa names can be a substring or an entire taxa name. It will match that string in all taxa levels unless a specific classification level is declared.
Usage
taxa_extract(phyloseq_obj, taxa_to_extract, classification = NULL)
Arguments
Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_extract | A string, or vector of strings, naming the taxa of interest.
classification | Column name as a string in the tax_table for the level to match in. If NULL, all levels are searched.
Examples
GlobalPatterns
taxa_extract(GlobalPatterns, c("Cyano", "Proteo", "Actinobacteria"))
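Substring matching across every taxonomic level can be sketched in base R with `grepl()`. The taxonomy table below is invented, and details such as case sensitivity are assumptions of this sketch rather than documented behaviour of taxa_extract.

```r
# Hypothetical taxonomy table: one row per taxon, one column per level.
tax <- data.frame(Phylum = c("Cyanobacteria", "Proteobacteria", "Firmicutes"),
                  Class = c("Chloroplast", "Alphaproteobacteria", "Bacilli"),
                  row.names = c("OTU1", "OTU2", "OTU3"),
                  stringsAsFactors = FALSE)

# Substrings of interest, combined into one regular expression.
patterns <- c("Cyano", "Proteo")
pattern <- paste(patterns, collapse = "|")

# A taxon is kept if any of its levels contains any of the substrings.
hits <- apply(tax, 1, function(lineage) any(grepl(pattern, lineage)))
extracted <- rownames(tax)[hits]
extracted
```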
This is a robust function that is used in nearly every other function of this package. It uses many of the subsetting processes distributed within phyloseq, but strives to make them more user-friendly and to combine them into a one-stop function. The function works in several steps.
1. Checks whether treatments were specified. If so, it splits the phyloseq object into separate objects for each treatment to process.
2. Filters by frequency (filtering out taxa seen in few samples) and then merges back into one object.
3. If subset is declared, removes all treatments outside of the subset.
4. If drop_samples is TRUE, removes any samples that have 0 taxa observed after filtering (this is a very situational need).
If frequency is set to 0 (default), then the function removes any taxa with no abundance in any sample.
Usage
taxa_filter(phyloseq_obj, treatment = NULL, subset = NULL, frequency = 0, below = FALSE, drop_samples = FALSE)
Arguments
Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
frequency | The proportion of samples the taxa is found in.
below | Does frequency define the minimum (FALSE) or maximum (TRUE) proportion of samples the taxa is found in.
drop_samples | Should the function remove samples that are empty after removing taxa filtered by frequency (TRUE).
Examples
The GlobalPatterns data has 19,216 OTUs listed in its tax_table.
GlobalPatterns
However, 228 of those taxa are not actually seen in any of the samples.
length(phyloseq::taxa_sums(GlobalPatterns)[phyloseq::taxa_sums(GlobalPatterns) == 0])
taxa_filter with frequency = 0 will remove those taxa.
taxa_filter(GlobalPatterns, frequency = 0)
Say that we wanted to only look at taxa that are seen in 80% of the samples.
taxa_filter(GlobalPatterns, frequency = 0.80)
But if we want taxa that are seen in 80% of any one treatment group:
taxa_filter(GlobalPatterns, frequency = 0.80, treatment = 'SampleType')
It returns a larger number of taxa, since they need to be seen in fewer samples overall.
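The difference between filtering across all samples and filtering within treatment groups can be seen on a small invented matrix. The sketch below is base R only; the `prevalence` helper and the use of `>=` for the threshold are assumptions for illustration, not the package's implementation.

```r
# Toy OTU table: taxa in rows, samples in columns (hypothetical values).
otu <- matrix(c(3, 5, 0, 0,   # OTU1: present in both soil samples only
                1, 0, 2, 0,   # OTU2: present in one sample of each group
                4, 2, 6, 1),  # OTU3: present in every sample
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("OTU", 1:3), paste0("S", 1:4)))
group <- c("soil", "soil", "water", "water")

# Proportion of samples in which each taxon has a non-zero count.
prevalence <- function(m) rowMeans(m > 0)

# Threshold applied across all samples at once.
overall <- names(which(prevalence(otu) >= 0.8))

# Threshold applied within each group; keep taxa that pass in any group.
per_group <- sapply(unique(group), function(g) {
  prevalence(otu[, group == g, drop = FALSE]) >= 0.8
})
any_group <- rownames(otu)[rowSums(per_group) > 0]

overall
any_group
```

Here OTU1 fails the overall threshold but passes within the soil group, which is why the per-treatment filter keeps more taxa.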
Creates a new phyloseq object omitting the defined taxa. Taxa names can be a substring or an entire taxa name. It will match that string in all taxa levels unless a specific classification level is declared.
Usage
taxa_prune(phyloseq_obj, taxa_to_remove, classification = NULL)
Arguments
Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_remove | A string, or vector of strings, naming the taxa to remove.
classification | Column name as a string in the tax_table for the level to match in. If NULL, all levels are searched.
Examples
GlobalPatterns
taxa_prune(GlobalPatterns, c("Cyano", "Proteo", "Actinobacteria"))
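The effect of declaring a classification level can be illustrated in base R: the same substrings remove different taxa depending on whether every level or a single column is searched. The taxonomy table is invented and the matching rules (regular expression, case sensitive) are assumptions of this sketch.

```r
# Hypothetical taxonomy table: one row per taxon, one column per level.
tax <- data.frame(Phylum = c("Cyanobacteria", "Proteobacteria", "Firmicutes"),
                  Class = c("Chloroplast", "Alphaproteobacteria", "Bacilli"),
                  row.names = c("OTU1", "OTU2", "OTU3"),
                  stringsAsFactors = FALSE)

to_remove <- c("Cyano", "Bacilli")
pattern <- paste(to_remove, collapse = "|")

# Searching every level: a match in any column removes the taxon.
drop_any <- apply(tax, 1, function(lineage) any(grepl(pattern, lineage)))
kept_all_levels <- rownames(tax)[!drop_any]

# Searching only the Phylum column: "Bacilli" is a Class, so OTU3 stays.
drop_phylum <- grepl(pattern, tax$Phylum)
kept_phylum_only <- rownames(tax)[!drop_phylum]

kept_all_levels
kept_phylum_only
```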