In schuyler-smith/phyloschuyler: Functions to help analyze data as phyloseq objects

knitr::opts_chunk$set(fig.width=8, fig.height=4, cache = TRUE)
library(phylosmith)
data(soil_column)

Most examples in this vignette use the GlobalPatterns dataset from phyloseq; the set_treatment_levels example uses the soil_column dataset loaded above.

library(phyloseq)
data(GlobalPatterns)

conglomerate_samples

Merges samples within a phyloseq-class object which match on the given criteria (treatment). Any sample_data factors that do not match will be set to NA. otu_table counts will be reassigned as the mean of all the samples that are merged together.

Use this with caution as replicate samples may be crucial to the experimental design and should be proven statistically to be similar enough to combine for downstream analysis.

Usage

conglomerate_samples(phyloseq_obj, treatment, subset = NULL)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.

Examples

phyloseq::sample_sums(GlobalPatterns)
conglomerated <- conglomerate_samples(GlobalPatterns, treatment = 'SampleType')
phyloseq::sample_sums(conglomerated)
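The averaging described above can be illustrated on a plain matrix. This is a minimal base-R sketch of the idea, not the package's implementation; the toy `otu` matrix and `treatment` labels are made up for illustration.

```r
# Toy OTU table: rows are taxa, columns are samples (hypothetical data).
otu <- matrix(
  c(10, 20, 0, 4,
     2,  4, 6, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2"), c("S1", "S2", "S3", "S4"))
)

# Treatment label for each sample, in column order.
treatment <- c(S1 = "soil", S2 = "soil", S3 = "water", S4 = "water")

# Group the sample names by treatment, then average the counts of each
# taxon across the samples in a group.
merged <- sapply(
  split(colnames(otu), treatment),
  function(samples) rowMeans(otu[, samples, drop = FALSE])
)
merged
```

Each column of `merged` stands for one merged sample, which is why the sample sums reported after conglomeration are on the scale of a single replicate rather than a total.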

conglomerate_taxa

A rewrite of phyloseq::tax_glom() that runs faster by using data.table.

Usage

conglomerate_taxa(phyloseq_obj, classification, hierarchical = TRUE)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
classification | Column name as a string in the tax_table for the factor to conglomerate by.
hierarchical | Whether the order of factors in the tax_table represents a decreasing hierarchy (TRUE) or the factors are independent (FALSE). If FALSE, will only return the factor given by classification.

Examples

conglomerate_taxa(GlobalPatterns, classification = 'Phylum', hierarchical = TRUE)
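As a rough picture of what conglomerating by a classification level means, the base-R sketch below sums the counts of taxa that share a phylum. It assumes abundances are summed (as phyloseq::tax_glom() does) and uses made-up data; it is not how the package does it internally.

```r
# Toy OTU table: rows are taxa, columns are samples (hypothetical data).
otu <- matrix(
  c(1, 2,
    3, 4,
    5, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2"))
)

# Phylum assignment for each taxon (hypothetical).
phylum <- c(OTU1 = "Firmicutes", OTU2 = "Firmicutes", OTU3 = "Proteobacteria")

# Sum the rows that share a phylum; one row per phylum remains.
by_phylum <- rowsum(otu, group = phylum[rownames(otu)])
by_phylum
```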

melt_phyloseq

Converts the otu_table, tax_table, and sam_data to a 2-dimensional data.table.

Usage

melt_phyloseq(phyloseq_obj)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.

Examples

melt_phyloseq(GlobalPatterns)
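To show what a "melted" 2-dimensional table looks like, here is a small base-R sketch that reshapes a count matrix to one row per taxon and sample, then joins taxonomy and sample metadata. The column names (`OTU`, `Sample`, `Abundance`) and data are assumptions for illustration, and the actual function returns a data.table rather than a data.frame.

```r
# Toy inputs (hypothetical): counts, taxonomy, and sample metadata.
otu <- matrix(
  c(10, 0, 3, 7), nrow = 2,
  dimnames = list(OTU = c("OTU1", "OTU2"), Sample = c("S1", "S2"))
)
tax <- data.frame(OTU = c("OTU1", "OTU2"),
                  Phylum = c("Firmicutes", "Proteobacteria"))
sam <- data.frame(Sample = c("S1", "S2"),
                  SampleType = c("Soil", "Ocean"))

# One row per (OTU, Sample) pair with its count.
long <- as.data.frame(as.table(otu), responseName = "Abundance")

# Attach taxonomy by OTU and metadata by Sample.
long <- merge(merge(long, tax, by = "OTU"), sam, by = "Sample")
long
```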

merge_treatments

Combines multiple columns from the sample-data into a single column. Doing this can make it easier to subset and look at the data on multiple factors.

Usage

merge_treatments(phyloseq_obj, ...)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object. It must contain sample_data() with information about each sample.
treatment | Column name as a string, or a vector of column names, in the sample_data.

Examples

merge_treatments(GlobalPatterns, c('Final_Barcode', 'Barcode_truncated_plus_T'))
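Conceptually, merging treatments pastes the values of several metadata columns into one new column. The base-R sketch below shows that on a toy data frame; the underscore separator and the name of the new column are assumptions, and the package may choose differently.

```r
# Toy sample metadata (hypothetical).
sample_data <- data.frame(
  Site = c("A", "A", "B"),
  Day  = c(1, 7, 1),
  row.names = c("S1", "S2", "S3")
)

# Combine the two columns into one; separator and column name are assumed.
sample_data$Site_Day <- paste(sample_data$Site, sample_data$Day, sep = "_")
sample_data
```

A combined column like this can then be used anywhere a single treatment column is expected, for example to subset on one site-by-day combination.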

set_sample_order

Arranges the phyloseq object so that the samples are listed in a given order, or sorted on metadata. This is most useful for visual inspection of the metadata, and for having the samples presented in the correct order in ggplot2 figures.

Usage

set_sample_order(phyloseq_obj, treatment)

Arguments

Examples

phyloseq::sample_names(GlobalPatterns)
ordered_obj <- set_sample_order(GlobalPatterns, "SampleType")
phyloseq::sample_names(ordered_obj)

set_treatment_levels

Set the order of the levels of a factor in the sample-data. Primarily useful for easy formatting of the order that ggplot2 will display samples.

Useful for:

managing the order in which variables appear in figures

Usage

set_treatment_levels(phyloseq_obj, treatment, order)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
order | The order of factors in the treatment column as a vector of strings. If assigned "numeric", will set ascending numerical order.

Examples

levels(soil_column@sam_data$Day)
ordered_days <- set_treatment_levels(soil_column, 'Day', 'numeric')
levels(ordered_days@sam_data$Day)
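The reason a "numeric" ordering is needed is that factor levels built from number-like strings sort as text by default. The base-R sketch below illustrates the problem and one way to reorder; the `day` vector is made up, and this is only an illustration of the idea, not the package's code.

```r
# Number-like labels stored as strings (hypothetical data).
day <- factor(c("10", "2", "1", "10", "2"))

# Default levels are sorted as text, so "10" comes before "2".
levels(day)

# Sort the levels by their numeric value, then rebuild the factor.
numeric_order <- as.character(sort(as.numeric(levels(day))))
day <- factor(day, levels = numeric_order)
levels(day)
```

Because ggplot2 draws discrete axes and legends in level order, the reordered factor plots days as 1, 2, 10.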

taxa_extract

Creates a new phyloseq object containing only the specified taxa. Each name can be a substring or an entire taxon name. It will match that string in all taxa levels unless a specific classification level is declared.

Useful for:

looking at specific taxa of interest

Usage

taxa_extract(phyloseq_obj, taxa_to_extract, classification = NULL)

Arguments

Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_extract | A string, or vector of strings, naming the taxa of interest.
classification | Column name as a string in the tax_table; if given, matching is restricted to that classification level.

Examples

GlobalPatterns
taxa_extract(GlobalPatterns, c("Cyano", "Proteo","Actinobacteria"))
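Substring matching across all taxonomic levels can be pictured with a toy taxonomy table. The base-R sketch below flags a taxon when any of its classification columns contains one of the search strings; the data are made up and the use of case-sensitive regular-expression matching is an assumption, not a statement about the package's internals.

```r
# Toy taxonomy table (hypothetical).
tax <- data.frame(
  Phylum = c("Cyanobacteria", "Proteobacteria", "Firmicutes", "Actinobacteria"),
  Class  = c("Chloroplast", "Gammaproteobacteria", "Bacilli", "Actinobacteria"),
  row.names = c("OTU1", "OTU2", "OTU3", "OTU4"),
  stringsAsFactors = FALSE
)

# Search strings, combined into a single alternation pattern.
patterns <- c("Cyano", "Proteo")
pattern <- paste(patterns, collapse = "|")

# TRUE for a taxon if any of its classification columns matches.
hits <- apply(tax, 1, function(row) any(grepl(pattern, row)))

extracted <- tax[hits, ]    # rows an extract would keep
remaining <- tax[!hits, ]   # rows a prune with the same strings would keep
extracted
```

Restricting the match to one classification level would amount to running `grepl()` on that single column instead of on every column.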

taxa_filter

This is a robust function that is used by nearly every other function in this package. It uses many of the subsetting processes distributed within phyloseq, but strives to make them more user-friendly by combining them into a one-stop function. The function works in several steps.

Checks whether treatments were specified. If so, it splits the phyloseq object into a separate object for each treatment to process.
Checks which taxa are seen in a proportion of samples greater than frequency within each phyloseq object (filtering out taxa seen in few samples), then merges the results back into one object.
If subset is declared, removes all treatments outside of the subset.
If drop_samples is TRUE, removes any samples that have 0 taxa observed after filtering (this is a very situational need).

If frequency is set to 0 (default), then the function removes any taxa with no abundance in any sample.

Useful for:

subsetting by sample_data factors
removing low-presence taxa
removing high-presence taxa

Usage

taxa_filter(phyloseq_obj, treatment = NULL, subset = NULL, frequency = 0, below = FALSE, drop_samples = FALSE)

Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name as a string, or a vector of column names, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
frequency | The proportion of samples the taxa is found in.
below | Whether frequency defines the minimum (FALSE) or maximum (TRUE) proportion of samples the taxa is found in.
drop_samples | Whether the function should remove samples that are empty after removing taxa filtered by frequency (TRUE).

Examples

The GlobalPatterns data has 19,216 OTUs listed in its tax_table.

GlobalPatterns

However, 228 of those taxa are not actually seen in any of the samples.

length(phyloseq::taxa_sums(GlobalPatterns)[phyloseq::taxa_sums(GlobalPatterns) == 0])

taxa_filter with frequency = 0 will remove those taxa.

taxa_filter(GlobalPatterns, frequency = 0)

Say that we wanted to only look at taxa that are seen in 80% of the samples.

taxa_filter(GlobalPatterns, frequency = 0.80)

But if we want taxa that are seen in 80% of the samples in any one treatment group:

taxa_filter(GlobalPatterns, frequency = 0.80, treatment = 'SampleType')

It returns a larger number of taxa, since they need to be seen in fewer samples overall.
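The difference between filtering across all samples and filtering within treatment groups can be reproduced on a small matrix. This is a base-R sketch of the logic described in the steps above, using made-up data and the "proportion > frequency" rule stated there; it is not the package's own code.

```r
# Toy OTU table: rows are taxa, columns are samples (hypothetical data).
otu <- matrix(
  c(5, 3, 0, 0,    # OTU1: present in both soil samples only
    1, 0, 2, 0,    # OTU2: present in half of each group
    0, 0, 0, 0,    # OTU3: never observed
    4, 4, 4, 4),   # OTU4: present in every sample
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4))
)
treatment <- c("soil", "soil", "water", "water")
frequency <- 0.8

# Proportion of all samples in which each taxon is present.
overall <- rowMeans(otu > 0)
keep_nonzero <- names(overall)[overall > 0]          # like frequency = 0
keep_overall <- names(overall)[overall > frequency]  # no treatment given

# The same proportion computed inside each treatment group.
by_group <- sapply(
  split(seq_len(ncol(otu)), treatment),
  function(idx) rowMeans(otu[, idx, drop = FALSE] > 0)
)

# Keep a taxon if it passes the threshold in at least one group.
keep_by_group <- rownames(by_group)[apply(by_group > frequency, 1, any)]

keep_overall
keep_by_group
```

In this toy case OTU1 is dropped by the overall filter but kept by the per-treatment filter, because it is present in every soil sample even though it appears in only half of all samples.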

taxa_prune

Creates a new phyloseq object omitting the specified taxa. Each name can be a substring or an entire taxon name. It will match that string in all taxa levels unless a specific classification level is declared.

Useful for:

removing specific taxa that are not of interest

Usage

taxa_prune(phyloseq_obj, taxa_to_remove, classification = NULL)

Arguments

Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_remove | A string, or vector of strings, naming the taxa to remove.
classification | Column name as a string in the tax_table; if given, matching is restricted to that classification level.

Examples

GlobalPatterns
taxa_prune(GlobalPatterns, c("Cyano", "Proteo","Actinobacteria"))

schuyler-smith/phyloschuyler documentation built on Aug. 16, 2024, 5:36 a.m.
