reduce_dimensions-methods: Dimension reduction of the transcript abundance data
In stemangiola/tidyBulk: Brings transcriptomics to the tidyverse

reduce_dimensions

R Documentation

Dimension reduction of the transcript abundance data

Description

reduce_dimensions() takes as input A 'tbl' (with at least three columns for sample, feature and transcript abundance) or 'SummarizedExperiment' (more convenient if abstracted to tibble with library(tidySummarizedExperiment)) and calculates the reduced dimensional space of the transcript abundance.

Usage

reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

## S4 method for signature 'spec_tbl_df'
reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

## S4 method for signature 'tbl_df'
reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

## S4 method for signature 'tidybulk'
reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

## S4 method for signature 'SummarizedExperiment'
reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
reduce_dimensions(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  .dims = 2,
  top = 500,
  of_samples = TRUE,
  transform = log1p,
  scale = TRUE,
  action = "add",
  ...,
  log_transform = NULL
)

Arguments

`.data`	A 'tbl' (with at least three columns for sample, feature and transcript abundance) or 'SummarizedExperiment' (more convenient if abstracted to tibble with library(tidySummarizedExperiment))
`.element`	The name of the element column (normally samples).
`.feature`	The name of the feature column (normally transcripts/genes)
`.abundance`	The name of the column including the numerical value the clustering is based on (normally transcript abundance)
`method`	A character string. The dimension reduction algorithm to use (PCA, MDS, tSNE).
`.dims`	An integer. The number of dimensions your are interested in (e.g., 4 for returning the first four principal components).
`top`	An integer. How many top genes to select for dimensionality reduction
`of_samples`	A boolean. In case the input is a tidybulk object, it indicates Whether the element column will be sample or transcript column
`transform`	A function that will tranform the counts, by default it is log1p for RNA sequencing data, but for avoinding tranformation you can use identity
`scale`	A boolean for method="PCA", this will be passed to the 'prcomp' function. It is not included in the ... argument because although the default for 'prcomp' if FALSE, it is advisable to set it as TRUE.
`action`	A character string. Whether to join the new information to the input tbl (add), or just get the non-redundant tbl with the new information (get).
`...`	Further parameters passed to the function prcomp if you choose method="PCA" or Rtsne if you choose method="tSNE", or uwot::tumap if you choose method="umap"
`log_transform`	DEPRECATED - A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data)

Details

'r lifecycle::badge("maturing")'

This function reduces the dimensions of the transcript abundances. It can use multi-dimensional scaling (MDS; DOI.org/10.1186/gb-2010-11-3-r25), principal component analysis (PCA), or tSNE (Jesse Krijthe et al. 2018)

Underlying method for PCA: prcomp(scale = scale, ...)

Underlying method for MDS: limma::plotMDS(ndim = .dims, plot = FALSE, top = top)

Underlying method for tSNE: Rtsne::Rtsne(data, ...)

Underlying method for UMAP:

df_source = .data |>

# Filter NA symbol filter(!!.feature |> is.na() |> not()) |>

# Prepare data frame distinct(!!.feature,!!.element,!!.abundance) |>

# Filter most variable genes keep_variable_transcripts(top) |> reduce_dimensions(method="PCA", .dims = calculate_for_pca_dimensions, action="get" ) |> as_matrix(rownames = quo_name(.element)) |> uwot::tumap(...)

Value

A tbl object with additional columns for the reduced dimensions

A 'SummarizedExperiment' object

Examples




counts.MDS =
 tidybulk::se_mini |>
 identify_abundant() |>
 reduce_dimensions( method="MDS", .dims = 3)


counts.PCA =
 tidybulk::se_mini |>
 identify_abundant() |>
 reduce_dimensions(method="PCA", .dims = 3)

stemangiola/tidyBulk documentation built on Dec. 16, 2024, 7:22 p.m.