clusterBanksy: Perform clustering in BANKSY's neighborhood-augmented feature...

View source: R/cluster.R

clusterBanksy    R Documentation

Perform clustering in BANKSY's neighborhood-augmented feature space.

Description

Perform clustering in BANKSY's neighborhood-augmented feature space.

Usage

clusterBanksy(
  se,
  use_agf = FALSE,
  lambda = 0.2,
  use_pcs = TRUE,
  npcs = 20L,
  dimred = NULL,
  ndims = NULL,
  assay_name = NULL,
  group = NULL,
  algo = c("leiden", "louvain", "kmeans", "mclust"),
  k_neighbors = 50,
  resolution = 1,
  leiden.iter = -1,
  kmeans.centers = 5,
  mclust.G = 5,
  M = NULL,
  seed = NULL,
  ...
)

Arguments

se

A SpatialExperiment, SingleCellExperiment or SummarizedExperiment object on which computeBanksy has been run.

use_agf

A logical vector specifying whether to use the AGF for clustering.

lambda

A numeric vector with values in [0, 1] specifying the spatial weighting parameter. Larger values (e.g. 0.8) give more weight to the spatial neighborhood and find spatial domains, while smaller values (e.g. 0.2) perform spatial cell-typing.

use_pcs

A logical scalar specifying whether to cluster on PCs. If FALSE, runs on the BANKSY matrix.

npcs

An integer scalar specifying the number of principal components to use if use_pcs is TRUE.

dimred

A string scalar specifying the name of an existing dimensionality reduction result to use. Overrides use_pcs if supplied.

ndims

An integer scalar specifying the number of dimensions to use if dimred is supplied.

assay_name

A string scalar specifying the name of the assay used in computeBanksy.

group

A string scalar specifying a grouping variable for samples in se. This is used to scale the samples in each group separately.

algo

A string scalar specifying the clustering algorithm to use; one of leiden, louvain, mclust, kmeans.

k_neighbors

An integer vector specifying the number of neighbors used to construct the shared nearest-neighbor (sNN) graph (louvain / leiden).

resolution

A numeric vector specifying resolution used for clustering (louvain / leiden).

leiden.iter

An integer scalar specifying the number of Leiden iterations. Set to -1 to run until convergence (leiden).

kmeans.centers

An integer vector specifying the number of k-means clusters (kmeans).

mclust.G

An integer vector specifying the number of mixture components (Mclust).

M

Advanced usage. An integer vector specifying the highest azimuthal Fourier harmonic to cluster with. If specified, overrides the use_agf argument.

seed

Random seed for clustering. If not specified, no seed is set.

...

Additional arguments passed to methods.

Details

This function performs clustering on the principal components computed on the BANKSY matrix, i.e., the BANKSY embedding. The PCA corresponding to the parameters use_agf and lambda must have been computed with runBanksyPCA. Clustering may also be performed directly on the BANKSY matrix with use_pcs set to FALSE (this is not recommended).

Four clustering algorithms are implemented.

  • leiden: Leiden graph-based clustering. The arguments k_neighbors and resolution should be specified.

  • louvain: Louvain graph-based clustering. The arguments k_neighbors and resolution should be specified.

  • kmeans: k-means clustering. The argument kmeans.centers should be specified.

  • mclust: Gaussian mixture model-based clustering. The argument mclust.G should be specified.
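As a rough illustration of how the algorithm choice determines which arguments matter, the sketch below runs a graph-based method and k-means on the same embedding. It uses only arguments from the Usage section and the rings data from the Examples; the specific values (kmeans.centers = 4, seed = 1000) are arbitrary choices for illustration, not recommendations.

```r
library(Banksy)

# Prepare the BANKSY matrices and the PCA that clusterBanksy expects.
data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)

# Graph-based clustering: k_neighbors and resolution are the relevant arguments.
spe <- clusterBanksy(spe, M = 1, lambda = 0.2, algo = "leiden",
                     k_neighbors = 50, resolution = 1, seed = 1000)

# k-means clustering: kmeans.centers is the relevant argument
# (4 is an arbitrary value chosen for this sketch).
spe <- clusterBanksy(spe, M = 1, lambda = 0.2, algo = "kmeans",
                     kmeans.centers = 4, seed = 1000)

# Number of cells that were clustered (one label per column of spe).
n_cells <- ncol(spe)
```

Both calls return the updated object, so results from different algorithms can be accumulated on the same `spe`.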

By default, no seed is set for clustering. If a seed is specified, the same seed is used for every combination of input parameters.
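One way to check the effect of seed is to cluster the same embedding twice with an identical seed and compare the labels. The sketch below assumes that the label columns added to colData have names beginning with "clust_"; that prefix and the helper labels_of are assumptions made for this example, not documented behavior.

```r
library(Banksy)

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)

# Two independent clustering runs from the same PCA, with the same seed.
run_a <- clusterBanksy(spe, M = 1, lambda = 0.2, resolution = 1, seed = 1000)
run_b <- clusterBanksy(spe, M = 1, lambda = 0.2, resolution = 1, seed = 1000)

# Hypothetical helper: pull the most recently added label column,
# assuming label columns are prefixed with "clust_".
labels_of <- function(x) {
  cd <- SummarizedExperiment::colData(x)
  cols <- grep("^clust_", colnames(cd), value = TRUE)
  as.character(cd[[cols[length(cols)]]])
}

# Logical flag: do the two runs assign identical labels to every cell?
same <- identical(labels_of(run_a), labels_of(run_b))
```

Leaving seed = NULL in both calls and repeating the comparison shows whether the chosen algorithm varies between runs on a given dataset.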

Value

A SpatialExperiment / SingleCellExperiment / SummarizedExperiment object with cluster labels in colData(se).
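A minimal sketch of how the returned labels might be inspected follows. It clusters for two lambda values and tabulates cluster sizes for each label column. The "clust_" column-name prefix used to locate the labels is an assumption for this example.

```r
library(Banksy)

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1, seed = 1000)

# Cluster labels are stored as columns of colData(spe).
cd <- SummarizedExperiment::colData(spe)

# Assumption: label columns share the prefix "clust_".
label_cols <- grep("^clust_", colnames(cd), value = TRUE)

# Cluster sizes for each parameter combination.
cluster_sizes <- lapply(label_cols, function(col) table(cd[[col]]))
names(cluster_sizes) <- label_cols
```

Each element of `cluster_sizes` is a table of cell counts per cluster, which makes it easy to see how the partition changes between lambda values.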

Examples

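library(Banksy)
data(rings)
# Build the BANKSY neighborhood matrices, then compute a PCA for each lambda
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)
# Cluster with the same M and lambda values that were used for the PCA
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1)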


prabhakarlab/Banksy documentation built on July 31, 2024, 7:37 p.m.