mapCellsToEdges: Map cells to edges

mapCellsToEdgesR Documentation

Map cells to edges

Description

Map each cell to the closest edge on the MST, reporting also the distance to the corresponding vertices.

Usage

mapCellsToEdges(x, ...)

## S4 method for signature 'ANY'
mapCellsToEdges(x, mst, clusters, columns = NULL)

## S4 method for signature 'SummarizedExperiment'
mapCellsToEdges(x, ..., assay.type = "logcounts")

## S4 method for signature 'SingleCellExperiment'
mapCellsToEdges(
  x,
  clusters = colLabels(x, onAbsence = "error"),
  ...,
  use.dimred = NULL
)

Arguments

x

A numeric matrix of coordinates where each row represents a cell/sample and each column represents a dimension (usually a PC or another low-dimensional embedding, but features or genes can also be used).

Alternatively, a SummarizedExperiment or SingleCellExperiment object containing such a matrix in its assays, as specified by assay.type. This will be transposed prior to use.

Alternatively, for SingleCellExperiments, this matrix may be extracted from its reducedDims, based on the use.dimred specification. In this case, no transposition is performed.

...

For the generic, further arguments to pass to the specific methods.

For the SummarizedExperiment method, further arguments to pass to the ANY method.

For the SingleCellExperiment method, further arguments to pass to the SummarizedExperiment method (if use.dimred is specified) or the ANY method (otherwise).

mst

A graph object containing a MST, constructed from the same coordinate space as the values in x (e.g., same PC space, same set of features).

clusters

A factor-like object of the same length as nrow(x), specifying the cluster identity for each cell in x. This can also be NULL, see details.

columns

A character, logical or integer vector specifying the columns of x to use. If NULL, all provided columns are used by default.

assay.type

An integer or string specifying the assay to use from a SummarizedExperiment x.

use.dimred

An integer or string specifying the reduced dimensions to use from a SingleCellExperiment x.

Details

For each cluster, we consider all edges of the MST involving that cluster. Each cell of that cluster is then mapped to the closest of these edges (where proximity is defined by Euclidean distance). The identity of and distance from each ends of the edge is reported; this can be useful for downstream pseudo-time calculations or to subset cells by those lying on a particular edge.

If clusters=NULL, each cell can be mapped to any edge of the MST. This is useful if the mst was constructed from a different set of cells than those in x, allowing us to effectively project new datasets onto an existing MST. Note, however, that the new x must lie in the same coordinate space as the x used to make mst.

Some cells may simply be mapped to the edge endpoints. This manifests as values of zero for the distances from either end of the edge. For analyses focusing on a specific edge, it may be advisable to filter out such cells as their edge assignments are arbitrarily assigned and they do not contribute to any transitional process along the edge.

Value

A DataFrame with one row per row of x, containing the fields:

  • left.cluster, the cluster on one end of the edge to which the cell was assigned.

  • right.cluster, the cluster on the other end of the edge to which the cell was assigned.

  • left.distance, the distance to the cluster centroid on one end.

  • right.distance, the distance to the cluster centroid on the other end.

Note that the sum of the distances will always yield the edge length.

Author(s)

Aaron Lun

References

Ji Z and Ji H (2016). TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis. Nucleic Acids Res. 44, e117

See Also

createClusterMST, to generate mst.

quickPseudotime, a wrapper to quickly perform these calculations.

Examples

# Mocking up a Y-shaped trajectory.
centers <- rbind(c(0,0), c(0, -1), c(1, 1), c(-1, 1))
rownames(centers) <- seq_len(nrow(centers))
clusters <- sample(nrow(centers), 1000, replace=TRUE)
cells <- centers[clusters,]
cells <- cells + rnorm(length(cells), sd=0.5)

# Creating the MST first:
mst <- createClusterMST(cells, clusters=clusters)
plot(mst)

# Mapping cells to the MST:
mapping <- mapCellsToEdges(cells, mst, clusters=clusters)
head(mapping)

# Also works with some random extra cells:
extras <- matrix(rnorm(1000), ncol=2)
emapping <- mapCellsToEdges(extras, mst, clusters=NULL)
head(emapping)


LTLA/TSCAN documentation built on Aug. 16, 2024, 12:40 p.m.