Nothing
#' Compute coassignment probabilities
#'
#' Compute coassignment probabilities for each label in a reference grouping when compared to an alternative grouping of samples.
#' This is now deprecated for \code{\link{pairwiseRand}}.
#'
#' @param ref A character vector or factor containing one set of groupings, considered to be the reference.
#' @param alt A character vector or factor containing another set of groupings, to be compared to \code{alt}.
#' @param summarize Logical scalar indicating whether the output matrix should be converted into a per-label summary.
#'
#' @return
#' If \code{summarize=FALSE}, a numeric matrix is returned with upper triangular entries filled with the coassignment probabilities for each pair of labels in \code{ref}.
#'
#' Otherwise, a \linkS4class{DataFrame} is returned with one row per label in \code{ref} containing the \code{self} and \code{other} coassignment probabilities.
#'
#' @details
#' The coassignment probability for each pair of labels in \code{ref} is the probability that a randomly chosen cell from each of the two reference labels will have the same label in \code{alt}.
#' High coassignment probabilities indicate that a particular pair of labels in \code{ref} are frequently assigned to the same label in \code{alt}, which has some implications for cluster stability.
#'
#' When \code{summarize=TRUE}, we summarize the matrix of coassignment probabilities into a set of per-label values.
#' The \dQuote{self} coassignment probability is simply the diagonal entry of the matrix, i.e., the probability that two cells from the same label in \code{ref} also have the same label in \code{alt}.
#' The \dQuote{other} coassignment probability is the maximum probability across all pairs involving that label.
#'
#' % One might consider instead reporting the 'other' probability as the probability that a randomly chosen cell in the cluster and a randomly chosen cell in any other cluster belong in the same cluster.
#' % However, this results in very small probabilities in all cases, simply because most of the other clusters are well seperated.
#' % Reporting the maximum is more useful as at least you can tell that a cluster is well-separated from _all_ other clusters if it has a low 'other' probability.
#'
#' In general, \code{ref} is well-recapitulated by \code{alt} if the diagonal entries of the matrix is much higher than the sum of the off-diagonal entries.
#' This manifests as higher values for the self probabilities compared to the other probabilities.
#'
#' Note that the coassignment probability is closely related to the Rand index-based ratios
#' broken down by cluster pair in \code{\link{pairwiseRand}} with \code{mode="ratio"} and \code{adjusted=FALSE}.
#' The off-diagonal coassignment probabilities are simply 1 minus the off-diagonal ratio,
#' while the on-diagonal values differ only by the lack of consideration of pairs of the same cell in \code{\link{pairwiseRand}}.
#'
#' @author Aaron Lun
#'
#' @seealso
#' \code{\link{bootstrapCluster}}, to compute coassignment probabilities across bootstrap replicates.
#'
#' \code{\link{pairwiseRand}}, for another way to compare different clusterings.
#'
#' @examples
#' library(scuttle)
#' sce <- mockSCE(ncells=200)
#' sce <- logNormCounts(sce)
#'
#' clust1 <- kmeans(t(logcounts(sce)),3)$cluster
#' clust2 <- kmeans(t(logcounts(sce)),5)$cluster
#'
#' coassignProb(clust1, clust2)
#' coassignProb(clust1, clust2, summarize=TRUE)
#'
#' @export
coassignProb <- function(ref, alt, summarize=FALSE) {
.Deprecated(new="bluster::pairwiseRand")
tab <- table(ref, alt)
nref <- nrow(tab)
output <- matrix(NA_real_, nref, nref,
dimnames=list(rownames(tab), rownames(tab)))
for (j1 in seq_len(nref)) {
spread1 <- tab[j1,]
spread1 <- spread1/sum(spread1)
for (j2 in seq_len(j1)) {
spread2 <- tab[j2,]
spread2 <- spread2/sum(spread2)
output[j2,j1] <- sum(spread1 * spread2)
}
}
if (!summarize) {
output
} else {
.summarize_coassign(output)
}
}
#' @importFrom Matrix forceSymmetric rowSums diag diag<-
#' @importFrom S4Vectors DataFrame
#' @importFrom DelayedArray rowMaxs
.summarize_coassign <- function(mat) {
flipped <- forceSymmetric(mat)
self <- diag(flipped)
diag(flipped) <- 0
# Define against NA coassignments when empty levels are involved.
DataFrame(self=self, other=rowMaxs(as.matrix(flipped), na.rm=TRUE),
row.names=rownames(flipped))
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.