plotClusters: Visualize cluster assignments across multiple clusterings

plotClustersR Documentation

Visualize cluster assignments across multiple clusterings

Description

Align multiple clusterings of the same set of samples and provide a color-coded plot of their shared cluster assignments

Usage

## S4 method for signature 'ClusterExperiment'
plotClusters(
  object,
  whichClusters,
  existingColors = c("ignore", "all", "firstOnly"),
  resetNames = FALSE,
  resetColors = FALSE,
  resetOrderSamples = FALSE,
  colData = NULL,
  clusterLabels = NULL,
  ...
)

## S4 method for signature 'matrix'
plotClusters(
  object,
  orderSamples = NULL,
  colData = NULL,
  reuseColors = FALSE,
  matchToTop = FALSE,
  plot = TRUE,
  unassignedColor = "white",
  missingColor = "grey",
  minRequireColor = 0.3,
  startNewColors = FALSE,
  colPalette = massivePalette,
  input = c("clusters", "colors"),
  clusterLabels = colnames(object),
  add = FALSE,
  xCoord = NULL,
  ylim = NULL,
  tick = FALSE,
  ylab = "",
  xlab = "",
  axisLine = 0,
  box = FALSE,
  ...
)

Arguments

object

A matrix of with each column corresponding to a clustering and each row a sample or a ClusterExperiment object. If a matrix, the function will plot the clusterings in order of this matrix, and their order influences the plot greatly.

whichClusters

argument that can be either numeric or character vector indicating the clusterings to be used. See details of getClusterIndex.

existingColors

how to make use of the exiting colors in the ClusterExperiment object. 'ignore' will ignore them and assign new colors. 'firstOnly' will use the existing colors of only the 1st clustering, and then align the remaining clusters and give new colors for the remaining only. 'all' will use all of the existing colors.

resetNames

logical. Whether to reset the names of the clusters in clusterLegend to be the aligned integer-valued ids from plotClusters.

resetColors

logical. Whether to reset the colors in clusterLegend in the ClusterExperiment returned to be the colors from the plotClusters.

resetOrderSamples

logical. Whether to replace the existing orderSamples slot in the ClusterExperiment object with the new order found.

colData

If clusters is a matrix, colData gives a matrix of additional cluster/sample data on the samples to be plotted with the clusterings given in clusters. Values in colData will be added to the end (bottom) of the plot. NAs in the colData matrix will trigger an error. If clusters is a ClusterExperiment object, the input in colData refers to columns of the the colData slot of the ClusterExperiment object to be plotted with the clusters. In this case, colData can be TRUE (i.e. all columns will be plotted), or an index or a character vector that references a column or column name, respectively, of the colData slot of the ClusterExperiment object. If there are NAs in the colData columns, they will be encoded as 'unassigned' and receive the same color as 'unassigned' samples in the clustering.

clusterLabels

names to go with the columns (clusterings) in matrix colorMat. If colData argument is not NULL, the clusterLabels argument must include names for the sample data too. If the user gives only names for the clusterings, the code will try to anticipate that and use the column names of the sample data, but this is fragile. If set to FALSE, then no labels will be plotted.

...

for plotClusters arguments passed either to the method of plotClusters for matrices, or ultimately to plot (if add=FALSE).

orderSamples

A predefined order in which the samples will be plotted. Otherwise the order will be found internally by aligning the clusters (assuming input="clusters")

reuseColors

Logical. Whether each row should consist of the same set of colors. By default (FALSE) each cluster that the algorithm doesn't identify to the previous rows clusters gets a new color.

matchToTop

Logical as to whether all clusters should be aligned to the first row. By default (FALSE) each cluster is aligned to the ordered clusters of the row above it.

plot

Logical as to whether a plot should be produced.

unassignedColor

If “-1” in clusters, will be given this color (meant for samples not assigned to cluster).

missingColor

If “-2” in clusters, will be given this color (meant for samples that were missing from the clustering, mainly when comparing clusterings run on different sets of samples)

minRequireColor

In aligning colors between rows of clusters, require this percent overlap.

startNewColors

logical, indicating whether in aligning colors between rows of clusters, should the colors restart at beginning of colPalette as long as colors are not in immediately proceeding row (the colors at the end of massivePalette are all of colors() and many will be indistinguishable, so this option can be useful if you have a large cluster matrix).

colPalette

a vector of colors used for the different clusters. Must be as long as the maximum number of clusters found in any single clustering/column given in clusters or will otherwise return an error.

input

indicate whether the input matrix is matrix of integer assigned clusters, or contains the colors. If input="colors", then the object clusters is a matrix of colors and there is no alignment (this option allows the user to manually adjust the colors and replot, for example).

add

whether to add to existing plot.

xCoord

values on x-axis at which to plot the rows (samples).

ylim

vector of limits of y-axis.

tick

logical, whether to draw ticks on x-axis for each sample.

ylab

character string for the label of y-axis.

xlab

character string for the label of x-axis.

axisLine

the number of lines in the axis labels on y-axis should be (passed to line = ... in the axis call).

box

logical, whether to draw box around the plot.

Details

All arguments of the matrix version can be passed to the ClusterExperiment version. As noted above, however, some arguments have different interpretations.

If whichClusters = "workflow", then the workflow clusterings will be plotted in the following order: final, mergeClusters, makeConsensus, clusterMany.

Value

If clusters is a ClusterExperiment Object, then plotClusters returns an updated ClusterExperiment object, where the clusterLegend and/or orderSamples slots have been updated (depending on the arguments).

If clusters is a matrix, plotClusters returns (invisibly) the orders and other things that go into making the matrix. Specifically, a list with the following elements.

  • orderSamples a vector of length equal to nrows(clusters) giving the order of the samples (rows) to use to get the original clusters matrix into the order made by plotClusters.

  • colors matrix of color assignments for each element of original clusters matrix. Matrix is in the same order as original clusters matrix. The matrix colors[orderSamples,] is the matrix that can be given back to plotClusters to recreate the plot (see examples).

  • alignedClusterIds a matrix of integer valued cluster assignments that match the colors. This is useful if you want to have cluster identification numbers that are better aligned than that given in the original clusters. Again, the rows/samples are in same order as original input matrix.

  • clusterLegend list of length equal to the number of columns of input matrix. The elements of the list are matrices, each with three columns named "Original","Aligned", and "Color" giving, respectively, the correspondence between the original cluster ids in clusters, the aligned cluster ids in aligned, and the color.

  • origClustersThe original matrix of clusters given to plotClusters

Author(s)

Elizabeth Purdom and Marla Johnson (based on the tracking plot in ConsensusClusterPlus by Matt Wilkerson and Peter Waltman).

References

Wilkerson, D. M, Hayes and Neil D (2010). "ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking." Bioinformatics, 26(12), pp. 1572-1573.

See Also

The ConsensusClusterPlus package.

Examples

## Not run: 
#clustering using pam: try using different dimensions of pca and different k
data(simData)

cl <- clusterMany(simData, nReducedDims=c(5, 10, 50), reduceMethod="PCA",
clusterFunction="pam", ks=2:4, findBestK=c(TRUE,FALSE),
removeSil=c(TRUE,FALSE), makeMissingDiss=TRUE)

clusterLabels(cl)

#make names shorter for better plotting
x <- clusterLabels(cl)
x <- gsub("TRUE", "T", x)
x <- gsub("FALSE", "F", x)
x <- gsub("k=NA,", "", x)
x <- gsub("Features", "", x)
clusterLabels(cl) <- x

par(mar=c(2,10,1,1))
#this will make the choices of plotClusters
cl <- plotClusters(cl, axisLine=-1, resetOrderSamples=TRUE, resetColors=TRUE)

#see the new cluster colors
clusterLegend(cl)[1:2]

#We can also change the order of the clusterings. Notice how this
#dramatically changes the plot!
clOrder <- c(3:6, 1:2, 7:ncol(clusterMatrix(cl)))
cl <- plotClusters(cl, whichClusters=clOrder, resetColors=TRUE,
resetOrder=TRUE, axisLine=-2)

#We can manually switch the red ("#E31A1C") and green ("#33A02C") in the
#first cluster:

#see what the default colors are and their names
showPalette(wh=1:5)

#change "#E31A1C" to "#33A02C"
newColorMat <- clusterLegend(cl)[[clOrder[1]]]
newColorMat[2:3, "color"] <- c("#33A02C", "#E31A1C")
clusterLegend(cl)[[clOrder[1]]]<-newColorMat

#replot by setting 'input="colors"'
par(mfrow=c(1,2))
plotClusters(cl, whichClusters=clOrder, orderSamples=orderSamples(cl),
existingColors="all")
plotClusters(cl, whichClusters=clOrder, resetColors=TRUE, resetOrder=TRUE,
axisLine=-2)
par(mfrow=c(1,1))

#set some of clusterings arbitrarily to "-1", meaning not clustered (white),
#and "-2" (another possible designation getting gray, usually for samples not
#included in original clustering)
clMatNew <- apply(clusterMatrix(cl), 2, function(x) {
wh <- sample(1:nSamples(cl), size=10)
x[wh]<- -1
wh <- sample(1:nSamples(cl), size=10)
x[wh]<- -2
return(x)
})

#make a new object
cl2 <- ClusterExperiment(assay(cl), clMatNew,
transformation=transformation(cl))
plotClusters(cl2)

## End(Not run)

epurdom/clusterExperiment documentation built on April 28, 2024, 8:17 p.m.