View source: R/functions_clusteringKmeans.R
within_clust_sort | R Documentation |
Without modifying cluster assignments, modify the order of rows within each cluster based on within_order_strategy.
within_clust_sort(
clust_dt,
row_ = "id",
column_ = "x",
fill_ = "y",
facet_ = "sample",
cluster_ = "cluster_id",
within_order_strategy = c("hclust", "sort", "left", "right", "none", "reverse")[2],
clustering_col_min = -Inf,
clustering_col_max = Inf,
dcast_fill = NA
)
clust_dt |
data.table output from |
row_ |
variable name mapped to row, likely id or gene name for ngs data. Default is "id" and works with ssvFetch* output. |
column_ |
variable name mapped to column, likely bp position for ngs data. Default is "x" and works with ssvFetch* output. |
fill_ |
numeric variable to map to fill. Default is "y" and works with ssvFetch* output. |
facet_ |
variable name to facet horizontally by. Default is "sample" and works with ssvFetch* output. Set to "" if data is not faceted. |
cluster_ |
variable name to use for cluster info. Default is "cluster_id". |
within_order_strategy |
one of "hclust", "sort", "left", "right", "none", "reverse". Default is "sort". If "hclust", hierarchical clustering will be used. If "sort", a simple decreasing sort of rowSums. If "left", will attempt to put high signal on the left ("right" is the opposite). If "reverse", reverses the existing order (should only be used after a meaningful order has been imposed). |
clustering_col_min |
numeric minimum for col range considered when clustering. Default is -Inf. |
clustering_col_max |
numeric maximum for col range considered when clustering. Default is Inf. |
dcast_fill |
value to supply to dcast fill argument. default is NA. |
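As a rough illustration of how clustering_col_min, clustering_col_max and dcast_fill could interact, the base R sketch below restricts a toy long-format table to a column range and casts it to a wide matrix. The toy data, the inclusive range test, and the use of tapply in place of dcast are all assumptions made for illustration, not the package's implementation.

```r
# Toy long-format profile: row "b" has no value at x = 100.
toy <- data.frame(
  id = c("a", "a", "a", "b", "b"),
  x  = c(-100, 0, 100, -100, 0),
  y  = c(1, 5, 2, 3, 4)
)

# Hypothetical range limits, standing in for clustering_col_min/max.
col_min <- -50
col_max <- 100

# Keep only columns inside the range (inclusive bounds are an assumption).
in_range <- toy[toy$x >= col_min & toy$x <= col_max, ]

# Cast to a wide id-by-x matrix; missing id/x combinations become NA.
wide_na <- tapply(in_range$y, list(in_range$id, in_range$x), sum)

# Mimic a dcast fill of 0 by replacing the missing cells.
wide_filled <- wide_na
wide_filled[is.na(wide_filled)] <- 0

rowSums(wide_na)      # row "b" is NA because of the missing cell
rowSums(wide_filled)  # row "b" now has a usable total
```

With the default fill of NA, any row with missing positions would get an NA row total in a sketch like this, so a numeric fill may be preferable when profiles are incomplete.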
This is particularly useful when you want to sort within each cluster by a different variable from the one used for cluster assignment. It is also useful if you've imported cluster assignments but want to re-sort within each cluster using the new data for a prettier heatmap.
TODO refactor shared code with clusteringKmeansNestedHclust
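The idea behind the "sort" strategy can be sketched in base R as follows. This is a minimal illustration on made-up data, assuming that sorting means ordering rows by decreasing total signal inside each cluster and then re-leveling the row variable. It is not the package's own code, and the toy table and variable names are hypothetical.

```r
# Toy long-format data: 4 rows (ids) in 2 clusters, 3 columns each.
toy <- data.frame(
  id = rep(c("a", "b", "c", "d"), each = 3),
  x = rep(1:3, times = 4),
  y = c(1, 2, 1,  5, 6, 5,  9, 9, 9,  2, 2, 2),
  cluster_id = rep(c(1, 1, 2, 2), each = 3),
  stringsAsFactors = FALSE
)

# Total signal per row, analogous to rowSums on a wide matrix.
row_totals <- aggregate(y ~ id + cluster_id, data = toy, FUN = sum)

# Keep clusters together, then sort by decreasing total within each cluster.
row_totals <- row_totals[order(row_totals$cluster_id, -row_totals$y), ]

# Cluster membership is untouched; only the factor levels of id change.
toy$id <- factor(toy$id, levels = as.character(row_totals$id))
levels(toy$id)
```

Only the level order of id changes here, which mirrors the documented behavior of leaving cluster assignments untouched.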
data.table matching input clust_dt save for the reassignment of levels of row_ variable.
library(seqsetvis)
library(data.table)
library(ggplot2)
data(CTCF_in_10a_profiles_dt)
# Clustering by relative value per region does a good job of highlighting changes,
# but when raw values are then plotted the order within clusters is not smooth.
# This is a good situation to apply a separate sort within clusters.
prof_dt = CTCF_in_10a_profiles_dt
prof_dt = append_ynorm(prof_dt)
prof_dt[, y_relative := y_norm / max(y_norm), list(id)]
clust_dt = ssvSignalClustering(prof_dt, fill_ = "y_relative")
# The default fill_ = "y" re-sorts rows within clusters by raw signal.
clust_dt.sort = within_clust_sort(clust_dt)
cowplot::plot_grid(
  ssvSignalHeatmap(clust_dt) +
    labs(title = "clustered by relative, sorted by relative"),
  ssvSignalHeatmap(clust_dt.sort) +
    labs(title = "clustered by relative, sorted by raw value")
)