get_signatures-ConsensusPartition-method: Get signature rows


Description

Get signature rows

Usage

## S4 method for signature 'ConsensusPartition'
get_signatures(object, k,
    silhouette_cutoff = 0.5,
    fdr_cutoff = cola_opt$fdr_cutoff,
    top_signatures = NULL,
    group_diff = cola_opt$group_diff,
    scale_rows = object@scale_rows,
    row_km = NULL,
    diff_method = c("Ftest", "ttest", "samr", "pamr", "one_vs_others"),
    anno = get_anno(object),
    anno_col = get_anno_col(object),
    internal = FALSE,
    show_row_dend = FALSE,
    show_column_names = FALSE, use_raster = TRUE,
    plot = TRUE, verbose = TRUE, seed = 888,
    left_annotation = NULL, right_annotation = NULL,
    col = if(scale_rows) c("green", "white", "red") else c("blue", "white", "red"),
    simplify = FALSE, prefix = "", enforce = FALSE,
    ...)

Arguments

object

A ConsensusPartition-class object.

k

Number of subgroups.

silhouette_cutoff

Cutoff for silhouette scores. Samples with silhouette scores below this cutoff are not used for finding signature rows. For guidance on selecting a proper silhouette cutoff, please refer to https://www.stat.berkeley.edu/~s133/Cluster2a.html#tth_tAb1.

fdr_cutoff

Cutoff for FDR of the difference test between subgroups.

top_signatures

Number of top signatures to keep, ranked by the most significant FDR. Note that since the FDR might be the same for multiple rows, the final number of signatures might not be exactly the same as the value that has been set.

group_diff

Cutoff for the maximal difference between group means.

scale_rows

Whether to apply row scaling when making the heatmap.

row_km

Number of groups for performing k-means clustering on rows. By default it is automatically selected.

diff_method

Method used to find rows which are significantly different between subgroups; see the 'Details' section.

anno

A data frame of annotations for the original matrix columns. By default it uses the annotations specified in consensus_partition or run_all_consensus_partition_methods.

anno_col

A list of colors (color is defined as a named vector) for the annotations. If anno is a data frame, anno_col should be a named list where names correspond to the column names in anno.

internal

Used internally.

show_row_dend

Whether to show the row dendrogram.

show_column_names

Whether to show column names in the heatmap.

use_raster

Internally used.

plot

Whether to make the plot.

verbose

Whether to print messages.

seed

Random seed.

left_annotation

Annotation put on the left of the heatmap. It should be a HeatmapAnnotation-class object. The number of items should be the same as the number of rows in the original matrix. Subsetting to the significant rows is performed automatically on the annotation object.

right_annotation

Annotation put on the right of the heatmap. Same format as left_annotation.

col

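Colors for the heatmap. By default it is c("green", "white", "red") if scale_rows is TRUE and c("blue", "white", "red") otherwise.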

simplify

Only used internally.

prefix

Only used internally.

enforce

The analysis is cached by default, so that results for the same input are retrieved automatically without rerunning the analysis. Set enforce to TRUE to force the function to re-perform the analysis.

...

Other arguments.

Details

The function applies a statistical test for the difference between subgroups to every row. The following methods are available for testing the significance of the difference:

ttest

First it looks for the subgroup with the highest mean value, compares it to each of the other subgroups with a t-test and takes the maximum p-value. Second it looks for the subgroup with the lowest mean value, compares it to each of the other subgroups again with a t-test and takes the maximum p-value. The smaller of these two p-values is taken as the final p-value.

samr/pamr

Use the SAM (from the samr package) or PAM (from the pamr package) method to find significantly different rows between subgroups.

Ftest

Use an F-test to find significantly different rows between subgroups.

one_vs_others

For each row and each subgroup i, it uses a t-test to compare the samples in subgroup i to all other samples; the resulting p-value is denoted as p_i. The p-value for the current row is selected as min(p_i).
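
The per-row logic of ttest, Ftest and one_vs_others described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper names (row_p_ttest, row_p_ftest, row_p_one_vs_others) are hypothetical and not part of cola, the choice of t.test and oneway.test is an assumption about how the tests could be run, and the package's actual implementation may differ. The helpers return raw p-values for a single row; a multiple-testing adjustment across rows (for example with p.adjust) would still be needed to obtain FDR values.

```r
# Illustrative helpers only; these names are hypothetical and not part of cola.

# ttest: compare the subgroup with the highest (then lowest) mean to each of
# the other subgroups, keep the maximum p-value of each set of comparisons,
# and return the smaller of the two maxima.
row_p_ttest = function(x, subgroup) {
    subgroup = as.character(subgroup)
    group_means = tapply(x, subgroup, mean)
    max_p_against_others = function(ref) {
        others = setdiff(names(group_means), ref)
        max(sapply(others, function(g) {
            t.test(x[subgroup == ref], x[subgroup == g])$p.value
        }))
    }
    p_high = max_p_against_others(names(group_means)[which.max(group_means)])
    p_low = max_p_against_others(names(group_means)[which.min(group_means)])
    min(p_high, p_low)
}

# Ftest: one-way ANOVA F-test across all subgroups (assumes equal variances).
row_p_ftest = function(x, subgroup) {
    oneway.test(x ~ factor(subgroup), var.equal = TRUE)$p.value
}

# one_vs_others: t-test of each subgroup against all other samples, then min(p_i).
row_p_one_vs_others = function(x, subgroup) {
    subgroup = as.character(subgroup)
    p_i = sapply(unique(subgroup), function(g) {
        t.test(x[subgroup == g], x[subgroup != g])$p.value
    })
    min(p_i)
}

# Simulated row: subgroup "1" is shifted upwards, subgroups "2" and "3" are not.
set.seed(123)
subgroup = rep(c("1", "2", "3"), each = 10)
x = c(rnorm(10, mean = 5), rnorm(20, mean = 0))

p_values = c(
    ttest = row_p_ttest(x, subgroup),
    Ftest = row_p_ftest(x, subgroup),
    one_vs_others = row_p_one_vs_others(x, subgroup)
)
print(p_values)
```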

diff_method can also be a self-defined function. The function needs two arguments, which are the matrix for the analysis and the predicted classes. The function should return a vector of FDR values from the difference test, one per row.
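
A minimal sketch of such a self-defined function is shown here, assuming a Kruskal-Wallis test per row followed by a Benjamini-Hochberg adjustment. The name kruskal_fdr and the simulated matrix are hypothetical and only serve to illustrate the two-argument interface (matrix, predicted classes) and the returned FDR vector.

```r
# Hypothetical user-defined difference test, not part of cola:
# Kruskal-Wallis test on every row, p-values adjusted to FDR with BH.
kruskal_fdr = function(mat, subgroup) {
    subgroup = factor(subgroup)
    p = apply(mat, 1, function(x) kruskal.test(x, g = subgroup)$p.value)
    p.adjust(p, method = "BH")
}

# Simulated data: 50 rows, 18 samples in three subgroups;
# the first five rows are shifted upwards in subgroup 1.
set.seed(123)
subgroup = rep(c(1, 2, 3), each = 6)
mat = matrix(rnorm(50 * 18), nrow = 50)
mat[1:5, subgroup == 1] = mat[1:5, subgroup == 1] + 4

fdr = kruskal_fdr(mat, subgroup)
head(fdr)

# With cola loaded and `res` defined as in the Examples section,
# the function could then be passed as:
# tb = get_signatures(res, k = 3, diff_method = kruskal_fdr)
```

The function is tried on a small simulated matrix first so that the length and range of the returned vector can be checked before it is passed to get_signatures.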

Value

A data frame with the following columns:

which_row:

row index corresponding to the original matrix.

fdr:

the FDR.

km:

the k-means groups if row_km is set.

other_columns:

the mean value in each subgroup (computed on scaled or unscaled rows, depending on scale_rows).

Author(s)

Zuguang Gu <z.gu@dkfz.de>

Examples

data(golub_cola)
res = golub_cola["ATC", "skmeans"]
tb = get_signatures(res, k = 3)
head(tb)
get_signatures(res, k = 3, top_signatures = 100)

cola documentation built on Nov. 8, 2020, 8:12 p.m.