callSummary | R Documentation |
One of two main functions in the chromswitch package, this function
detects a switch in chromatin state in one or
more regions given ChIP-seq peak calls for one mark, executing the entire
algorithm from preprocessing to evaluating the clustering results,
using the summary strategy.
callSummary(query, metadata, peaks, mark, filter = FALSE,
filter_columns = summarize_columns, filter_thresholds = NULL,
summarize_columns = NULL, normalize_columns = summarize_columns,
tail = 0.005, normalize = ifelse(is.null(normalize_columns) &&
is.null(summarize_columns), FALSE, TRUE), fraction = TRUE, n = FALSE,
heatmap = FALSE, titles = NULL, outdir = NULL,
optimal_clusters = TRUE, estimate_state = FALSE, signal_col = NULL,
test_condition = NULL, BPPARAM = bpparam())
query |
GRanges list containing one or more genomic regions of interest
in which to call a switch. The output dataframe will contain one row per
region in |
metadata |
A dataframe with at least two columns: "Sample", which stores the sample IDs, and "Condition", which stores the biological condition labels of the samples |
peaks |
List of GRanges objects storing peak calls for each sample, where element names correspond to sample IDs |
mark |
Character specifying the histone mark or ChIP-target, for example, "H3K4me3" |
filter |
(Optional) logical value, filter peaks based on thresholds on
peak statistics? Default: FALSE. The filter step is described in
|
filter_columns |
If |
filter_thresholds |
If |
summarize_columns |
Character vector of column names on which to compute summary statistics during feature matrix construction. These statistics become the features of the matrix. |
normalize_columns |
If |
tail |
(Optional) if |
normalize |
(Optional) logical value, normalize peak statistics
genome-wide for each sample? Default: TRUE if |
fraction |
(Optional) Logical value, during feature matrix construction, compute the fraction of the region overlapped by peaks? Default: TRUE |
n |
(Optional) Logical value, during feature matrix construction, compute the number of peaks in the region? Default: FALSE |
heatmap |
(Optional) Logical value, plot the heatmap corresponding to the hierarchical clustering result? Default: FALSE |
titles |
(Optional) if |
outdir |
(Optional) if |
optimal_clusters |
(Optional) Logical value indicating whether to cluster samples into two groups, or to find the optimal clustering solution by choosing the set of clusters which maximizes the average silhouette width. Default: TRUE |
estimate_state |
(Optional) Logical value indicating whether to include a column "state" in the output specifying the estimated chromatin state of a test condition. The state will be one of "ON", "OFF", or NA, where the latter results if a binary switch between the conditions is unclear. Default: FALSE. |
signal_col |
(Optional) If |
test_condition |
(Optional) If |
BPPARAM |
(Optional) instance of |
This strategy constructs a sample-by-feature matrix to use as input for hierarchical clustering by computing, for each sample, a vector of summary statistics based on that sample's peaks in the query region. The summary statistics are generally based on the enrichment statistics associated with each peak as returned by the peak calling tool, which might include, for example, a p-value and fold change.
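The idea of the summary strategy can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy peak statistics, the helper `summarize_sample`, the use of the mean as the summary statistic, the zero-filling of samples without peaks, and the complete-linkage clustering on unscaled features are all assumptions made for illustration only.

```r
# Toy per-sample peak statistics for ONE query region (made-up numbers).
# Each data frame has one row per peak overlapping the region.
toy_peaks <- list(
  S1 = data.frame(signalValue = c(8.0, 9.5), pValue = c(30, 42), width = c(900, 1100)),
  S2 = data.frame(signalValue = 7.5,         pValue = 28,        width = 1800),
  S3 = data.frame(signalValue = 1.2,         pValue = 3,         width = 200),
  S4 = data.frame(signalValue = numeric(0),  pValue = numeric(0), width = numeric(0))
)
region_width <- 5000  # hypothetical width of the query region, in bp

# Hypothetical helper: collapse one sample's peaks into a feature vector.
# Assumption: a sample with no peaks in the region gets all-zero features.
summarize_sample <- function(p) {
  if (nrow(p) == 0) {
    return(c(signalValue = 0, pValue = 0, fraction = 0))
  }
  c(signalValue = mean(p$signalValue),
    pValue      = mean(p$pValue),
    fraction    = sum(p$width) / region_width)  # share of region covered by peaks
}

# Sample-by-feature matrix: one row per sample, one column per summary feature.
features <- t(vapply(toy_peaks, summarize_sample, numeric(3)))

# Hierarchical clustering of the samples, then a cut into two groups.
tree     <- hclust(dist(features), method = "complete")
clusters <- cutree(tree, k = 2)
print(features)
print(clusters)
```

In this toy input the two samples with strong peaks (S1, S2) fall in one cluster and the weak or empty samples (S3, S4) in the other; comparing such a partition against the known condition labels is what indicates a switch.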
Data frame with one row per region in query. Contains the
coordinates of the region, the number of inferred clusters, the computed
cluster validity statistics, and the cluster assignment for each sample.
library(chromswitch)
library(GenomicRanges)

# Sample IDs and the paths to their example peak files
samples <- c("E068", "E071", "E074", "E101", "E102", "E110")
bedfiles <- system.file("extdata", paste0(samples, ".H3K4me3.bed"),
                        package = "chromswitch")
Conditions <- c(rep("Brain", 3), rep("Other", 3))

# Metadata must contain at least the columns "Sample" and "Condition"
metadata <- data.frame(Sample = samples,
                       H3K4me3 = bedfiles,
                       Condition = Conditions,
                       stringsAsFactors = FALSE)

# Two query regions on chr19
regions <- GRanges(seqnames = c("chr19", "chr19"),
                   ranges = IRanges(start = c(54924104, 54874318),
                                    end = c(54929104, 54877536)))

# H3K4me3 is not defined above; it is assumed here to be the example
# peak dataset shipped with chromswitch (a named list of GRanges)
callSummary(query = regions,
            metadata = metadata,
            peaks = H3K4me3,
            normalize_columns = c("qValue", "pValue", "signalValue"),
            mark = "H3K4me3",
            summarize_columns = c("pValue", "qValue", "signalValue"),
            heatmap = FALSE,
            BPPARAM = BiocParallel::SerialParam())