findBumps | R Documentation |
This function constructs transcriptome m6A bumps for each input and IP replicate by merging bins that show significant enrichment of IP reads over input control reads.
findBumps(chr, pos, strand, x, count, use = "pval", pval.cutoff, fdr.cutoff, lfc.cutoff, sep = 2000, minlen = 100, minCount = 3, dis.merge = 100, scorefun = mean, sort = TRUE)
chr |
Chromosome number of all bins. |
pos |
Transcriptome start position of all bins. |
strand |
Strand of all bins. |
x |
A dataframe containing the p-values, fdrs and log fold changes of all bins. |
count |
Read counts in each bin from paired input and IP sample. |
use |
A character string specifying the criterion used to select significant bins. It takes one of "pval", "fdr", "lfc", "pval_lfc" and "fdr_lfc". "pval": the selection is based only on p-values; "fdr": the selection is based only on FDR; "lfc": the selection is based only on log fold changes between normalized IP and normalized input read counts; "pval_lfc": the selection is based on both p-values and log fold changes; "fdr_lfc": the selection is based on both FDR and log fold changes. Default is "pval". |
pval.cutoff |
A numerical value to specify a cutoff for p-value. Default is 1e-5. |
fdr.cutoff |
A numerical value to specify a cutoff for FDR. Default is 0.05. |
lfc.cutoff |
A numerical value to specify a cutoff for log fold change between normalized IP and input read counts. Default is 0.7, which corresponds to a fold change of about 2 on the natural-log scale. |
sep |
A constant used to divide the genome into consecutive sequenced regions. Any two bins separated by a distance greater than sep are grouped into different regions. Default is 2000. |
minlen |
A constant used to select bumps with a length of at least minlen. Default is 100. |
minCount |
A constant used to select bumps that span at least minCount bins. Default is 3. |
dis.merge |
A constant. Any two bumps separated by a distance smaller than dis.merge are merged. Default is 100. |
scorefun |
A function used to assign a score to each bump based on the p-values of all bins it spans. Default is mean, meaning that the score is the average of the bin-level p-values. |
sort |
A logical value indicating whether to rank bumps by the score returned from scorefun (TRUE) or not (FALSE). Default is TRUE. |
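The roles of use, the three cutoffs and sep can be illustrated with a minimal sketch. It is not the TRESS implementation: the helpers select_bins and split_regions are hypothetical, the column names pvals, fdr and lfc are taken from the example on this page, and the comparison directions (p-value and FDR below their cutoffs, log fold change above its cutoff) are assumptions. The toy values are made up.

```r
# Hypothetical helper (not part of TRESS): flag bins that pass the chosen criterion.
# Assumes x has columns pvals, fdr and lfc, and that small p-values/FDRs and
# large log fold changes indicate enrichment.
select_bins <- function(x, use = "pval", pval.cutoff = 1e-5,
                        fdr.cutoff = 0.05, lfc.cutoff = 0.7) {
  switch(use,
    pval     = x$pvals < pval.cutoff,
    fdr      = x$fdr < fdr.cutoff,
    lfc      = x$lfc > lfc.cutoff,
    pval_lfc = x$pvals < pval.cutoff & x$lfc > lfc.cutoff,
    fdr_lfc  = x$fdr < fdr.cutoff & x$lfc > lfc.cutoff,
    stop("unknown value for 'use': ", use)
  )
}

# Hypothetical helper: give each bin a region id, starting a new region whenever
# the gap to the previous bin exceeds sep. Positions are assumed to be sorted
# and to lie on a single chromosome and strand.
split_regions <- function(pos, sep = 2000) {
  cumsum(c(TRUE, diff(pos) > sep))
}

# Toy bin-level statistics and start positions (made up for illustration)
x <- data.frame(pvals = c(1e-8, 0.2, 1e-6, 1e-7),
                fdr   = c(1e-6, 0.5, 1e-4, 1e-5),
                lfc   = c(1.2, 0.1, 0.3, 0.9))
pos <- c(100, 150, 200, 5000)

sig <- select_bins(x, use = "fdr_lfc", fdr.cutoff = 0.01, lfc.cutoff = 0.5)
regions <- split_regions(pos, sep = 2000)
```

In this sketch the third bin passes the FDR cutoff but not the log fold change cutoff, so "fdr_lfc" rejects it, and the last bin falls into a separate region because it lies more than sep bases from its neighbour.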
This function returns a dataframe containing the chromosome, start position, end position, length, strand, summit, total read counts (both IP and input) and score of each bump.
### Use example dataset "Basal" in TRESS
### to illustrate usage of this function
library(TRESS)
data("Basal")
bins = Basal$Bins$Bins
Counts = Basal$Bins$Counts
sf = Basal$Bins$sf
colnames(Counts)
dat = Counts[, 1:2]
thissf = sf[1:2]

### pvals based on binomial test
idx = rowSums(dat) > 0
Pvals = rep(1, nrow(dat))
Pvals[idx] = 1 - pbinom(dat[idx, 2], rowSums(dat[idx, ]), prob = 0.5)

### lfc
c0 = mean(as.matrix(dat), na.rm = TRUE)  ### pseudocount
lfc = log((dat[, 2]/thissf[2] + c0)/(dat[, 1]/thissf[1] + c0))
x.vals = data.frame(pvals = Pvals,
                    fdr = p.adjust(Pvals, method = "fdr"),
                    lfc = lfc)

### find bumps based on pvals, fdr or lfc
Bumps = findBumps(chr = bins$chr, pos = bins$start, strand = bins$strand,
                  x = x.vals, use = "fdr_lfc",
                  fdr.cutoff = 0.01, lfc.cutoff = 0.5,
                  count = dat)
head(Bumps, 3)
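To see what the bin-level statistics in the example look like without loading TRESS, the same p-value and log fold change steps can be run on a small made-up count table. The counts and the size factors of 1 below are assumptions chosen for illustration only, and findBumps itself is not called.

```r
# Toy counts (made up): column 1 = input, column 2 = IP
dat <- data.frame(input = c(10, 0, 5), ip = c(30, 0, 5))
thissf <- c(1, 1)  # size factors assumed to be 1 for this sketch

# One-sided binomial p-value per bin; bins with no reads keep a p-value of 1
idx <- rowSums(dat) > 0
Pvals <- rep(1, nrow(dat))
Pvals[idx] <- 1 - pbinom(dat[idx, 2], rowSums(dat[idx, ]), prob = 0.5)

# Natural-log fold change of normalized IP over normalized input,
# stabilized with the mean count as a pseudocount
c0 <- mean(as.matrix(dat), na.rm = TRUE)
lfc <- log((dat[, 2] / thissf[2] + c0) / (dat[, 1] / thissf[1] + c0))

x.vals <- data.frame(pvals = Pvals,
                     fdr = p.adjust(Pvals, method = "fdr"),
                     lfc = lfc)
```

The first bin, where IP reads clearly outnumber input reads, gets the smallest p-value and a positive log fold change, while the empty bin and the balanced bin both get a log fold change of zero.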