GC effects are estimated from the effective GC content and read counts of genome-wide windows, using generalized linear mixture models. Genome-wide windows are sampled in given proportions, either randomly or under supervision. GC effects are estimated separately for background and foreground.
coverage |
A list object returned by function |
bdwidth |
A non-negative integer vector with two elements
specifying ChIP-seq binding width and peak detection half window size.
Usually generated by function |
flank |
A non-negative integer specifying the flanking width of
ChIP-seq binding. This parameter allows for reads that fall in the
flanking regions, with a probability that decreases as the distance
from the binding region increases. It helps define the effective GC
content calculation. Default is NULL, which means this parameter will
be calculated from |
plot |
A logical vector which, when TRUE (default), returns plots of intermediate results. |
sampling |
A numeric vector of length 2. The first number specifies the proportion of regions to be sampled for GC effects estimation. The second number specifies how many times the sampling is repeated. The default c(0.05,1) gives robust estimates for the human genome. Smaller genomes may need both a higher proportion and more repeats for robust estimation. |
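As a rough illustration of what the two numbers in sampling control, the base R sketch below draws window indices at random. It is not the gcapc implementation: the window count `n_windows` and the helper names are assumptions made up for the example.

```r
# Hypothetical illustration only; gcapc's internal sampling may differ.
set.seed(1)
n_windows <- 10000        # assumed number of genome-wide windows
sampling  <- c(0.05, 1)   # proportion to sample, number of repeats

# Number of windows drawn in each repeat
n_pick <- round(sampling[1] * n_windows)

# One sorted vector of sampled window indices per repeat
picked <- lapply(seq_len(sampling[2]), function(i) {
  sort(sample.int(n_windows, n_pick))
})

length(picked)        # number of repeats
length(picked[[1]])   # windows drawn per repeat
```

Raising the first number draws more windows per repeat; raising the second repeats the draw, which is what the note on smaller genomes suggests.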
supervise |
A GRanges object specifying peak regions in the studied data, such as peaks called by peak callers, e.g. MACS and SPP. These peak regions provide supervised window sampling for both mixtures in the generalized linear model. By default, no supervision is used. If the provided peak regions cover too few windows, supervised sampling is automatically replaced by random sampling. |
gcrange |
A non-negative numeric vector of length 2. This vector sets the range of GC content used to filter regions for GC effect estimation. For the human genome, most regions have GC content between 0.3 and 0.8, which is the default. Regions with GC content outside this range are ignored. This range is critical when very few foreground regions are selected for mixture model fitting, since outliers could drive the regression lines. If possible, first make a scatter plot of counts against GC content to decide this parameter. Alternatively, select a narrower range, e.g. c(0.35,0.7), to avoid outlier effects from both high and low GC-content regions. |
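The scatter-plot check suggested for gcrange can be sketched in base R as follows. The GC values and counts here are simulated, and treating the range endpoints as inclusive is an assumption; with real data, the per-window GC content and counts would come from your own analysis.

```r
# Hypothetical illustration with simulated data; not gcapc internals.
set.seed(2)
gc     <- runif(2000, 0.2, 0.9)                   # simulated GC content per window
counts <- rpois(2000, lambda = exp(1 + 2 * gc))   # simulated read counts
gcrange <- c(0.3, 0.8)                            # documented default range

# Windows inside the range (endpoints assumed inclusive)
keep <- gc >= gcrange[1] & gc <= gcrange[2]

# Visual check: windows outside the range are drawn in grey
if (interactive()) {
  plot(gc, counts, pch = 20, col = ifelse(keep, "black", "grey70"),
       xlab = "GC content", ylab = "Read count")
  abline(v = gcrange, lty = 2)
}

gc_kept     <- gc[keep]
counts_kept <- counts[keep]
```

If the plot shows sparse, extreme points near either boundary, a narrower range such as c(0.35,0.7) excludes them before model fitting.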
emtrace |
A logical vector which, when TRUE (default), prints the trace of log-likelihood changes across EM iterations. |
model |
A character specifying the distribution model to be used in
generalized linear model fitting. The default is negative
binomial( |
mu0 |
A non-negative numeric giving the initial read count signal for background regions. This is treated as the starting value of the background mean for Poisson/negative binomial fitting. Default is 1. |
mu1 |
A non-negative numeric giving the initial read count signal for foreground regions. This is treated as the starting value of the foreground mean for Poisson/negative binomial fitting. Default is 50. |
theta0 |
A non-negative numeric giving the initial shape parameter of the
negative binomial model for background regions. For more detail, see
theta in |
theta1 |
A non-negative numeric giving the initial shape parameter of the
negative binomial model for foreground regions. For more detail, see
theta in |
p |
A non-negative numeric specifying the proportion of foreground regions among all estimated regions. This is treated as a starting value for the EM algorithm. Default is 0.02. |
converge |
A non-negative numeric specifying the termination condition of the EM
algorithm. The EM algorithm stops when the ratio of the log-likelihood
increment to the whole log likelihood is less than or equal to
|
genome |
A BSgenome object containing the sequences
of the reference genome that was used to align the reads, or the name of
this reference genome specified in a way that is accepted by the
|
gctype |
A character vector specifying choice of method to calculate
effective GC content. Default |
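To make the roles of mu0, mu1, p, converge and emtrace more concrete, here is a minimal base R sketch of an EM loop for a two-component Poisson mixture. It is a simplification under stated assumptions, not the gcapc implementation: it has no GC covariate, uses simulated counts, and the `converge` value is a placeholder because the documented default is not shown above.

```r
# Hypothetical illustration; gcapc fits GLMs on GC content instead of plain means.
set.seed(3)
# Simulated counts: 950 background windows (mean 2), 50 foreground (mean 40)
y <- c(rpois(950, 2), rpois(50, 40))

mu0 <- 1; mu1 <- 50; p <- 0.02   # starting values, as in the documented defaults
converge <- 1e-3                 # assumed threshold for this sketch
emtrace  <- TRUE
loglik_old <- -Inf

for (iter in 1:200) {
  # E-step: posterior probability that each window is foreground
  d0 <- (1 - p) * dpois(y, mu0)
  d1 <- p * dpois(y, mu1)
  z  <- d1 / (d0 + d1)

  # M-step: update mixing proportion and component means
  p   <- mean(z)
  mu0 <- sum((1 - z) * y) / sum(1 - z)
  mu1 <- sum(z * y) / sum(z)

  # Log likelihood at the updated parameters
  loglik <- sum(log((1 - p) * dpois(y, mu0) + p * dpois(y, mu1)))
  if (emtrace) cat("iteration", iter, "log likelihood", loglik, "\n")

  # Stop when the relative log-likelihood increment is small enough
  if (is.finite(loglik_old) &&
      abs(loglik - loglik_old) / abs(loglik) <= converge) break
  loglik_old <- loglik
}
```

The stopping rule mirrors the description of converge: the increment in log likelihood is compared with the size of the log likelihood itself, so the threshold is scale-free.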
A list of objects
gc |
The GC contents at which GC effects are estimated. |
mu0 |
Predicted background signals at GC content |
mu1 |
Predicted foreground signals at GC content |
mu0med0 |
Median of predicted background signals. |
mu1med1 |
Median of predicted foreground signals. |
mu0med1 |
Median of predicted background signals at GC content of foreground windows. |
mu1med0 |
Median of predicted foreground signals at GC content of background windows. |
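One reading of the four median values is sketched below in base R. The prediction functions `pred_mu0` and `pred_mu1` and the simulated GC values are hypothetical stand-ins for the fitted background and foreground models; the sketch only shows how each median pairs a model with a set of windows.

```r
# Hypothetical illustration; names and curves are assumptions, not gcapc output.
set.seed(4)
gc_bg <- rnorm(900, mean = 0.45, sd = 0.05)   # GC content of background windows
gc_fg <- rnorm(100, mean = 0.55, sd = 0.05)   # GC content of foreground windows

# Stand-ins for the fitted background and foreground prediction curves
pred_mu0 <- function(gc) exp(0.5 + 2 * gc)
pred_mu1 <- function(gc) exp(2.5 + 3 * gc)

mu0med0 <- median(pred_mu0(gc_bg))   # background model, background windows
mu1med1 <- median(pred_mu1(gc_fg))   # foreground model, foreground windows
mu0med1 <- median(pred_mu0(gc_fg))   # background model, foreground windows
mu1med0 <- median(pred_mu1(gc_bg))   # foreground model, background windows
```

In this simulated case the foreground windows have higher GC content and both curves increase with GC, so the cross-evaluated medians differ from the matched ones.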
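library(gcapc)
# Example BAM file shipped with the gcapc package
bam <- system.file("extdata", "chipseq.bam", package="gcapc")
cov <- read5endCoverage(bam)
bdw <- bindWidth(cov)
# Estimate GC effects, sampling 15% of windows once
gcb <- gcEffects(cov, bdw, sampling = c(0.15,1))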