For a given set of sites of the same or comparable width, the
read count table from multiple samples is adjusted for
potential GC effects. For each sample separately, GC effects are
estimated from the sites' effective GC content and
read counts using generalized linear mixture models. The count
table is then adjusted based on the estimated GC effects.
It is important that the given sites include both foreground and
background regions; see the sites argument
below.
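As a rough illustration of the mixture idea described above, the following Python sketch fits a two-component Poisson mixture (background and foreground) to one sample's counts by EM. It is a simplified stand-in and not this function's implementation: it leaves out the GC covariate and the negative binomial option. The names `poisson_logpmf`, `em_poisson_mixture` and `max_iter` are hypothetical. The parameters `mu0`, `mu1`, `p` and `converge` only mirror the roles of the arguments of the same names on this page, and the numeric defaults shown are assumptions.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability mass function at count k with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def em_poisson_mixture(counts, mu0=1.0, mu1=50.0, p=0.2,
                       converge=1e-4, max_iter=500):
    """Fit a background/foreground Poisson mixture by EM (illustrative only).

    mu0, mu1 : starting means for background and foreground (assumed defaults)
    p        : starting proportion of foreground sites (assumed default)
    converge : stop when |change in log likelihood| / |log likelihood|
               is at most this value
    Returns (mu0, mu1, p, z, loglik), where z[i] is the posterior
    probability that site i is foreground.
    """
    n = len(counts)
    eps = 1e-8  # guards against log(0) and division by zero
    prev_ll = None
    z, ll = [], 0.0
    for _ in range(max_iter):
        # E-step: posterior foreground probability for every site
        z, ll = [], 0.0
        for k in counts:
            l0 = math.log(1.0 - p) + poisson_logpmf(k, mu0)
            l1 = math.log(p) + poisson_logpmf(k, mu1)
            m = max(l0, l1)
            log_mix = m + math.log(math.exp(l0 - m) + math.exp(l1 - m))
            z.append(math.exp(l1 - log_mix))
            ll += log_mix
        # Termination: relative log-likelihood increment small enough
        if prev_ll is not None and abs(ll - prev_ll) <= converge * abs(prev_ll):
            break
        prev_ll = ll
        # M-step: weighted means and updated mixing proportion
        w1 = min(max(sum(z), eps), n - eps)
        w0 = n - w1
        mu1 = max(sum(zi * k for zi, k in zip(z, counts)) / w1, eps)
        mu0 = max(sum((1.0 - zi) * k for zi, k in zip(z, counts)) / w0, eps)
        p = w1 / n
    return mu0, mu1, p, z, ll


if __name__ == "__main__":
    # Toy counts: ten background-like sites and five foreground-like sites
    toy = [0, 1, 2, 1, 0, 3, 2, 1, 1, 0, 40, 55, 48, 60, 52]
    mu0, mu1, p, z, ll = em_poisson_mixture(toy)
    print(mu0, mu1, p)
```

The sketch also shows why both kinds of region are needed: if the input held only foreground sites, the background component would have no data to anchor it and the two means could not be separated reliably.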
counts |
A count matrix with each row corresponding to each element
in |
sites |
A GRanges object with length equivalent to number of rows
in |
flank |
A non-negative integer specifying the flanking width of ChIP-seq binding. This parameter allows for reads that fall in the flanking regions, with probability decreasing as the distance from the binding region increases. It is used in calculating effective GC content. |
outputidx |
A logical vector with the length equivalent to number
of rows in |
gcrange |
A non-negative numeric vector of length 2. This vector sets the range of GC content used to filter regions for GC effect estimation. For human, most regions have GC content between 0.3 and 0.8, which is set as the default. Regions with GC content outside this range are ignored. This range is critical when very few foreground regions are selected for mixture model fitting, since outliers could drive the regression lines. Thus, if possible, first make a scatter plot of counts against GC content to decide this parameter. Alternatively, select a narrower range, e.g. c(0.35,0.7), to avoid outlier effects from both high and low GC-content regions. |
emtrace |
A logical vector which, when TRUE (default), prints the trace of log-likelihood changes across EM iterations. |
plot |
A logical vector which, when TRUE (default), returns the mixture fitting plot. |
model |
A character specifying the distribution model to be used in
generalized linear model fitting. The default is negative
binomial( |
mu0 |
A non-negative numeric giving the initial read count signal for background sites. This is treated as the starting value of the background mean for Poisson/nbinom fitting. |
mu1 |
A non-negative numeric giving the initial read count signal for foreground sites. This is treated as the starting value of the foreground mean for Poisson/nbinom fitting. |
theta0 |
A non-negative numeric giving the starting value of the shape parameter of
the negative binomial model for background sites. For more detail, see
theta in |
theta1 |
A non-negative numeric giving the starting value of the shape parameter of
the negative binomial model for foreground sites. For more detail, see
theta in |
p |
A non-negative numeric specifying the proportion of foreground sites among all sites used for estimation. This is treated as a starting value for the EM algorithm. |
converge |
A non-negative numeric specifying the termination condition of the EM
algorithm. The EM algorithm stops when the ratio of the log-likelihood
increment to the total log likelihood is less than or equal to
|
genome |
A BSgenome object containing the sequences
of the reference genome that was used to align the reads, or the name of
this reference genome specified in a way that is accepted by the
|
gctype |
A character vector specifying choice of method to calculate
effective GC content. Default |
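To make the role of the GC range filter in the argument list above concrete, here is a minimal Python sketch that computes a plain GC fraction for each region and flags the regions inside a given range. It is an assumption-laden illustration only: `gc_fraction` and `within_gcrange` are hypothetical helpers, the fraction is unweighted (it is not the effective GC content, which also depends on the flanking width and the chosen weighting method), and treating both range boundaries as inclusive is an assumption.

```python
def gc_fraction(seq):
    """Plain (unweighted) fraction of G/C among the A, C, G, T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    # Regions with no unambiguous bases get NaN, which never passes the filter
    return gc / acgt if acgt else float("nan")


def within_gcrange(gc_values, gcrange=(0.3, 0.8)):
    """Flag regions whose GC content lies inside gcrange (bounds assumed inclusive)."""
    lo, hi = gcrange
    return [lo <= g <= hi for g in gc_values]


if __name__ == "__main__":
    regions = ["ATATATATAT", "ATGCATGCAT", "GCGCGCGCGC"]
    gc = [gc_fraction(s) for s in regions]
    keep = within_gcrange(gc)
    print(list(zip(gc, keep)))
```

Only the flagged regions would feed the GC effect estimation; tightening the range trades away some data for protection against extreme-GC outliers.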
The count matrix after GC adjustment. The adjusted values are no longer integers.