calcDenovo estimates expression of gene splicing variants, considering both known variants and variants that have not been previously described.
calcDenovo(distrs, targetGenomeDB, knownGenomeDB=targetGenomeDB, pc,
  readLength, islandid, priorq=3, mprior, minpp=0.001, selectBest=FALSE,
  searchMethod="submodels", niter, exactMarginal=TRUE,
  integrateMethod="plugin", verbose=TRUE, mc.cores=1)
distrs | List of fragment distributions as generated by the |
targetGenomeDB | |
knownGenomeDB | |
pc | Named vector of exon path counts as returned by |
readLength | Read length in bp, e.g. in a paired-end experiment where 75bp are sequenced on each end one would set |
islandid | Name of the gene island to be analyzed. If not specified, all gene islands are analyzed. |
priorq | Parameter of the prior distribution on the proportion of reads coming from each variant. The prior is Dirichlet with prior sample size for each variant equal to priorq. We recommend |
mprior | Prior on the model space returned by |
minpp | Models (i.e. splicing configurations) with posterior probability less than |
selectBest | If set to |
searchMethod | Method used to perform the model search. |
niter | Number of MCMC iterations. |
exactMarginal | Set to |
integrateMethod | Method to compute integrated likelihoods. The default ( |
verbose | Set to |
mc.cores | Number of processors to be used for parallel computation. Can only be used if the package |
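To give a feel for what the priorq argument controls, the following base-R sketch draws variant proportions from a symmetric Dirichlet prior in which every variant has parameter priorq. It is an illustration under assumptions, not code from the package: the helper rdirichlet_sym and the choice of 4 variants are hypothetical.

```r
# Hypothetical helper, not part of the package API.
# Draws from a symmetric Dirichlet(priorq, ..., priorq) over K variants
# by normalizing independent Gamma(shape = priorq) draws.
rdirichlet_sym <- function(n, K, priorq) {
  g <- matrix(rgamma(n * K, shape = priorq), nrow = n, ncol = K)
  g / rowSums(g)   # each row now sums to 1
}

set.seed(1)
K <- 4   # assumed number of variants, for illustration only
draws_q3 <- rdirichlet_sym(5000, K, priorq = 3)
draws_q1 <- rdirichlet_sym(5000, K, priorq = 1)

# Prior mean of each proportion is 1 / K regardless of priorq
colMeans(draws_q3)

# A larger priorq concentrates the proportions more tightly around 1 / K
c(sd_q3 = sd(draws_q3[, 1]), sd_q1 = sd(draws_q1[, 1]))
```

In this sketch priorq acts like a number of pseudo-reads assigned to each variant before any data are seen, which is one way to read "prior sample size".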
calcDenovo explores which subset of the isoforms indicated in targetGenomeDB are truly expressed.
It also adds new isoforms when some reads follow an exon path that is not possible under any of the isoforms in targetGenomeDB.
calcDenovo computes the posterior probability of each model
(i.e. configuration of expressed variants) via Bayes theorem.
P(model|y) "proportional to" m(y|model) P(model)
where m(y|model) is the integrated likelihood and P(model) is the
prior probability of the model.
For example, a gene with 20 predicted isoforms in targetGenomeDB
gives rise to 2^20 - 1 possible models (configurations of expressed isoforms).
Importantly, P(model) can be set by analyzing available genome annotations in knownGenomeDB.
For instance, genes with 20 exons have isoforms that tend
to use most of the 20 exons. They also tend to express more
isoforms than genes with 5 exons. The function modelPrior analyzes knownGenomeDB
to set reasonable values for P(model).
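The relation P(model|y) "proportional to" m(y|model) P(model) can be illustrated with a small base-R sketch. The three models, their log integrated likelihoods and their prior probabilities are made-up toy numbers, and working on the log scale is an assumption made here for numerical stability rather than a description of the package internals.

```r
# Toy numbers, purely illustrative: three candidate models with assumed
# log integrated likelihoods log m(y|model) and prior probabilities P(model).
log_marginal <- c(M1 = -1050.2, M2 = -1048.7, M3 = -1055.9)
prior        <- c(M1 = 0.5,     M2 = 0.3,     M3 = 0.2)

# Unnormalized log posterior: log m(y|model) + log P(model)
log_post <- log_marginal + log(prior)

# Subtract the maximum before exponentiating to avoid numerical underflow,
# then renormalize so that the posterior probabilities sum to 1
post <- exp(log_post - max(log_post))
post <- post / sum(post)

round(post, 4)
```

Note that the model with the highest prior probability (M1) does not end up with the highest posterior probability here, because the integrated likelihood favours M2.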
An exhaustive enumeration of all possible models is
not feasible unless the gene is very short (e.g. around 5 exons).
For longer genes we use computational strategies to search a subset of
"interesting" models. This is controlled by the argument searchMethod
(see above).
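The growth of the model space mentioned above can be checked with a few lines of base R. The helper n_models and the three isoform names are hypothetical and serve only to illustrate why exhaustive enumeration quickly becomes impractical.

```r
# Number of non-empty subsets of k candidate isoforms
n_models <- function(k) 2^k - 1

n_models(5)    # 31
n_models(20)   # 1048575

# Exhaustive enumeration for a toy gene with 3 hypothetical isoforms:
# every row is one configuration of expressed (TRUE) / not expressed (FALSE)
isoforms <- c("iso1", "iso2", "iso3")
grid <- expand.grid(rep(list(c(FALSE, TRUE)), length(isoforms)))
names(grid) <- isoforms
models <- grid[rowSums(grid) > 0, ]   # drop the empty configuration
nrow(models)   # 7
```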
In order to compute P(model|y) one can either use the computed
m(y|model) P(model) (option exactMarginal==TRUE) or the
proportion of MCMC visits (option exactMarginal==FALSE).
Unless niter is large, the former option typically provides more
precise estimates.
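The difference between the two options can be sketched on a toy model space in base R. The weights below are assumed values of m(y|model) P(model), and the Metropolis sampler with uniform proposals is a simple stand-in chosen for illustration; it is not claimed to be the sampler used by the package.

```r
# Toy model space with assumed unnormalized weights m(y|model) * P(model)
w <- c(M1 = 5, M2 = 3, M3 = 1.5, M4 = 0.5)

# Option 1: renormalize the computed weights
exact <- w / sum(w)

# Option 2: proportion of visits of a simple Metropolis sampler
# (uniform proposals over models; illustrative stand-in only)
visit_props <- function(w, niter) {
  cur <- 1L
  visits <- integer(length(w))
  for (i in seq_len(niter)) {
    prop <- sample.int(length(w), 1L)
    # Symmetric proposal, so accept with probability min(1, w[prop] / w[cur])
    if (runif(1) < w[prop] / w[cur]) cur <- prop
    visits[cur] <- visits[cur] + 1L
  }
  setNames(visits / niter, names(w))
}

set.seed(42)
short_run <- visit_props(w, niter = 100)
long_run  <- visit_props(w, niter = 20000)

rbind(exact, short_run, long_run)
```

With few iterations the visit proportions carry Monte Carlo error, while the renormalized weights do not, which matches the remark that the former option is typically more precise unless niter is large.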
A denovoGenomeExpr object.
Camille Stephan-Otto Attolini, Manuel Kroiss, David Rossell
Rossell D, Stephan-Otto Attolini C, Kroiss M, Stocker A. Quantifying Alternative Splicing from Paired-End RNA-sequencing data. Annals of Applied Statistics, 8(1):309-330.
denovoExpr to obtain expression estimates from the calcDenovo output.
plotExpr to produce a plot with splicing variants and estimated expression.
## See help(denovoExpr)