Function to analyze bam files to generate an ExpressionSet with expression estimates for all samples, read start and fragment length distributions, path counts and, optionally, processed reads.
bamFile |
Names of bam files with the samples to analyze. These must be sorted and indexed, and the index must be in the same directory. |
verbose |
Set to |
seed |
Set seed of random number generator. |
mc.cores.int |
Number of cores to use when loading bam files. This is a memory-intensive step, so the number of cores must be chosen according to the available RAM. |
mc.cores |
Number of cores to use in expression estimation. |
genomeDB |
|
readLength |
Read length in bp, e.g. in a paired-end experiment where
75bp are sequenced on each end one would set |
rpkm |
Set to |
priorq |
Parameter of the prior distribution on the proportion of reads
coming from each variant. The prior is Dirichlet with prior sample
size for each variant equal to priorq.
We recommend |
priorqGeneExpr |
Parameter for the prior distribution on overall gene expression. Defaults to 2, which ensures non-zero estimates for all genes. |
citype |
Set to |
niter |
Number of Monte Carlo iterations. Only used when |
burnin |
Number of burnin Monte Carlo iterations. Only used when |
keep.pbam |
Set to |
keep.multihits |
Set to |
chroms |
Manually set chromosomes to be processed. By default only main chromosomes are considered (except 'chrM'). |
The function executes the functions procBam, getDistrs and pathCounts in parallel for each chromosome, but is much more efficient in CPU time and memory usage than running these functions separately. Data from multiple samples are then combined using mergeExp. Note that further normalization (e.g. quantileNorm) may be needed before the actual data analysis.
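As a rough illustration of what a quantile normalization step does, the base R sketch below forces every sample (column) of an expression matrix to share the same distribution. It is a generic textbook version and is not claimed to match the internals of casper's quantileNorm; the function name quantNormSketch and the toy matrix are hypothetical.

```r
# Generic quantile normalization sketch (NOT casper's quantileNorm implementation).
# Rows are features (e.g. isoforms), columns are samples.
quantNormSketch <- function(x) {
  # Rank of each value within its own column; ties broken by order of appearance.
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest value across columns.
  sortedMeans <- rowMeans(apply(x, 2, sort))
  # Replace each value by the reference value at its rank.
  out <- apply(ranks, 2, function(r) sortedMeans[r])
  dimnames(out) <- dimnames(x)
  out
}

# Hypothetical expression values for 4 isoforms in 3 samples.
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4,
            dimnames = list(paste0("tx", 1:4), paste0("sample", 1:3)))
xnorm <- quantNormSketch(x)
```

After this step all columns of xnorm contain the same set of values, only in a different order, so between-sample differences in overall distribution are removed.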
When rpkm is FALSE the function returns the estimated proportion of reads arising from each isoform within a gene island. casper groups two or more genes into a gene island whenever these genes share an exon (or part of an exon). Because exons are shared, isoform quantification must be done simultaneously for all those genes.
That is, when rpkm is FALSE the output from wrapKnown consists of proportions that add up to 1 within each island. If you would like to re-normalize these expressions so that they add up to 1 within each gene, see the help for function relexprByGene.
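The difference between island-level and gene-level proportions can be sketched with plain arithmetic. The data frame below is a made-up island with two genes, and the code is only an illustration of the re-normalization idea in base R; it does not call relexprByGene and makes no claim about how that function is implemented.

```r
# Hypothetical island containing two genes that share an exon.
# All names and values are illustrative only.
island <- data.frame(
  isoform = c("tx1", "tx2", "tx3", "tx4"),
  gene    = c("geneA", "geneA", "geneB", "geneB"),
  prop    = c(0.10, 0.30, 0.45, 0.15)   # adds up to 1 within the island
)

# Divide each isoform's proportion by the total proportion of its gene.
island$propByGene <- island$prop / ave(island$prop, island$gene, FUN = sum)

# Check: proportions now add up to 1 within each gene.
geneTotals <- tapply(island$propByGene, island$gene, sum)
```

For real casper output, relexprByGene is the documented way to obtain gene-level proportions.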
One last remark: casper returns the estimated proportion of reads generated by each isoform, which is not the same as relative isoform expressions. Longer isoforms tend to produce more reads than shorter isoforms. This is easily accounted for by dividing relative expressions by isoform length, see relexprByGene.
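A small worked example may help show why read proportions and relative expressions differ. The isoform names, proportions and lengths below are hypothetical, and the code is just the length-correction arithmetic described above, not the implementation of relexprByGene.

```r
# Hypothetical gene with two isoforms: estimated read proportions and lengths in bp.
readProp <- c(tx1 = 0.6, tx2 = 0.4)
txLength <- c(tx1 = 3000, tx2 = 1000)

# Reads per base: the longer isoform produces more reads per transcript copy,
# so dividing by length puts both isoforms on a comparable scale.
perBase <- readProp / txLength

# Rescale so that the length-corrected values add up to 1 within the gene.
relExpr <- perBase / sum(perBase)
```

Here tx1 accounts for most of the reads, but after correcting for its threefold greater length tx2 turns out to be the more highly expressed isoform.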
distr |
Object of class |
pbam |
List of objects of class |
pc |
Object of class |
exp |
Object of class |
Camille Stephan-Otto Attolini, David Rossell
Rossell D, Stephan-Otto Attolini C, Kroiss M, Stocker A. Quantifying Alternative Splicing from Paired-End RNA-sequencing data. Annals of Applied Statistics, 8(1):309-330.
procGenome, relexprByGene, quantileNorm
## genDB <- makeTranscriptDbFromUCSC(genome="hg19", tablename="refGene")
## hg19DB <- procGenome(genDB, "hg19")
## bamFile <- "/path_to_bam/sorted.bam"
## ans <- wrapKnown(bamFile=bamFile, mc.cores.int=4, mc.cores=3, genomeDB=hg19DB, readLength=101)
## names(ans)
## head(exprs(ans$exp))