Description Usage Arguments Details Value Author(s) References See Also Examples
View source: R/mosaicsRunAll.R
Construct bin-level ChIP-sep data from aligned read files of ChIP and matched control samples, fit a MOSAiCS model, call peaks, export peak calling results, and generate reports for diagnostics.
1 2 3 4 5 6 7 8 9 10 11 12 | mosaicsRunAll(
chipFile=NULL, chipFileFormat=NULL,
controlFile=NULL, controlFileFormat=NULL,
binfileDir=NULL,
peakFile=NULL, peakFileFormat=NULL,
reportSummary=FALSE, summaryFile=NULL,
reportExploratory=FALSE, exploratoryFile=NULL,
reportGOF=FALSE, gofFile=NULL,
PET=FALSE, byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL,
FDR=0.05, fragLen=200, binSize=200, capping=0, bgEst="rMOM", d=0.25,
signalModel="BIC", maxgap=200, minsize=50,
thres=10, parallel=FALSE, nCore=8 )
|
chipFile |
Name of the aligned read file of ChIP sample to be processed. |
chipFileFormat |
Format of the aligned read file of ChIP sample to be processed.
Currently, |
controlFile |
Name of the aligned read file of matched control sample to be processed. |
controlFileFormat |
Format of the aligned read file of matched control sample to be processed.
Currently, |
binfileDir |
Directory to store processed bin-level files. |
peakFile |
Name of the peak list generated from the analysis. |
peakFileFormat |
Format of the peak list generated from the analysis.
Possible values are |
reportSummary |
Report the summary of model fitting and peak calling?
Possible values are |
summaryFile |
File name of the summary report of model fitting and peak calling. The summary report is a text file. |
reportExploratory |
Report the exploratory analysis plots?
Possible values are |
exploratoryFile |
Name of the file for exploratory analysis plots. The exploratory analysis results are exported as a PDF file. |
reportGOF |
Report the goodness of fit (GOF) plots?
Possible values are |
gofFile |
Name of the file for goodness of fit (GOF) plots. The GOF plots are exported as a PDF file. |
PET |
Is the file paired-end tag (PET) data?
If |
byChr |
Analyze ChIP-seq data for each chromosome separately or analyze it genome-wide?
Possible values are |
useChrfile |
Is the file for chromosome info provided?
Possible values are |
chrfile |
Name of the file for chromosome info. In this file, the first and second columns are ID and size of each chromosome, respectively. |
excludeChr |
Vector of chromosomes that will be excluded from the analysis. |
FDR |
False discovery rate. Default is 0.05. |
fragLen |
Average fragment length. Default is 200. |
binSize |
Size of bins. Default is 200. |
capping |
Maximum number of reads allowed to start at each nucleotide position.
To avoid potential PCR amplification artifacts, the maximum number of reads
that can start at a nucleotide position is capped at |
bgEst |
Parameter to determine background estimation approach.
Possible values are "matchLow" (estimation using bins with low tag counts) and
"rMOM" (estimation using robust method of moment (MOM)).
If |
d |
Parameter for estimating background distribution. Default is 0.25. |
signalModel |
Signal model. Possible values are "BIC" (automatic model selection using BIC), "1S" (one-signal-component model), and "2S" (two-signal-component model). Default is "BIC". |
maxgap |
Initial nearby peaks are merged if the distance (in bp)
between them is less than |
minsize |
An initial peak is removed if its width is narrower than |
thres |
A bin within initial peak is removed if its ChIP tag counts are less than |
parallel |
Utilize multiple CPUs for parallel computing
using |
nCore |
Number of maximum number of CPUs used for the analysis. Default is 8. |
This method implements the work flow for the two-sample analysis of ChIP-seq data
using the MOSAiCS framework (without using mappability and GC content scores).
It imports aligned read files of ChIP and matched control samples,
processes them into bin-level files, fits MOSAiCS model, calls peaks,
exports the peak lists to text files, and generates reports for diagnostics.
This method is a wrapper function of constructBins
, readBins
,
mosaicsFit
, mosaicsPeak
, export
functions, and methods of
BinData
, MosaicsFit
, and MosaicsPeak
classes.
See the vignette of the package for the illustration of the work flow
and the description of employed methods and their options.
Exploratory analysis plots and goodness of fit (GOF) plots are generated
using the methods plot
of the classes BinData
and MosaicsFit
, respectively.
See the help of constructBins
for details of the options PET
, chipFileFormat
,
controlFileFormat
, byChr
, useChrfile
, chrfile
, excludeChr
,
fragLen
, binSize
, and capping
.
See the help of mosaicsFit
for details of the options bgEst
and d
.
See the help of mosaicsPeak
for details of the options FDR
,
signalModel
, maxgap
, minsize
, and thres
.
See the help of export
for details of the option peakFileFormat
.
When the data contains multiple chromosomes,
parallel computing can be utilized for faster preprocessing and model fitting
if parallel=TRUE
and parallel
package is loaded.
nCore
determines number of CPUs used for parallel computing.
Processed bin-level files are exported to the directory specified in binfileDir
argument.
If byChr=FALSE
(genome-wide analysis),
one bin-level file is generated for each of ChIP and matched control samples,
where file names are [chipFile]_fragL[fragLen]_bin[binSize].txt
and [controlFile]_fragL[fragLen]_bin[binSize].txt
, respectively, for SET data (PET = FALSE
).
For PET data (PET = TRUE
), file names for each of ChIP and matched control samples
are [chipFile]_bin[binSize].txt
and [controlFile]_bin[binSize].txt
, respectively.
If byChr=TRUE
(chromosome-wise analysis),
bin-level files are generated for each chromosome of each of ChIP and matched control samples,
where file names are [chipFile]_fragL[fragLen]_bin[binSize]_[chrID].txt
and [controlFile]_fragL[fragLen]_bin[binSize]_[chrID].txt
, respectively, for SET data (PET = FALSE
)
([chrID]
is chromosome IDs that reads align to).
For PET data (PET = TRUE
), file names for each of ChIP and matched control samples
are [chipFile]_bin[binSize]_[chrID].txt
and [controlFile]_bin[binSize]_[chrID].txt
, respectively.
The peak list generated from the analysis are exported
to the file with the name specified in peakFile
.
If reportSummary=TRUE
, the summary of model fitting and peak calling is exported
to the file with the name specified in summaryFile
(text file).
If reportExploratory=TRUE
, the exploratory analysis plots are exported
to the file with the name specified in exploratoryFile
(PDF file).
If reportGOF=TRUE
, the goodness of fit (GOF) plots are exported
to the file with the name specified in gofFile
(PDF file).
Dongjun Chung, Pei Fen Kuan, Rene Welch, Sunduz Keles
Kuan, PF, D Chung, G Pan, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.
Chung, D, Zhang Q, and Keles S (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.
constructBins
, readBins
,
mosaicsFit
, mosaicsPeak
, export
,
BinData
, MosaicsFit
, MosaicsPeak
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | ## Not run:
# minimal input (without any reports for diagnostics)
mosaicsRunAll(
chipFile = "/scratch/eland/STAT1_eland_results.txt",
chipFileFormat = "eland_result",
controlFile = "/scratch/eland/input_eland_results.txt",
controlFileFormat = "eland_result",
binfileDir = "/scratch/bin/",
peakFile = "/scratch/peak/STAT1_peak_list.bed",
peakFileFormat = "bed" )
# generate all reports for diagnostics
library(parallel)
mosaicsRunAll(
chipFile = "/scratch/eland/STAT1_eland_results.txt",
chipFileFormat = "eland_result",
controlFile = "/scratch/eland/input_eland_results.txt",
controlFileFormat = "eland_result",
binfileDir = "/scratch/bin/",
peakFile = "/scratch/peak/STAT1_peak_list.bed",
peakFileFormat = "bed",
reportSummary = TRUE,
summaryFile = "/scratch/reports/mosaics_summary.txt",
reportExploratory = TRUE,
exploratoryFile = "/scratch/reports/mosaics_exploratory.pdf",
reportGOF = TRUE,
gofFile = "/scratch/reports/mosaics_GOF.pdf",
PET = FALSE, byChr = FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr = "chrM",
FDR = 0.05, fragLen = 200, capping = 0, bgEst="automatic", thres=10,
parallel = TRUE, nCore = 8 )
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.