Preprocess and construct bin-level ChIP-seq data from an aligned read file.
infile | Name of the aligned read file to be processed.
fileFormat | Format of the aligned read file to be processed. The supported formats are listed in the details below.
outfileLoc | Directory of processed bin-level files. By default, processed bin-level files are exported to the current directory.
byChr | Construct a separate bin-level file for each chromosome? Possible values are TRUE or FALSE.
useChrfile | Is the file for chromosome info provided? Possible values are TRUE or FALSE.
chrfile | Name of the file for chromosome info. In this file, the first and second columns are the ID and size of each chromosome, respectively.
excludeChr | Vector of chromosomes that will be excluded from the analysis. This argument is ignored if useChrfile=TRUE.
PET | Is the file paired-end tag (PET) data? Use PET=TRUE for PET data and PET=FALSE for SET data.
fragLen | Average fragment length. Default is 200. This argument is ignored if PET=TRUE.
binSize | Size of bins. Default is 200.
capping | Maximum number of reads allowed to start at each nucleotide position. To avoid potential PCR amplification artifacts, the number of reads that can start at a nucleotide position is capped at capping.
perl | Name of the perl executable to be called.
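The effect of the capping argument can be illustrated with a small base R sketch. This is not the package's implementation (constructBins delegates the processing to perl); cap_start_counts is a hypothetical helper, and treating a non-positive capping value as "no cap" is an assumption suggested by the capping=0 setting used in the examples.

```r
# Hypothetical helper (not part of mosaics): count reads per start
# position and cap each count at `capping`.
cap_start_counts <- function(starts, capping) {
  counts <- table(starts)                  # number of reads starting at each position
  if (capping > 0) {                       # assumption: non-positive value means no cap
    counts[counts > capping] <- capping    # truncate counts above the cap
  }
  counts
}

# Toy start positions: four reads at 100, two at 250, one at 400.
starts <- c(100, 100, 100, 100, 250, 250, 400)
capped <- cap_start_counts(starts, capping = 3)
uncapped <- cap_start_counts(starts, capping = 0)
```

With capping = 3, the four reads starting at position 100 contribute only three to the count, while the other positions are unchanged.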
Bin-level files are constructed from the aligned read file and exported to the directory specified in the outfileLoc argument.
If byChr=FALSE, bin-level files are named [infileName]_fragL[fragLen]_bin[binSize].txt for SET data (PET = FALSE) and [infileName]_bin[binSize].txt for PET data (PET = TRUE).
If byChr=TRUE, bin-level files are named [infileName]_fragL[fragLen]_bin[binSize]_[chrID].txt for SET data (PET = FALSE) and [infileName]_bin[binSize]_[chrID].txt for PET data (PET = TRUE), where chrID is the ID of a chromosome that reads align to.
These chromosome IDs are extracted from the aligned read file.
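The naming rules above can be written out as a short base R sketch, which may help when building the fileName argument for readBins. bin_file_name is a hypothetical helper, not a mosaics function; it only reproduces the patterns described above and assumes that [infileName] is the input file name including its extension, as in the examples on this page.

```r
# Hypothetical helper (not part of mosaics): build the expected name of a
# bin-level file from the documented naming patterns.
bin_file_name <- function(infileName, binSize = 200, fragLen = 200,
                          PET = FALSE, chrID = NULL) {
  stem <- if (PET) {
    # PET data: no fragment length in the name
    paste0(infileName, "_bin", binSize)
  } else {
    # SET data: fragment length precedes the bin size
    paste0(infileName, "_fragL", fragLen, "_bin", binSize)
  }
  if (!is.null(chrID)) {
    # byChr=TRUE: one file per chromosome ID
    stem <- paste0(stem, "_", chrID)
  }
  paste0(stem, ".txt")
}

set_all   <- bin_file_name("reads.bam")
pet_all   <- bin_file_name("reads.bam", PET = TRUE)
set_bychr <- bin_file_name("reads.bam", chrID = c("chr21", "chr22"))
```

Because paste0 is vectorized, passing several chromosome IDs returns one file name per chromosome.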
If the file for chromosome information is provided (useChrfile=TRUE and chrfile is not NULL), only the chromosomes specified in the file will be considered.
Chromosomes that are specified in excludeChr will not be included in the processed bin-level files.
The excludeChr argument is ignored if useChrfile=TRUE.
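The interaction between useChrfile, chrfile, and excludeChr can be sketched in base R as follows. select_chromosomes is a hypothetical helper, not the package's code, and two points are assumptions: the chromosome info file is read as whitespace-delimited text without a header, and chromosomes listed in the file but absent from the reads are dropped.

```r
# Hypothetical helper (not part of mosaics): decide which chromosomes
# end up in the processed bin-level files.
select_chromosomes <- function(chr_in_reads, useChrfile = FALSE,
                               chrfile_ids = NULL, excludeChr = NULL) {
  if (useChrfile && !is.null(chrfile_ids)) {
    # Chromosome file provided: keep only listed chromosomes;
    # excludeChr is ignored in this branch.
    return(intersect(chr_in_reads, chrfile_ids))
  }
  # No chromosome file: drop the chromosomes named in excludeChr.
  setdiff(chr_in_reads, excludeChr)
}

# Toy chromosome info: first column is the ID, second column is the size.
chr_info <- read.table(text = "chr1 1000\nchr2 2000",
                       stringsAsFactors = FALSE)
chr_in_reads <- c("chr1", "chr2", "chrM")

kept_by_file    <- select_chromosomes(chr_in_reads, useChrfile = TRUE,
                                      chrfile_ids = chr_info[[1]],
                                      excludeChr = "chr1")
kept_by_exclude <- select_chromosomes(chr_in_reads, excludeChr = "chrM")
```

In the first call, excludeChr = "chr1" has no effect because the chromosome file takes precedence.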
Constructed bin-level files can be loaded into the R environment using the method readBins.
constructBins currently supports the following aligned read file formats for SET data (PET = FALSE): Eland result ("eland_result"), Eland extended ("eland_extended"), Eland export ("eland_export"), default Bowtie ("bowtie"), SAM ("sam"), BAM ("bam"), BED ("bed"), and CSEM ("csem").
For PET data (PET = TRUE), the following aligned read file formats are allowed: Eland result ("eland_result"), SAM ("sam"), and BAM ("bam").
If the input file format is neither BED nor CSEM BED, this method retains only reads mapping uniquely to the reference genome.
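Since the allowed fileFormat values depend on PET, it can be convenient to check a combination before starting a long run. The sketch below is a base R illustration with hypothetical helper names (supported_formats, is_supported_format); the format strings are the ones listed above, and how constructBins itself reports an unsupported combination is not covered here.

```r
# Hypothetical helpers (not part of mosaics): list the documented
# fileFormat values for SET and PET data and check a given combination.
supported_formats <- function(PET = FALSE) {
  if (PET) {
    c("eland_result", "sam", "bam")
  } else {
    c("eland_result", "eland_extended", "eland_export",
      "bowtie", "sam", "bam", "bed", "csem")
  }
}

is_supported_format <- function(fileFormat, PET = FALSE) {
  fileFormat %in% supported_formats(PET)
}

bam_pet <- is_supported_format("bam", PET = TRUE)
bed_pet <- is_supported_format("bed", PET = TRUE)
bed_set <- is_supported_format("bed", PET = FALSE)
```

For example, "bed" is accepted for SET data but is not among the formats listed for PET data.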
Processed bin-level files are exported to the directory specified in outfileLoc.
Dongjun Chung, Pei Fen Kuan, Rene Welch, Sunduz Keles
Kuan, PF, D Chung, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.
Chung, D, Zhang Q, and Keles S (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.
## Not run:
library(mosaicsExample)

constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"),
    fileFormat="bam", outfileLoc="~/",
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL,
    PET=FALSE, fragLen=200, binSize=200, capping=0 )

constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"),
    fileFormat="bam", outfileLoc="~/",
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL,
    PET=FALSE, fragLen=200, binSize=200, capping=0 )

binHM <- readBins( type=c("chip","input"),
    fileName=c( "~/wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt",
    "~/wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt" ) )
binHM

## End(Not run)