mergeBamByFactor: Merge BAM files based on factor

View source: R/chipseq.R

mergeBamByFactor    R Documentation

Merge BAM files based on factor

Description

Merges BAM files based on sample groupings provided by a factor, internally using the mergeBam function from the Rsamtools package. The function also returns an updated SYSargs or SYSargs2 object containing the paths to the merged BAM files as well as to the unmerged BAM files, if there are any. All rows of merged parent samples are removed. When a named character vector is provided as input, a data.frame with targets containing the paths to the merged BAM files is returned instead.

The functionality provided by mergeBamByFactor is useful for experiments where pooling of replicates is advantageous to maximize the depth of read coverage, such as prior to peak calling in ChIP-Seq or miRNA gene prediction experiments.
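
The grouping step described above can be pictured with base R alone. The following is a minimal sketch, not the package's implementation: the targets table, its file names, and its factor labels are made up for illustration, and the assumption that single-file groups are left unmerged follows from the description rather than from the source code.

```r
# Hypothetical targets table; paths and labels are illustrative only
targets <- data.frame(
  FileName   = c("results/M1A.bam", "results/M1B.bam",
                 "results/A1A.bam", "results/V1A.bam"),
  SampleName = c("M1A", "M1B", "A1A", "V1A"),
  Factor     = c("M1", "M1", "A1", "V1"),
  stringsAsFactors = FALSE
)

# Group BAM paths by factor level, keeping the order of first appearance
grouping   <- factor(targets$Factor, levels = unique(targets$Factor))
bam_groups <- split(targets$FileName, grouping)

# Groups with more than one file are candidates for merging;
# groups with a single file are assumed to stay unmerged
to_merge <- bam_groups[lengths(bam_groups) > 1]
unmerged <- bam_groups[lengths(bam_groups) == 1]
```

Here only the M1 group holds two replicates, so it is the only one that would be pooled; A1 and V1 would pass through unchanged.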

Usage

mergeBamByFactor(args, targetsDF = NULL, mergefactor = "Factor",
                out_dir = file.path("results", "merge_bam"),
                overwrite = FALSE, silent = FALSE, ...)

Arguments

args

An instance of SYSargs or SYSargs2 constructed from a targets file where the first column (targetsin(args) or targets.as.df(targets(args))) contains the paths to the BAM files under the column title FileName. Alternatively, a named character vector containing the paths to the BAM files, where the element names are the sample IDs.

targetsDF

Object of class DFrame, which can be obtained with targetsWF(<SYSargsList>). This argument is required when a named character vector is provided as input. Default is NULL.

mergefactor

Factor containing the grouping information required for merging the BAM files referenced in the first column of targetsin(args) or targets.as.df(targets(args)). The default uses the Factor column of the targets file, so that BAM files of samples specified as replicates in that column are merged.

out_dir

The directory path where the merged BAM files are stored. The default is the "merge_bam" directory inside the results directory. The directory does not need to exist before the function is run; it will be created if necessary.

overwrite

If overwrite=FALSE, existing BAM files of the same name will not be overwritten.

silent

If silent=TRUE, print statements will be suppressed.

...

Additional arguments passed on to the internally used mergeBam function from Rsamtools.
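
As a rough illustration of the out_dir behaviour described above, the sketch below creates the default sub-path only when it is missing. It is an assumption about how such a step could be written in base R, not the function's actual code, and it uses a temporary directory so that no real project is touched.

```r
# Use a temporary location so the sketch does not touch a real project
out_dir <- file.path(tempdir(), "results", "merge_bam")

# Create the directory (including parent directories) only if it is missing
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

# Confirm that the directory is now available for output files
dir.exists(out_dir)
```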

Value

The merged BAM files will be written to output files with the following naming convention: <first_BAM_file_name>_<grouping_label_of_factor>.<bam>. In addition, the function returns an updated SYSargs or SYSargs2 object where all output file paths contain the paths to the merged BAM files. When a named character vector is provided as input, a data.frame with targets containing the paths to the merged BAM files is returned instead. The rows of the merged parent samples are removed and the rows of the unmerged samples remain unchanged.
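
The naming convention can be made concrete with a small base R sketch. The helper merged_bam_name below is hypothetical and not part of systemPipeR; in particular, stripping the .bam extension from the first file name before appending the grouping label is an assumption about how the convention is applied.

```r
# Hypothetical helper (not part of systemPipeR): builds an output path of the
# form <first_BAM_file_name>_<grouping_label>.bam inside out_dir
merged_bam_name <- function(bam_files, label,
                            out_dir = file.path("results", "merge_bam")) {
  # Assumption: the extension of the first file is dropped before the label
  first <- sub("\\.bam$", "", basename(bam_files[1]))
  file.path(out_dir, paste0(first, "_", label, ".bam"))
}

merged_bam_name(c("results/M1A.bam", "results/M1B.bam"), label = "M1")
# returns "results/merge_bam/M1A_M1.bam"
```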

Author(s)

Thomas Girke

See Also

writeTargetsout, writeTargetsRef

Examples

## Construct initial SYSargs object
targetspath <- system.file("extdata", "targets_chip.txt", package="systemPipeR")
parampath <- system.file("extdata", "bowtieSE.param", package="systemPipeR")
args <- systemArgs(sysma=parampath, mytargets=targetspath)

## Not run: 
## After running alignments (e.g. with Bowtie2) generate targets file
## for the corresponding BAM files. The alignment step is skipped here.
writeTargetsout(x=args, file="targets_bam.txt", overwrite=TRUE)
args <- systemArgs(sysma=NULL, mytargets="targets_bam.txt")

## Merge BAM files and return updated SYSargs object
args_merge <- mergeBamByFactor(args, overwrite=TRUE, silent=FALSE)

## Export modified targets file
writeTargetsout(x=args_merge, file="targets_mergeBamByFactor.txt", overwrite=TRUE)

## End(Not run)

tgirke/systemPipeR documentation built on Sept. 24, 2024, 9:48 a.m.