assignReads: Assign reads to the edges of a SplicingGraphs object

View source: R/assignReads.R

assignReadsR Documentation

Assign reads to the edges of a SplicingGraphs object

Description

assignReads assigns reads to the exonic and intronic edges of a SplicingGraphs object.

removeReads removes all the reads assigned to the exonic and intronic edges of a SplicingGraphs object.

Usage

assignReads(sg, reads, sample.name=NA)

removeReads(sg)

Arguments

sg

A SplicingGraphs object.

reads

A GAlignments, GAlignmentPairs, or GRangesList object, containing the reads to assign to the exons and introns in sg. It must have unique names on it, typically the QNAME ("query name") field coming from the BAM file. More on this in the 'About the read names' section below.

sample.name

A single string containing the name of the sample where the reads are coming from.

Details

TODO

Value

For assignReads: the supplied SplicingGraphs object with the reads assigned to it.

For removeReads: the supplied SplicingGraphs object with all reads previously assigned with assignReads removed from it.

About read names

The read names are typically imported from the BAM file by calling readGAlignments (or readGAlignmentPairs) with use.names=TRUE. This extracts the "query names" from the file (stored in the QNAME field), and makes them the names of the returned object.

The reads object must have unique names on it. The presence of duplicated names generally indicates one (or both) of the following situations:

  • (a) reads contains paired-end reads that have not been paired;

  • (b) some of the reads are secondary alignments.

If (a): you can find out whether reads in a BAM file are single- or paired-end with the quickBamFlagSummary utility from the Rsamtools package. If they're paired-end, load them with readGAlignmentPairs instead of readGAlignments, and that will pair them.

If (b): you can filter out secondary alignments by passing 'isSecondaryAlignment=FALSE' to scanBamFlag when preparing the ScanBamParam object used to load the reads. For example:

    flag0 <- scanBamFlag(isSecondaryAlignment=FALSE,
                         isNotPassingQualityControls=FALSE,
                         isDuplicate=FALSE)
    param0 <- ScanBamParam(flag=flag0)
    reads <- readGAlignments("path/to/BAM/file", use.names=TRUE,
                                  param=param0)
  

This will filter out records that have flag 0x100 (secondary alignment) set to 1. See ?scanBamFlag in the Rsamtools package for more information. See the SAM Specs on the SAMtools project page at http://samtools.sourceforge.net/ for a description of the SAM/BAM flags.

Author(s)

H. Pagès

See Also

This man page is part of the SplicingGraphs package. Please see ?`SplicingGraphs-package` for an overview of the package and for an index of its man pages.

Other topics related to this man page and documented in other packages:

  • The GRangesList class defined in the GenomicRanges package.

  • The GAlignments and GAlignmentPairs classes, as well as the readGAlignments and readGAlignmentPairs functions defined in the GenomicAlignments package.

  • The quickBamFlagSummary and ScanBamParam functions defined in the Rsamtools package.

Examples

## ---------------------------------------------------------------------
## 1. Make SplicingGraphs object 'sg' from toy gene model (see
##    '?SplicingGraphs')
## ---------------------------------------------------------------------
example(SplicingGraphs) 
sg

## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids.
names(sg)

## ---------------------------------------------------------------------
## 2. Load toy reads
## ---------------------------------------------------------------------
## Load toy reads (single-end) from a BAM file. We filter out secondary
## alignments, reads not passing quality controls, and PCR or optical
## duplicates (see ?scanBamFlag in the Rsamtools package for more
## information):
flag0 <- scanBamFlag(isSecondaryAlignment=FALSE,
                     isNotPassingQualityControls=FALSE,
                     isDuplicate=FALSE)
param0 <- ScanBamParam(flag=flag0)
gal <- readGAlignments(toy_reads_bam(), use.names=TRUE, param=param0)
gal

## ---------------------------------------------------------------------
## 3. Assign the reads to the exons and introns in 'sg'
## ---------------------------------------------------------------------
## The same read can be assigned to more than 1 exon or intron (e.g. a
## junction read with 1 gap can be assigned to 2 exons and 1 intron).
sg <- assignReads(sg, gal, sample.name="TOYREADS")

## See the assignments to the splicing graph edges.
edge_by_tx <- sgedgesByTranscript(sg, with.hits.mcols=TRUE)
edge_data <- mcols(unlist(edge_by_tx))
colnames(edge_data)
head(edge_data)
edge_data[ , c("sgedge_id", "TOYREADS.hits")]

edge_by_gene <- sgedgesByGene(sg, with.hits.mcols=TRUE)
mcols(unlist(edge_by_gene))

## See the assignments to the reduced splicing graph edges.
redge_by_gene <- rsgedgesByGene(sg, with.hits.mcols=TRUE)
mcols(unlist(redge_by_gene))

## ---------------------------------------------------------------------
## 4. Summarize the reads assigned to 'sg' and eventually remove them
## ---------------------------------------------------------------------
## See '?countReads'.

Bioconductor/SplicingGraphs documentation built on Nov. 3, 2024, 8:11 a.m.