toGRanges: Convert dataset to GRanges

toGRanges R Documentation

Convert dataset to GRanges

Description

Convert data in UCSC BED format and related formats such as GFF, or any user-defined dataset such as a MACS output file, to a GRanges object.

Usage

toGRanges(data, ...)

## S4 method for signature 'connection'
toGRanges(
  data,
  format = c("BED", "GFF", "GTF", "MACS", "MACS2", "MACS2.broad", "narrowPeak",
    "broadPeak", "CSV", "others"),
  header = FALSE,
  comment.char = "#",
  colNames = NULL,
  ...
)

## S4 method for signature 'TxDb'
toGRanges(
  data,
  feature = c("gene", "transcript", "exon", "CDS", "fiveUTR", "threeUTR", "microRNA",
    "tRNAs", "geneModel"),
  OrganismDb,
  ...
)

## S4 method for signature 'EnsDb'
toGRanges(
  data,
  feature = c("gene", "transcript", "exon", "disjointExons"),
  ...
)

## S4 method for signature 'character'
toGRanges(
  data,
  format = c("BED", "GFF", "GTF", "MACS", "MACS2", "MACS2.broad", "narrowPeak",
    "broadPeak", "CSV", "others"),
  header = FALSE,
  comment.char = "#",
  colNames = NULL,
  ...
)

Arguments

data

A data.frame, TxDb or EnsDb object, or the name of the file to be imported. Alternatively, data can be a readable text-mode connection (see ?read.table).

...

parameters passed to read.table

format

Data format. If the data format is set to "BED", "GFF", "narrowPeak" or "broadPeak", please refer to http://genome.ucsc.edu/FAQ/FAQformat#format1 for the column order. "MACS" converts the Excel output file from MACS1. "MACS2" converts the output file from MACS2. If set to "CSV", the file must contain the columns seqnames, start, end and strand.

header

A logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: header is set to TRUE if the first row contains one fewer field than the number of columns or the format is set to 'CSV'.

comment.char

character: a character vector of length one containing a single character or an empty string. Use "" to turn off the interpretation of comments altogether.

colNames

If the data format is set to "others", colNames must be defined and must contain space, start and end. The column holding the chromosome name should be named space.

feature

annotation type

OrganismDb

An OrganismDb object. It is used to extract gene symbols when feature is set to "geneModel" for a TxDb.
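
The format and colNames arguments described above can be illustrated with a small sketch. The peak table, its values and the temporary file names below are hypothetical and exist only for illustration; the calls follow the usage shown on this page, and the exact handling of each format is left to the package. The "CSV" file carries a header with the required column names, while the "others" file has no header, so colNames names every column and uses space for the chromosome column.

```r
library(ChIPpeakAnno)

## Hypothetical peak table, used only for illustration.
peaks <- data.frame(
  seqnames = c("chr1", "chr2"),
  start    = c(1000L, 5000L),
  end      = c(1500L, 5600L),
  strand   = c("+", "-"),
  score    = c(10, 20)
)

## format = "CSV": the file has a header row containing the required
## column names seqnames, start, end and strand.
csvFile <- tempfile(fileext = ".csv")
write.csv(peaks, csvFile, row.names = FALSE, quote = FALSE)
grCSV <- toGRanges(csvFile, format = "CSV")

## format = "others": a headerless tab-delimited file, so colNames must
## be supplied; the chromosome column is named "space".
tabFile <- tempfile(fileext = ".txt")
write.table(peaks[, c("seqnames", "start", "end", "score")], tabFile,
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
grOthers <- toGRanges(tabFile, format = "others",
                      colNames = c("space", "start", "end", "score"))
```

Writing the files to tempfile() keeps the sketch self-contained; in practice the path would point to an existing peak file.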

Value

A GRanges object
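
As a minimal sketch of working with the returned value, the MACS example file shipped with the package can be converted and then inspected with the standard GenomicRanges accessors. The variable name gr is arbitrary, and which metadata columns appear depends on the input format.

```r
library(ChIPpeakAnno)
library(GenomicRanges)

## Example MACS output bundled with ChIPpeakAnno.
macs <- system.file("extdata", "MACS_peaks.xls", package = "ChIPpeakAnno")
gr <- toGRanges(macs, format = "MACS")

class(gr)                          # class of the returned object
length(gr)                         # number of ranges (peaks)
head(as.character(seqnames(gr)))   # chromosome names
head(start(gr))                    # range starts
head(end(gr))                      # range ends
colnames(mcols(gr))                # metadata columns kept from the file
```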

Author(s)

Jianhong Ou

Examples


  macs <- system.file("extdata", "MACS_peaks.xls", package="ChIPpeakAnno")
  macsOutput <- toGRanges(macs, format="MACS")
  if(interactive() || Sys.getenv("USER")=="jianhongou"){
    ## MACS connection
    macs <- readLines(macs)
    macs <- textConnection(macs)
    macsOutput <- toGRanges(macs, format="MACS")
    close(macs)
    ## bed
    toGRanges(system.file("extdata", "MACS_output.bed", package="ChIPpeakAnno"),
                format="BED")
    ## narrowPeak
    toGRanges(system.file("extdata", "peaks.narrowPeak", package="ChIPpeakAnno"),
                format="narrowPeak")
    ## broadPeak
    toGRanges(system.file("extdata", "TAF.broadPeak", package="ChIPpeakAnno"),
                format="broadPeak")
    ## CSV
    toGRanges(system.file("extdata", "peaks.csv", package="ChIPpeakAnno"),
                format="CSV")
    ## MACS2
    toGRanges(system.file("extdata", "MACS2_peaks.xls", package="ChIPpeakAnno"),
                format="MACS2")
    ## GFF
    toGRanges(system.file("extdata", "GFF_peaks.gff", package="ChIPpeakAnno"),
                format="GFF")
    ## EnsDb
    library(EnsDb.Hsapiens.v75)
    toGRanges(EnsDb.Hsapiens.v75, feature="gene")
    ## TxDb
    library(TxDb.Hsapiens.UCSC.hg19.knownGene)
    toGRanges(TxDb.Hsapiens.UCSC.hg19.knownGene, feature="gene")
    ## data.frame
    macs <- system.file("extdata", "MACS_peaks.xls", package="ChIPpeakAnno")
    macs <- read.delim(macs, comment.char="#")
    toGRanges(macs)
  }


jianhong/ChIPpeakAnno documentation built on Jan. 4, 2025, 5:27 p.m.