knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(width=100)
The r Biocpkg("Rhisat2")
package provides an interface to the
HISAT2 short-read aligner
software [@Kim2015-mu].
Use the BiocManager
package to download and install the package from
Bioconductor as follows:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("Rhisat2")
If required, the latest development version of the package can also be installed from GitHub.
BiocManager::install("fmicompbio/Rhisat2")
Once the package is installed, load it into your R session:
library(Rhisat2)
To build an index for the alignment, use the hisat2_build
function. You need
to provide one or more fasta file with reference sequences, as well as an output
directory where the index will be stored, and a "prefix" (that will determine
the name of the index files in the output directory). Any additional arguments
to the hisat2-build
binary can also be supplied to the function (see
hisat2_build_usage()
or
https://ccb.jhu.edu/software/hisat2/manual.shtml#the-hisat2-build-indexer
for an overview of the available options).
The package contains three example reference sequences, corresponding to short
pieces of three different chromosomes. We show how to create an index (named
myindex
) based on these sequences.
list.files(system.file("extdata/refs", package="Rhisat2"), pattern="\\.fa$") refs <- list.files(system.file("extdata/refs", package="Rhisat2"), full.names=TRUE, pattern="\\.fa$") td <- tempdir() hisat2_build(references=refs, outdir=td, prefix="myindex", force=TRUE, strict=TRUE, execute=TRUE)
By instead setting execute=FALSE
in the command above, hisat2_build()
will
return the index building shell command for inspection, without executing it.
print(hisat2_build(references=refs, outdir=td, prefix="myindex", force=TRUE, strict=TRUE, execute=FALSE))
After creating the index, reads can be aligned using the hisat2
wrapper
function. Most commonly, the reads will be provided in fastq
files (one file
for single-end reads, two files for paired-end reads). The names of these files
can be provided directly to the hisat2
function, either as a vector (for
single-end reads) or as a list of length 2 (for paired-end reads, each list
element corresponds to one mate). You also need to provide the path to the index
generated by hisat2_build
(the output directory combined with the prefix), and
the output file name where the alignments should be written.
list.files(system.file("extdata/reads", package="Rhisat2"), pattern="\\.fastq$") reads <- list.files(system.file("extdata/reads", package="Rhisat2"), pattern="\\.fastq$", full.names=TRUE) hisat2(sequences=as.list(reads), index=file.path(td, "myindex"), type="paired", outfile=file.path(td, "out.sam"), force=TRUE, strict=TRUE, execute=TRUE)
As for hisat2_build()
, any additional arguments to the hisat2
binary can be
provided to the hisat2()
function (see hisat2_usage()
or
https://ccb.jhu.edu/software/hisat2/manual.shtml#running-hisat2
for an overview of the available options).
With HISAT2, you can provide a file with known splice sites at the alignment
step, which can help in finding the correct alignments across known splice
junctions. A text file in the required format can be generated using the
extract_splice_sites()
function, starting from an annotation file in gtf or
gff3 format, or from a GRanges
or TxDb
object.
spsfile <- tempfile() gtf <- system.file("extdata/refs/genes.gtf", package="Rhisat2") extract_splice_sites(features=gtf, outfile=spsfile) hisat2(sequences=as.list(reads), index=file.path(td, "myindex"), type="paired", outfile=file.path(td, "out_sps.sam"), `known-splicesite-infile`=spsfile, force=TRUE, strict=TRUE, execute=TRUE)
To get the version of HISAT2
:
hisat2_version()
To see the usage of hisat2_build()
:
hisat2_build_usage()
And similarly to see the usage of hisat2()
:
hisat2_usage()
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.