align_target: Align microbiome reads to a set of reference libraries


align_target R Documentation



This is the main MetaScope target library mapping function, using Rsubread and multiple libraries. It aligns the reads to each library separately, filters unmapped reads from each resulting file, and then merges and sorts the .bam files from all libraries into one output file. If desired, the output can be passed to filter_host() to remove reads that also map to the genomes of a filter library.
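The two-step workflow described above (align to the target libraries, then optionally filter against a second set of libraries) might be wrapped as in the sketch below. This is only an illustration: `align_then_filter`, `target_libs`, `filter_libs`, and `index_dir` are hypothetical names, and the `filter_host()` argument names used here (`reads_bam`, `lib_dir`, `libs`) are an assumption that should be checked against ?filter_host.

```r
# Sketch only. Assumes library(MetaScope) has been called and that Subread
# indices for both sets of libraries already exist in `index_dir`.
# `align_then_filter` and its parameter names are hypothetical.
align_then_filter <- function(reads, target_libs, filter_libs, index_dir) {
  # Step 1: align to every target library; the merged .bam filename is returned
  target_bam <- align_target(read1 = reads,
                             lib_dir = index_dir,
                             libs = target_libs)

  # Step 2: remove reads that also map to the filter libraries
  # (argument names assumed, see ?filter_host)
  filter_host(reads_bam = target_bam,
              lib_dir = index_dir,
              libs = filter_libs)
}
```

Defining the wrapper does not run any alignment; it would only do so when called with real read and index paths.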


align_target(
  read1,
  read2 = NULL,
  lib_dir = NULL,
  libs,
  threads = 1,
  align_file = tools::file_path_sans_ext(read1),
  subread_options = align_details,
  quiet = TRUE
)
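The `align_file` default in the usage above is computed with `tools::file_path_sans_ext()`. The base R sketch below shows what that call yields for a couple of file names; the paths are hypothetical, and whether a compressed input suits a given run is not addressed here.

```r
# Base R only: how a default output basename is derived from a read file name.
read1 <- file.path("data", "sample1.fastq")   # hypothetical input path
default_align_file <- tools::file_path_sans_ext(read1)

# For a doubly-suffixed name, only the last extension is dropped by default
gz_basename <- tools::file_path_sans_ext("sample1.fastq.gz")

# With compression = TRUE, the .gz suffix is stripped first as well
gz_basename_full <- tools::file_path_sans_ext("sample1.fastq.gz",
                                              compression = TRUE)
```

If the derived basename is not the one wanted, it can be set explicitly through `align_file`.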



read1: Path to the .fastq file to align.


read2: Optional: Location of the mate pair .fastq file to align.


lib_dir: Path to the index files for all libraries.


libs: A vector of character strings giving the basenames of the Subread index files for alignment. If all indices to be used are located in the current working directory, set lib_dir = NULL (the default).


threads: The number of threads that can be utilized by the function. Default is 1 thread.


align_file: The basename of the output alignment file (without trailing .bam extension).


subread_options: A named list specifying alignment parameters for the Rsubread::align() function, which is called inside align_target(). Elements should include type, nthreads, maxMismatches, nsubreads, phredOffset, unique, and nBestLocations. Descriptions of these parameters are available under ?Rsubread::align. Defaults to the global align_details object.


quiet: Turns off most messages. Default is TRUE.
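As an illustration of the named list expected for the alignment parameters, the base R sketch below builds one with the elements listed above and overrides two of them. The starting values are placeholders chosen for this example, not the contents of the package's align_details object, and `base_options` and `custom_options` are hypothetical names.

```r
# Placeholder values for illustration only (NOT the package defaults)
base_options <- list(type = "dna",
                     nthreads = 1,
                     maxMismatches = 3,
                     nsubreads = 10,
                     phredOffset = 33,
                     unique = FALSE,
                     nBestLocations = 16)

# Override selected elements without touching the rest
custom_options <- utils::modifyList(base_options,
                                    list(type = "rna", maxMismatches = 5))

# Check that every element named in the documentation is present
required <- c("type", "nthreads", "maxMismatches", "nsubreads",
              "phredOffset", "unique", "nBestLocations")
missing_elements <- setdiff(required, names(custom_options))
```

Copying and modifying a list this way leaves the original object unchanged, which avoids surprises when the same defaults are reused for several alignments.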


This function aligns the reads to all of the given reference libraries and writes a merged and sorted .bam file, along with a summary report file, to the user's working directory. It also returns the name of the new .bam file.


#### Align example reads to an example reference library using Rsubread

library(MetaScope)

## Create temporary directory
target_ref_temp <- tempfile()
dir.create(target_ref_temp)

tax <- "Ovine atadenovirus D"

## Locate the example taxonomizr accession database shipped with MetaScope
tmp_accession <- system.file("extdata", "example_accessions.sql",
                             package = "MetaScope")

## Download genome
all_ref <- MetaScope::download_refseq(tax,
                                      reference = FALSE,
                                      representative = FALSE,
                                      compress = TRUE,
                                      out_dir = target_ref_temp,
                                      caching = TRUE,
                                      accession_path = tmp_accession)

## Create subread index
ind_out <- mk_subread_index(all_ref)

## Get path to example reads
readPath <- system.file("extdata", "reads.fastq",
                        package = "MetaScope")
## Copy the example reads to the temp directory
refPath <- file.path(target_ref_temp, "reads.fastq")
file.copy(from = readPath, to = refPath)

## Modify alignment parameters object
align_details[["type"]] <- "rna"
align_details[["phredOffset"]] <- 50
# Just to get it to align - toy example!
align_details[["maxMismatches"]] <- 100

## Run alignment
target_map <- align_target(refPath,
                           libs = stringr::str_replace_all(tax, " ", "_"),
                           lib_dir = target_ref_temp,
                           subread_options = align_details)

## Remove temporary folder
unlink(target_ref_temp, recursive = TRUE)
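In the example above, the value passed to `libs` is the taxon name with spaces replaced by underscores, which presumably matches the basename of the index created earlier in the example. A base R equivalent of that `stringr::str_replace_all()` call, shown only as a sketch:

```r
# Base R equivalent of stringr::str_replace_all(tax, " ", "_")
tax <- "Ovine atadenovirus D"
lib_name <- gsub(" ", "_", tax, fixed = TRUE)
lib_name
```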

compbiomed/MetaScope documentation built on Jan. 16, 2025, 10:23 p.m.