View source: R/make_cutadapt.R
make_cutadapt | R Documentation |
make_cutadapt
cutadapt/fastq_quality_filter
make_cutadapt(input, output, parse = NULL, threads = 1)
input |
Character path to a directory containing input fastq-files. The script will recursively search this directory for the .fastq|.fastq.gz extension. |
output |
Character path to the output directory where trimmed fastq files will be stored and temporary files will be generated. |
parse |
List with two character string expressions. The first is passed to cutadapt and the second is passed to fastq_quality_filter. If either is NULL, the corresponding command is not run and that trimming or filtering step is skipped. Thus, if parse = list(cutadapt=NULL, fastq_quality_filter="-q 20 -p 80"), then only the quality filter will be applied. |
threads |
Integer stating the number of parallel jobs. Note that
reading multiple fastq files drains memory fast, using up to 10 GB per fastq
file. To avoid crashing the system due to memory shortage, make sure that
each thread on the machine has at least 10 GB of memory available, unless
your fastq files are very small. Use |
Given a path to sequence files in fastq format, this function trims adaptors and removes low-quality sequences.
The function writes trimmed and/or quality-filtered fastq files to the output folder. It returns a list of logs that can be used to generate a progress report.
https://cutadapt.readthedocs.io/en/stable/ for download and documentation on cutadapt. http://hannonlab.cshl.edu/fastx_toolkit/commandline.html for download and documentation on fastq_quality_filter. https://github.com/Danis102 for updates on seqpac.
Other PAC generation: PAC_check(), create_PAC(), make_PAC(), make_counts(), make_pheno(), make_trim(), merge_lanes()
############################################################
### Principle of trimming using the make_cutadapt function
### (Important: Need external installations of cutadapt
### and fastq_quality_filter to work)
#
# input = "/some/path/to/input/folder"
# output = "/some/path/to/output/folder"
#
## Parse for make_cutadapt is a list of 2 character string expressions.
## The first is passed to cutadapt and the other to fastq_quality_filter.
## For parallel processes '-j 1' is recommended since seqpac will
## parallelize across samples and not within.
## Run system2("cutadapt", "-h", stdout=TRUE) and
## system2("fastq_quality_filter", "-h", stdout=TRUE)
## for more options.
#
## String to parse to cutadapt:
# cut_prs <- paste0("-j 1 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACAT",
# " --discard-untrimmed --nextseq-trim=20",
# " -O 10 -m 7 -M 70")
#
## Add string to parse to fastq_quality_filter:
# parse = list(
# cutadapt=cut_prs,
# fastq_quality_filter="-q 20 -p 80")
#
# logs <- make_cutadapt(input, output, threads=8, parse=parse)
# Clean up temp
closeAllConnections()
fls_temp <- list.files(tempdir(), recursive=TRUE, full.names = TRUE)
# file.remove() has no showWarnings argument; suppress warnings explicitly
suppressWarnings(file.remove(fls_temp))
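
The description of the parse argument says that setting either element to NULL skips that step. The following is a minimal sketch of how such lists could be built, together with a check that the two external programs are reachable. The object names parse_filter_only, parse_trim_only, tools and missing_tools are illustrative and not part of seqpac, and the call to make_cutadapt is left commented out because it needs real input and output paths.

```r
# Quality filtering only: cutadapt is NULL, so no adaptor trimming is done.
parse_filter_only <- list(cutadapt = NULL,
                          fastq_quality_filter = "-q 20 -p 80")

# Adaptor trimming only: fastq_quality_filter is NULL.
parse_trim_only <- list(
  cutadapt = "-j 1 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACAT -m 7",
  fastq_quality_filter = NULL)

# Sys.which() returns "" for programs that are not found on the PATH.
tools <- Sys.which(c("cutadapt", "fastq_quality_filter"))
missing_tools <- names(tools)[!nzchar(tools)]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}

# input and output are placeholder paths, as in the example above.
# logs <- make_cutadapt(input, output, threads = 2, parse = parse_filter_only)
```

Note that list() keeps elements that are NULL, so both lists still have two named elements; assigning NULL afterwards with `$<-` would instead delete the element.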