recipeMake: recipeMake

View source: R/recipeMake.R

recipeMake R Documentation

recipeMake

Description

Constructor function for a data recipe.

Usage

recipeMake(
  shscript = character(),
  paramID = c(),
  paramType = c(),
  outputID = c(),
  outputType = c("File[]"),
  outputGlob = character(0),
  requireTools = character(0)
)

Arguments

shscript

Character string. Either the file path to a user-provided shell script or the script content itself, to be converted into a data recipe.

paramID

Character vector. The user-specified parameter IDs for the recipe.

paramType

Character vector specifying the type of each paramID. A single parameter can take multiple types, given as a list. Valid values include "int" for integer, "boolean" for boolean, "float" for numeric, "string" for character string, "File" for a file path, and "File[]" for an array of files. "double", "long", "null", and "Directory" are also accepted. See Details.

outputID

The ID for each output.

outputType

The type of each output. Defaults to "File[]".

outputGlob

The glob pattern that matches the output files, e.g., "hg19.*".

requireTools

The command-line tools used for data processing/curation in the user-provided shell script. Each value must exactly match the tool name, e.g., "bwa", "samtools". A particular version of a tool can be specified in the format "tool=version", e.g., "samtools=1.3".
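As a rough illustration of the outputGlob argument, the following shell sketch shows which files a pattern such as "hg19.*" selects, assuming the pattern behaves like ordinary shell-style globbing. The file names and the temporary directory are hypothetical and serve only as a demonstration.

```sh
# Sketch only: hypothetical file names in a throwaway directory.
set -eu
workdir=$(mktemp -d)
touch "$workdir/hg19.fa" "$workdir/hg19.fa.fai" "$workdir/hg38.fa"

# Collect the base names of everything matched by the pattern "hg19.*".
matches=""
for f in "$workdir"/hg19.*; do
  matches="$matches${matches:+ }$(basename "$f")"
done

# hg38.fa is not matched because it does not start with "hg19.".
echo "$matches"
```

A pattern that is too broad (for example "*") would also pick up intermediate files, so it is usually worth keeping the pattern as specific as the script's output names allow.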

Details

More details on parameter types can be found at https://www.commonwl.org/v1.2/CommandLineTool.html#CWLType.

recipeMake is a convenience function for wrapping a shell script into a data recipe (a cwlProcess S4 object). Use Rcwl::cwlProcess for more options and functionality, especially when the recipe gets complicated, e.g., when a command-line tool needs a Docker image or one parameter takes multiple types. See this recipe for an example: https://github.com/rworkflow/ReUseDataRecipe/blob/master/reference_genome.R
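To see what a recipe wraps, the script from example 1 can also be run by hand. The sketch below is not part of ReUseData. It assumes that the values assigned to the paramID entries reach the script as positional arguments in the order given (input as $1, outfile as $2), as example 1 suggests. The file name echo_recipe.sh and the temporary directory are made up for illustration.

```sh
# Sketch only: run the example 1 script directly, outside of any recipe.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Same body as the `script` string in example 1.
cat > echo_recipe.sh <<'EOF'
input=$1
outfile=$2
echo "Print the input: $input" > $outfile.txt
EOF

# Assumed mapping: $1 = input, $2 = outfile (mirrors the paramID order).
sh echo_recipe.sh "Hello World!" outfile

# The script writes outfile.txt, which outputGlob = "*.txt" would match.
result=$(cat outfile.txt)
echo "$result"
```

Testing a script this way before wrapping it makes it easier to tell script errors apart from recipe configuration errors.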

Value

A data recipe of S4 class cwlProcess, containing all details about the shell script for data processing/curation, its inputs and outputs, the required tools, and the corresponding Docker files. It can be passed directly to getData(), which evaluates the included shell script and generates the data locally. Find more details with ?Rcwl::cwlProcess.

Examples

## Not run: 
library(Rcwl)
library(ReUseData)
##############
### example 1
##############

script <- "
input=$1
outfile=$2
echo \"Print the input: $input\" > $outfile.txt
"
rcp <- recipeMake(shscript = script,
                  paramID = c("input", "outfile"),
                  paramType = c("string", "string"),
                  outputID = "echoout",
                  outputGlob = "*.txt")
inputs(rcp)
outputs(rcp)
rcp$input <- "Hello World!"
rcp$outfile <- "outfile"
res <- getData(rcp, outdir = tempdir(),
               notes = c("echo", "hello", "world", "txt"),
               showLog = TRUE)
readLines(res$output)

##############
### example 2
##############

shfile <- system.file("extdata", "gencode_transcripts.sh", package = "ReUseData")
readLines(shfile)
rcp <- recipeMake(shscript = shfile,
                  paramID = c("species", "version"),
                  paramType = c("string", "string"),
                  outputID = "transcripts",
                  outputGlob = "*.transcripts.fa*",
                  requireTools = c("wget", "gzip", "samtools"))
Rcwl::inputs(rcp)
rcp$species <- "human"
rcp$version <- "42"
res <- getData(rcp, outdir = tempdir(),
               notes = c("gencode", "transcripts", "human", "42"),
               showLog = TRUE)
res$output
dir(tempdir())

## End(Not run)

rworkflow/ReUseData documentation built on Dec. 7, 2023, 11 p.m.