runMeme | R Documentation |
MEME performs de-novo discovery of ungapped motifs present in the input sequences. It can be used in both discriminative and non-discriminative modes.
runMeme(
input,
control = NA,
outdir = "auto",
alph = "dna",
parse_genomic_coord = TRUE,
combined_sites = FALSE,
silent = TRUE,
meme_path = NULL,
...
)
## S3 method for class 'list'
runMeme(
input,
control = NA,
outdir = "auto",
alph = "dna",
parse_genomic_coord = TRUE,
combined_sites = FALSE,
silent = TRUE,
meme_path = NULL,
...
)
## S3 method for class 'BStringSetList'
runMeme(
input,
control = NA,
outdir = "auto",
alph = "dna",
parse_genomic_coord = TRUE,
combined_sites = FALSE,
silent = TRUE,
meme_path = NULL,
...
)
## Default S3 method:
runMeme(
input,
control = NA,
outdir = "auto",
alph = "dna",
parse_genomic_coord = TRUE,
combined_sites = FALSE,
silent = TRUE,
meme_path = NULL,
...
)
input |
path to fasta, Biostrings::BStringSet list, or list of
Biostrings::BStringSet (can generate using |
control |
any data type as in |
outdir |
(default: "auto") Directory where output data will be stored. |
alph |
one of c("dna", "rna", "protein") or path to alphabet file (default: "dna"). |
parse_genomic_coord |
|
combined_sites |
|
silent |
Whether to suppress printing stdout to terminal (default: TRUE) |
meme_path |
path to "meme/bin/". If unset, will use default search behavior:
|
... |
additional arguments passed to MEME (see below) |
Note that MEME can take a long time to run. Runtime increases drastically with the number of input sequences, the width of the motifs searched for, and the number of motifs MEME is asked to discover. For this reason, MEME usually performs best on a few (<50) short (100-200 bp) sequences, although this is not a requirement. Additional details on how data size affects runtime can be found here.
MEME works best when specifically tuned to the analysis question. The default settings are unlikely to be ideal. It has several complex arguments documented here, which runMeme() accepts as R function arguments (see details below).
If discovering motifs within ChIP-seq, ATAC-seq, or similar peaks, MEME may perform best when using sequences flanking the summit (the site of maximum signal) of each peak rather than the center. ChIP-seq or similar data can also benefit from setting revcomp = TRUE, minw = 5, maxw = 20. For more tips on using MEME to analyze ChIP-seq data, see the following tips page.
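As a minimal sketch of the advice above, the following computes 100 bp windows centred on peak summits and then passes the suggested ChIP-seq settings to runMeme(). The peaks table, its column names, and the 50 bp flank are hypothetical values chosen for illustration. The sequences given to runMeme() are the stand-in sequences from the Examples section, not sequences extracted from the windows, and the call only runs if memes and MEME are installed.

```r
# Hypothetical peak table: one row per peak, with the summit position (bp).
peaks <- data.frame(
  chr    = c("chr1", "chr1", "chr2"),
  start  = c(1000L, 5000L, 200L),
  end    = c(1400L, 5300L, 900L),
  summit = c(1100L, 5250L, 450L)
)

# Build a 100 bp window around each summit (50 bp on either side),
# rather than around the midpoint of the peak.
flank <- 50L
windows <- data.frame(
  chr   = peaks$chr,
  start = peaks$summit - flank,
  end   = peaks$summit + flank - 1L
)
windows$width <- windows$end - windows$start + 1L

# Stand-in sequences (same as the Examples section). In a real analysis
# these would be the sequences extracted for `windows`.
if (requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()) {
  seqs <- universalmotif::create_sequences("CCRAAAW", seqnum = 4)
  names(seqs) <- seq_along(seqs)
  res <- memes::runMeme(
    seqs,
    parse_genomic_coord = FALSE,
    revcomp = TRUE,  # also search the reverse complement strand
    minw = 5,        # minimum motif width
    maxw = 20        # maximum motif width
  )
}
```

Narrowing maxw and keeping the windows short should also help with the runtime concerns noted earlier.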
runMeme() accepts all valid arguments to meme as arguments passed to ... . For flags without values, pass them as flag = TRUE. The dna, rna, and protein flags should instead be passed to the alph argument of runMeme(). The arguments passed to MEME often have many interactions with each other. For a detailed description of each argument, see the MEME Commandline Documentation.
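To make the flag convention concrete, here is a small illustration of how named R arguments could map onto MEME command-line flags. The helper to_meme_flags() is hypothetical and written only for this explanation. It is not part of memes and is not how memes builds its command internally.

```r
# Illustrative only: a hypothetical helper, NOT the memes implementation.
# Converts named R arguments into a character vector of command-line flags.
to_meme_flags <- function(...) {
  args <- list(...)
  flags <- character(0)
  for (name in names(args)) {
    value <- args[[name]]
    if (isTRUE(value)) {
      # Flags without values: flag = TRUE becomes "-flag"
      flags <- c(flags, paste0("-", name))
    } else if (!isFALSE(value)) {
      # Flags with values: name = value becomes "-name value"
      flags <- c(flags, paste0("-", name), as.character(value))
    }
    # flag = FALSE is dropped entirely
  }
  flags
}

to_meme_flags(revcomp = TRUE, minw = 5, maxw = 20, nmotifs = 3)
```

Under this sketch, runMeme(seqs, revcomp = TRUE, nmotifs = 3) would correspond roughly to calling meme with -revcomp -nmotifs 3 on the command line.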
MEME results in universalmotif_df format (see: universalmotif::to_df()). sites_hits is a nested data.frame column containing the position within each input sequence of matches to the identified motif.
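A minimal sketch of inspecting the returned value, assuming memes and MEME are installed and reusing the sequences from the Examples section. It assumes sites_hits behaves as a list column holding one data.frame per motif, as the description above suggests. The exact columns inside each nested data.frame are not listed here, so the code only prints them.

```r
if (requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()) {
  seqs <- universalmotif::create_sequences("CCRAAAW", seqnum = 4)
  names(seqs) <- seq_along(seqs)
  res <- memes::runMeme(seqs, parse_genomic_coord = FALSE)

  # One row per discovered motif (may be zero if MEME finds nothing).
  print(nrow(res))

  # Assumption: sites_hits is a list column, so [[1]] extracts the
  # nested data.frame of match positions for the first motif.
  if (nrow(res) > 0) {
    print(res$sites_hits[[1]])
  }
}
```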
If you use runMeme() in your analysis, please cite:
Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.
The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.
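if (meme_is_installed()) {
  # Generate four random example sequences
  seqs <- universalmotif::create_sequences("CCRAAAW", seqnum = 4)
  # Sequences must be named; use their index as the name
  names(seqs) <- seq_along(seqs)
  # Names are not genomic coordinates, so disable coordinate parsing
  runMeme(seqs, parse_genomic_coord = FALSE)
}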