findSeedMatches | R Documentation |
'findSeedMatches' takes a set of sequences and a set of miRNAs (given either
as target seeds, mature miRNA sequences, or a 'KdModelList') and scans the
sequences for matches.
findSeedMatches(
seqs,
seeds,
shadow = 0L,
onlyCanonical = FALSE,
maxLogKd = c(-1, -1.5),
keepMatchSeq = FALSE,
minDist = 7L,
p3.extra = FALSE,
p3.params = list(maxMirLoop = 7L, maxTargetLoop = 9L, maxLoopDiff = 4L,
mismatch = TRUE, GUwob = TRUE),
agg.params = .defaultAggParams(),
ret = c("GRanges", "data.frame", "aggregated"),
BP = NULL,
verbose = NULL,
n_seeds = NULL,
useTmpFiles = FALSE,
keepTmpFiles = FALSE
)
seqs |
A character vector or 'DNAStringSet' of DNA sequences in which to look. |
seeds |
A character vector of 7-nt seeds to look for. If RNA, will be reversed and complemented before matching. If DNA, they are assumed to be the target sequence to look for. Alternatively, a list of objects of class 'KdModel' or an object of class 'KdModelList' can be given. |
shadow |
Integer giving the shadow, i.e. the number of nucleotides hidden at the beginning of the sequence (default 0). |
onlyCanonical |
Logical; whether to restrict the search only to canonical binding sites. |
maxLogKd |
Maximum log_kd value to keep. This has a major impact on the number of sites returned, and hence on the memory requirements. Set to Inf to disable (_not_ recommended when running large scans!). |
keepMatchSeq |
Logical; whether to keep the sequence (including flanking dinucleotides) for each seed match (default FALSE). |
minDist |
Integer specifying the minimum distance between matches of the same miRNA (default 7). Matches closer than this are reduced to the one with the highest affinity. To disable the removal of overlapping features, use 'minDist=-Inf'. |
p3.extra |
Logical; whether to keep extra information about 3' alignment (default FALSE). Keep this disabled when running large scans; otherwise you might hit your system's memory limits. |
p3.params |
Named list of parameters for 3' alignment with slots 'maxMirLoop' (integer, default = 7), 'maxTargetLoop' (integer, default = 9), 'maxLoopDiff' (integer, default = 4), 'mismatch' (logical, default = TRUE) and 'GUwob' (logical, default = TRUE). |
agg.params |
A named list with slots 'a', 'b', 'c', 'p3', 'coef_utr', 'coef_orf' and 'keepSiteInfo' indicating the parameters for the aggregation. Ignored if 'ret!="aggregated"'. For further details see documentation of 'aggregateMatches'. |
ret |
The type of data to return, either "GRanges" (default), "data.frame", or "aggregated" (aggregates affinities/sites for each seed-transcript pair). |
BP |
Pass 'BiocParallel::MulticoreParam(ncores, progressbar=TRUE)' to enable multithreading. |
verbose |
Logical; whether to print additional progress messages (enabled by default when not multithreading). |
n_seeds |
Integer; the number of seeds that are processed in parallel to avoid memory issues. |
useTmpFiles |
Logical; whether to write results for single miRNAs in temporary files (ignored when scanning for a single seed). Alternatively, 'useTmpFiles' can be a character vector of length 1 indicating the path to the directory in which to write temporary files. |
keepTmpFiles |
Logical; whether to keep the temporary files at the end of the process; ignored if 'useTmpFiles=FALSE'. Temporary files are removed only upon successful completion of the function, meaning that they will not be deleted in case of errors. |
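The 'seeds' argument above states that RNA seeds are reversed and complemented before matching. A minimal base-R sketch of that transformation is shown here for illustration only: 'seedToTarget' is a hypothetical helper, not a function of the package, and the package's own implementation may differ.

```r
# Hypothetical helper (not part of the package): turn an RNA seed into the
# DNA target sequence that would be searched for, by reverse-complementing it.
seedToTarget <- function(seed) {
  dna  <- chartr("U", "T", toupper(seed))   # RNA -> DNA alphabet
  comp <- chartr("ACGT", "TGCA", dna)       # complement each base
  # reverse the complemented string
  paste(rev(strsplit(comp, "", fixed = TRUE)[[1]]), collapse = "")
}

target <- seedToTarget("AAACCUU")
print(target)
```

A seed that is already given as DNA is, per the argument description, taken as the target sequence itself and would not go through this step.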
A GRanges of all matches. If 'seeds' is a 'KdModel' or 'KdModelList', the 'log_kd' column will report the ln(Kd) multiplied by 1000, rounded and saved as an integer. If 'ret!="GRanges"', returns a data.frame.
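Because 'log_kd' is stored as ln(Kd) multiplied by 1000 and rounded to an integer, it has to be divided by 1000 before being interpreted. The sketch below shows that arithmetic in base R; the values in 'log_kd_int' are made up for illustration and do not come from a real scan.

```r
# Made-up integer values, standing in for a 'log_kd' column
log_kd_int <- c(-1500L, -3200L, -4750L)

# Undo the x1000 scaling to recover ln(Kd)
ln_kd <- log_kd_int / 1000

# Exponentiate to get Kd on the linear scale
kd <- exp(ln_kd)

# Re-applying the documented encoding gives back the stored integers
encoded <- as.integer(round(ln_kd * 1000))
print(data.frame(log_kd_int, ln_kd, kd, encoded))
```

Lower (more negative) values correspond to smaller Kd, that is, to higher affinity.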
# we create mock RNA sequences and seeds:
seqs <- getRandomSeq(n=10)
seeds <- c("AAACCAC", "AAACCUU")
findSeedMatches(seqs, seeds)
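As a further sketch using only arguments documented above, the same scan can be asked to return a data.frame instead of a GRanges. This assumes that the scanMiR package, which is taken here to provide 'findSeedMatches' and 'getRandomSeq', is installed; the value passed to 'set.seed' is arbitrary.

```r
# Assumption: scanMiR is installed and exports the functions used below
library(scanMiR)

set.seed(123)  # arbitrary, only to make the random sequences reproducible
seqs  <- getRandomSeq(n = 10)
seeds <- c("AAACCAC", "AAACCUU")

# Return a data.frame and keep the matched sequence for each site
res <- findSeedMatches(seqs, seeds, ret = "data.frame", keepMatchSeq = TRUE)
head(res)
```

Requesting a data.frame can be convenient for quick inspection, while the default GRanges output is better suited to downstream range-based operations.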