View source: R/findPatternPos.R
findPatternPos | R Documentation |
This function finds all occurrences of a nucleotide pattern in the sequence.
For each occurrence, the function returns the index of the middle
nucleotide, computed as: ceiling(length(pattern) / 2). The function
supports both the plus and minus DNA strands; for the minus strand, each
pattern is converted to its complementary sequence.
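The middle-index arithmetic and the minus-strand complement can be illustrated with a small base R sketch. The helper names `middle_offset` and `complement_pattern` are hypothetical and not part of the package. The sketch assumes patterns are plain character strings, so `nchar()` stands in for the pattern length, and that "complementary" means a base-by-base complement without reversal.

```r
# Hypothetical helper: 1-based offset of the middle nucleotide within a pattern
middle_offset <- function(pattern) ceiling(nchar(pattern) / 2)

# Hypothetical helper: base-by-base complement (assumed: no reversal)
complement_pattern <- function(pattern) chartr("ACGT", "TGCA", pattern)

off3 <- middle_offset("ACG")    # odd length: the true centre
off4 <- middle_offset("ACGT")   # even length: the left of the two central bases

# If an occurrence starts at index `start`, its middle nucleotide sits at
# start + offset - 1 in sequence coordinates
start <- 5
middle_index <- start + off3 - 1

comp <- complement_pattern("ACG")
```

For even-length patterns the ceiling picks the left of the two central positions, which is why lengths 3 and 4 share the same offset.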
findPatternPos(patterns, sequence, strand)
patterns |
A list of nucleotide permutations of length |
sequence |
A |
strand |
A character indicating the plus ("+") or minus ("-") strand. |
This function uses stringi::stri_locate_all_fixed().
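As a rough illustration of how stringi::stri_locate_all_fixed() can yield middle-nucleotide indices, consider the sketch below. The toy sequence and pattern are made up, and whether the package itself sets `overlap` or `omit_no_match` this way is an assumption.

```r
library(stringi)

# Toy inputs, for illustration only
sequence <- "ACGACGTACG"
pattern <- "ACG"

# Locate every fixed-string occurrence; overlap = TRUE also reports
# occurrences that share nucleotides (assumed setting)
hits <- stri_locate_all_fixed(sequence, pattern,
                              overlap = TRUE, omit_no_match = TRUE)[[1]]

# Shift each start position to the middle nucleotide of the occurrence
starts <- hits[, "start"]
middle <- starts + ceiling(nchar(pattern) / 2) - 1
```

With `omit_no_match = TRUE`, a pattern that never occurs gives a zero-row matrix, so `middle` is simply empty rather than `NA`.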
This function aims to assist with addressing sequence bias in structure probing data. The sequence in the neighbourhood of a nucleotide is assumed to have an effect on its structural state. By considering sequence patterns of a certain length (specified by the user), this function finds indices of the middle nucleotide of each pattern's occurrences within the sequence. We then separately analyse the nucleotides occurring in the middle of each pattern, taking into account sequence dependency.
This function returns a list where each component corresponds to a pattern
(indicated by the field names) and contains indices of the middle
nucleotides of that pattern's occurrences within the sequence.
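The shape of the returned list can be sketched with a simplified base R stand-in. `find_middle_positions` is a hypothetical function written for illustration, not the package implementation. It handles only plain character input on the plus strand, and it uses a regular-expression lookahead in place of stringi.

```r
# Hypothetical stand-in for illustration only; not the package implementation
find_middle_positions <- function(patterns, sequence) {
  result <- lapply(patterns, function(p) {
    # Zero-width lookahead so that overlapping occurrences are also reported
    m <- gregexpr(paste0("(?=", p, ")"), sequence, perl = TRUE)[[1]]
    starts <- as.integer(m)
    starts <- starts[starts > 0]  # gregexpr returns -1 when there is no match
    # Shift each start position to the middle nucleotide
    starts + as.integer(ceiling(nchar(p) / 2)) - 1L
  })
  # Name each component after its pattern
  names(result) <- unlist(patterns)
  result
}

res <- find_middle_positions(list("AA", "ACG", "TTT"), "AAACGAACG")
res$AA        # middle indices for occurrences of "AA"
res[["ACG"]]  # middle indices for occurrences of "ACG"
res$TTT       # empty: the pattern does not occur
```

Naming the components after the patterns lets the indices for a given pattern be looked up directly, for example to subset a per-nucleotide vector of measurements.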
The function raises the following errors:
"Strand should be either plus or minus, specified with a sign." if strand is not specified as "+" or "-";
"The sequence should be non-empty." if the provided sequence is empty;
"The list of patterns should be non-empty." if the list of patterns to search for in the sequence is empty.
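Checks of this kind could look roughly like the following base R sketch. `check_inputs` is a hypothetical helper. The order of the checks and the use of `nchar()` on a plain character sequence are assumptions. Only the three message strings come from the documentation above.

```r
# Hypothetical validator reproducing the three documented messages
check_inputs <- function(patterns, sequence, strand) {
  if (!(strand %in% c("+", "-"))) {
    stop("Strand should be either plus or minus, specified with a sign.")
  }
  if (nchar(sequence) == 0) {
    stop("The sequence should be non-empty.")
  }
  if (length(patterns) == 0) {
    stop("The list of patterns should be non-empty.")
  }
  invisible(TRUE)
}

# Capture the message instead of aborting, to show what a caller would see
get_message <- function(expr) {
  tryCatch(expr, error = function(e) conditionMessage(e))
}

msg_strand   <- get_message(check_inputs(list("ACG"), "ACGT", "plus"))
msg_sequence <- get_message(check_inputs(list("ACG"), "", "+"))
msg_patterns <- get_message(check_inputs(list(), "ACGT", "-"))
```

Wrapping the call in `tryCatch()` is useful in scripts that process many sequences and should skip invalid inputs rather than stop.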
Alina Selega, Sander Granneman, Guido Sanguinetti
Selega et al. "Robust statistical modeling improves sensitivity of high-throughput RNA structure probing experiments", Nature Methods (2016).
See also nuclPerm.
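library(SummarizedExperiment)
## Assumption: findPatternPos(), nuclPerm() and the example object se
## are provided by the BUMHMM package
library(BUMHMM)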
## Extract the DNA sequence from se
sequence <- subject(rowData(se)$nucl)
## Generate patterns of length 3
n <- 3
patterns <- nuclPerm(n)
## Find positions of pattern occurrences
nuclPosition <- findPatternPos(patterns, sequence, '+')