View source: R/map_rangetype.R
map_rangetype | R Documentation |
map_rangetype
Classifies sequences based on interval mapping against a
reference.
map_rangetype(
  map,
  type = "percent",
  ss = NULL,
  min_loop_width = 4,
  intervals = list(start = 1:5, mid = 45:55, end = 95:100),
  N_include = FALSE
)
map |
PAC_map object (generated by PAC_mapper). |
type |
Character indicating what type of intervals is provided. If type="nucleotides", the interval list is given as ranges of nucleotide positions. For example, if intervals=list(start=1:3, end=1:3), the function will classify sequences starting within the first three nucleotides of the reference as 'type_start_nuc' and sequences ending within the last three nucleotides of the reference as 'type_end_nuc'. If type="percent", intervals needs to be provided as percent ranges. For example, if intervals=list(start=1:5, mid=45:50, end=95:100), the function will classify sequences starting within the first 5% of the reference as 'type_start_per' and sequences ending within the last 5% of the reference as 'type_end_per'. It will also classify sequences starting within 45-50% as 'type_mid_start_per' and sequences ending within 45-50% as 'type_mid_end_per'. If type="ss", intervals are obtained from an ss file, available for example from tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE/) or GtRNAdb (http://gtrnadb.ucsc.edu/). Importantly, the intervals list is name sensitive. If type="nucleotides", intervals can only contain two intervals named 'start' and 'end', while if type="percent", intervals needs to contain three intervals named 'start', 'mid' and 'end'. Hint: to classify 5' and 3' half tsRNA you need to run the function twice. First, classify each sequence as 5'-start or 3'-end tsRNA using type="nucleotides", then rerun the map object using type="percent", specifying the 'mid' region as the half interval. |
ss |
File path to an ss file (character), a readLines vector of an ss file
(character), or an ss list. If a character string is given, the function
first attempts to read a file from that path. If this fails, the
function assumes that the ss file has already been read using
readLines. |
min_loop_width |
Integer setting the minimum number of nucleotides for a
loop. Only applicable when type="ss". Loops in ss-files are defined by ">"
followed by x number of "." ending with "<". For example: |
intervals |
A named list with integer intervals. |
N_include |
Logical indicating whether N "wild card" nucleotides should be
counted in the terminals. This conveniently controls the N_up and N_down
arguments in the PAC_mapper function. |
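To make the min_loop_width setting more concrete, the base R sketch below looks for candidate loops in a dot-bracket string, taking a loop to be ">" followed by at least min_loop_width "." characters and ending with "<", as described for that argument. The helper `find_loops` and the string `ss_string` are made up for illustration and are not part of seqpac; the parser used internally by map_rangetype may differ.

```r
# Hypothetical helper (not part of seqpac): locate loops in a
# dot-bracket structure string from an ss file.
find_loops <- function(ss_string, min_loop_width = 4) {
  # ">" followed by min_loop_width or more "." and then "<"
  pattern <- paste0(">\\.{", min_loop_width, ",}<")
  hits <- gregexpr(pattern, ss_string)[[1]]
  if (hits[1] == -1) {
    # No loop of sufficient width was found
    return(data.frame(start = integer(0), end = integer(0)))
  }
  # Report the positions of the dots only, skipping the flanking > and <
  data.frame(start = as.integer(hits) + 1L,
             end = as.integer(hits) + attr(hits, "match.length") - 2L)
}

# Made-up structure string with one 4-nt and one 7-nt loop
ss_string <- ">>>..>>>>....<<<<.>>>>.......<<<<..<<<"
loops <- find_loops(ss_string, min_loop_width = 4)
print(loops)
```

Raising min_loop_width in this sketch drops the shorter loop, which mirrors how a stricter setting would ignore short unpaired stretches between stems.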
Given a PAC_map object (generated by PAC_mapper) and an interval list, this
function will attempt to classify mapped sequences based on where they
start and end in the reference. This function can, for example, be used
for 5' and 3' tRNA classification.
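As a rough illustration of the percent-based logic, the base R sketch below converts the start and end positions of one alignment to percentages of the reference length and checks them against the interval list. The helper `classify_percent` and its output names are hypothetical and not part of seqpac; in particular, the rounding used here (ceiling) is an assumption, and map_rangetype may treat boundaries differently.

```r
# Hypothetical helper (not part of seqpac): test where one alignment
# starts and ends, expressed as percent of the reference length.
classify_percent <- function(start, end, ref_len,
                             intervals = list(start = 1:5, mid = 45:55,
                                              end = 95:100)) {
  # Convert nucleotide positions to percent of reference (rounded up;
  # this rounding rule is an assumption)
  start_pct <- ceiling(100 * start / ref_len)
  end_pct   <- ceiling(100 * end / ref_len)
  c(starts_in_start = start_pct %in% intervals$start,
    starts_in_mid   = start_pct %in% intervals$mid,
    ends_in_mid     = end_pct %in% intervals$mid,
    ends_in_end     = end_pct %in% intervals$end)
}

# A fragment covering positions 1-36 of a 72-nt reference
res <- classify_percent(start = 1, end = 36, ref_len = 72)
print(res)
```

A fragment like this one, starting in the 'start' interval and ending in the 'mid' interval, corresponds to what the argument description calls a 5' half.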
Map list object containing the reference sequence (Ref_seq) as a Biostrings::DNAStringSet and the new classifications embedded with the alignments (Alignments) in a data frame.
https://github.com/Danis102 for updates on the current package.
Other PAC analysis: PAC_covplot(), PAC_deseq(), PAC_filter(), PAC_filtsep(), PAC_gtf(), PAC_jitter(), PAC_mapper(), PAC_nbias(), PAC_norm(), PAC_pca(), PAC_pie(), PAC_saturation(), PAC_sizedist(), PAC_stackbar(), PAC_summary(), PAC_trna(), as.PAC(), filtsep_bin(), tRNA_class()
###########################################################
### test the map_rangetype function
# More complicated examples can be found in the vignette.
##----------------------------------------
library(seqpac)

# First create an annotation blank PAC with group means
load(system.file("extdata", "drosophila_sRNA_pac_filt_anno.Rdata",
                 package = "seqpac", mustWork = TRUE))
anno(pac) <- anno(pac)[, 1, drop = FALSE]
pac_trna <- PAC_summary(pac, norm = "cpm", type = "means",
                        pheno_target = list("stage"), merge_pac = TRUE)

# Then re-annotate only tRNA using the PAC_mapper function
ref <- system.file("extdata/trna", "tRNA.fa",
                   package = "seqpac", mustWork = TRUE)
map_object <- PAC_mapper(pac_trna, ref = ref, N_up = "NNN", N_down = "NNN",
                         mismatches = 0, threads = 2, report_string = TRUE,
                         override = TRUE)

## Coverage plot of tRNA using PAC_covplot
# Single tRNA targeting a summary table
PAC_covplot(pac_trna, map = map_object,
            summary_target = list("cpmMeans_stage"),
            map_target = "tRNA-Ala-AGC-1-1")

## Classify range types with map_rangetype (see vignette for examples
# on how to use ss-files for detailed tRNA loop structure).
# Classify fragments using percent intervals
map_object <- map_rangetype(map_object,
                            intervals = list(start = 1:5, mid = 45:55,
                                             end = 95:100))

# Inspect the names and the first entry of the classified map object
names(map_object)
map_object[[1]]