The "Spaced Words Projection (SWeeP)" is a method for representing biological sequences as compact vectors. SWeeP applies the spaced-words concept: it scans sequences and generates indices to build a high-dimensional matrix of occurrences, which is then projected onto a smaller, randomly oriented orthonormal base (PIERRI, 2019). The resulting matrix preserves the comparative information while having a selectable size.
xfas: An AAStringSet or a FASTA-format file
baseMatrix: An orthonormal matrix with 160,000 coordinates
The SWeeP method was developed to facilitate comparison between complete proteomic sequences and to assist machine learning analyses. It is based on the concept of spaced words, which are used to scan biological sequences and project them into a matrix of occurrences that is easier to manipulate. The sWeeP function returns an n-by-m matrix, where n is the number of sequences in the analyzed xfas and m is the number of columns in the baseMatrix matrix.
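The idea described above can be illustrated with a minimal, language-agnostic sketch in Python. It is not the rSWeeP implementation: the function names (`spaced_word_counts`, `orthonormal_base`, `project`), the mask `11011`, the way spaced words are turned into indices, and the tiny projection size of 3 columns are all assumptions chosen for illustration. With four kept positions over 20 amino acids, the occurrence vector has 20^4 = 160,000 coordinates, which matches the size expected of baseMatrix.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
MASK = "11011"  # assumed mask: 4 kept positions -> 20**4 = 160,000 indices


def spaced_word_counts(seq, mask=MASK):
    """Return a sparse {index: count} map of the spaced words found in seq."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    code = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    counts = {}
    for start in range(len(seq) - len(mask) + 1):
        window = seq[start:start + len(mask)]
        if any(window[i] not in code for i in keep):
            continue  # skip windows containing non-standard residues
        index = 0
        for i in keep:  # base-20 encoding of the kept residues (an assumption)
            index = index * len(AMINO_ACIDS) + code[window[i]]
        counts[index] = counts.get(index, 0) + 1
    return counts


def orthonormal_base(n_rows, n_cols, seed=0):
    """Build n_cols random orthonormal columns of length n_rows (Gram-Schmidt)."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_cols):
        v = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
        for u in columns:  # remove the components along earlier columns
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        columns.append([a / norm for a in v])
    return columns


def project(seqs, base_columns, mask=MASK):
    """Project each sequence's occurrence vector onto the base (n x m result)."""
    rows = []
    for s in seqs:
        counts = spaced_word_counts(s, mask)
        rows.append([sum(c * col[i] for i, c in counts.items())
                     for col in base_columns])
    return rows


# Two made-up peptide sequences that differ at a single position.
seqs = ["MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRNISFVKSHFSRQ"]
base = orthonormal_base(len(AMINO_ACIDS) ** MASK.count("1"), 3)
vectors = project(seqs, base)
print(len(vectors), len(vectors[0]))  # rows = sequences, columns = base columns
```

The result has one row per input sequence and one column per base column, mirroring the n-by-m shape described for sWeeP. A realistic base would have far more than 3 columns; the small value here only keeps the pure-Python Gram-Schmidt step fast.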
A matrix resulting from the projection of the sequences in xfas onto the baseMatrix matrix.
Danrley R. Fernandes.
Pierri, C. R. et al. SWeeP: Representing large biological sequences data sets in compact vectors. Scientific Reports, accepted in December 2019. doi: 10.1038/s41598-019-55627-4.