lfmSingleSequence: Show significant clusters of mutations of every gene in a...

lfmSingleSequence R Documentation

Show significant clusters of mutations of every gene in a LowMACA object without alignment

Description

The method lfmSingleSequence (low frequency mutations in Single Sequence) launches the lfm method on every gene or domain inside a LowMACA object without aligning the sequences.

Usage

lfmSingleSequence(object, metric='qvalue', threshold=.05,
    conservation=0.1,
    BPPARAM=bpparam("SerialParam"),
    mail=NULL,
    perlCommand="perl",
    verbose=FALSE)

Arguments

object

a LowMACA class object

metric

a character that defines whether to use 'pvalue' or 'qvalue' to select significant positions. Default: 'qvalue'

threshold

a numeric element between 0 and 1 defining the threshold of significance for the defined metric. Default: 0.05

conservation

a numeric value in the range of 0-1 that defines the threshold of trident conservation score to include the specified position. Default: 0.1

BPPARAM

An object of class BiocParallelParam-class specifying parameters related to the parallel execution of some of the tasks and calculations within this function. See function register from the BiocParallel package.

mail

if not NULL, it must be a valid email address, which is required to use the EBI clustalo web service. Default: NULL, which uses a local clustalo installation

perlCommand

a character string containing the path to the Perl executable. If missing, "perl" is used as default. Only used in web mode

verbose

logical. Whether to print verbose output. Default: FALSE

Details

This function completes a LowMACA analysis by analyzing every gene or domain in the LowMACA object as if a 'single sequence' analysis had been started in the first place. The result is a data.frame showing all the significant positions of every gene. If you have a LowMACA object composed of 100 genes, it launches 100 LowMACA single gene analyses and aggregates the results of every lfm launched on these 100 objects.

The output looks very similar to that of lfm, but in this case the column Multiple_Aln_pos has a different meaning. While in lfm it shows where the mutation falls in the consensus sequence, here it should be read as the position in the consensus within the gene. If the original LowMACA object had mode equal to 'gene', the column Multiple_Aln_pos will always be equal to Amino_Acid_Position. If mode is 'pfam', the two columns are also equal unless a gene harbors more than one domain of the same type within its sequence. In that case, an internal alignment of every domain inside the protein is performed.
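The "one analysis per gene, then aggregate" pattern described above can be sketched in base R. This is only an illustration under stated assumptions: the mutation table is invented, and toy_single_gene is a hypothetical stand-in that keeps positions hit at least twice, whereas the real lfm method relies on a statistical test. It does not call any LowMACA function.

```r
# Toy mutation table; gene names and positions are made up for illustration.
muts <- data.frame(
  Gene_Symbol = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  Amino_Acid_Position = c(10L, 10L, 25L, 7L, 7L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a single-gene analysis: keep positions hit at
# least min_hits times. The real method uses a statistical test instead.
toy_single_gene <- function(gene_muts, min_hits = 2L) {
  hits <- table(gene_muts$Amino_Acid_Position)
  keep <- as.integer(names(hits)[hits >= min_hits])
  out <- gene_muts[gene_muts$Amino_Acid_Position %in% keep, , drop = FALSE]
  # With no cross-gene alignment (as in 'gene' mode), the two columns agree.
  out$Multiple_Aln_pos <- out$Amino_Acid_Position
  out
}

# Run one analysis per gene, then stack the per-gene results.
per_gene <- lapply(split(muts, muts$Gene_Symbol), toy_single_gene)
aggregated <- do.call(rbind, per_gene)
rownames(aggregated) <- NULL
print(aggregated)
```

Splitting by gene before the analysis is what makes the results independent across genes: a position is judged only against the mutations of its own gene.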

Value

A data.frame with 13 columns corresponding to the mutations retrieved:

  1. Gene_Symbol gene symbols of the analyzed genes

  2. Amino_Acid_Position amino acid positions relative to the original protein

  3. Amino_Acid_Change amino acid changes in HGVS format

  4. Sample Sample barcode where the mutation was found

  5. Tumor_Type Tumor type of the Sample

  6. Envelope_Start start of the pfam domain in the protein

  7. Envelope_End end of the pfam domain in the protein

  8. Multiple_Aln_pos positions in the consensus relative to the sequence analyzed. See the Details section

  9. Entrez Entrez IDs of the mutated genes

  10. Entry Uniprot entry of the protein

  11. UNIPROT other protein names for Uniprot

  12. Chromosome cytobands of the genes

  13. Protein.name extended protein names
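Because the return value is an ordinary data.frame, it can be summarized with base R. The sketch below uses a mock table that carries only a few of the columns listed above, with invented values, so it runs without LowMACA; on a real result the same calls would apply to the object returned by lfmSingleSequence.

```r
# Mock result with a subset of the documented columns; all values are invented.
res <- data.frame(
  Gene_Symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_A"),
  Amino_Acid_Position = c(10L, 10L, 7L, 25L),
  Tumor_Type = c("brca", "luad", "brca", "brca"),
  stringsAsFactors = FALSE
)

# Number of reported mutations per gene, most mutated first.
per_gene <- sort(table(res$Gene_Symbol), decreasing = TRUE)
print(per_gene)

# Number of reported mutations at each gene/position pair.
per_position <- aggregate(
  data.frame(n_mutations = res$Tumor_Type),
  by = list(Gene_Symbol = res$Gene_Symbol,
            Amino_Acid_Position = res$Amino_Acid_Position),
  FUN = length
)
print(per_position)
```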

Author(s)

Stefano de Pretis, Giorgio Melloni

See Also

lfm

Examples

#Load homeobox example
data(lmObj)
#Run lfmSingleSequence
significant_muts <- lfmSingleSequence(lmObj)
#Show the result 
head(significant_muts)
#Show all the genes that harbor significant mutations without the alignment
unique(significant_muts$Gene_Symbol)

ste-depo/LowMACA documentation built on Oct. 15, 2022, 11:53 p.m.