buildBackgroundModel: Build background models for DAU tests

View source: R/buildBackgroundModel.R

buildBackgroundModel    R Documentation

Build background models for DAU tests

Description

Builds background models for testing differential amino acid usage (DAU).

Usage

buildBackgroundModel(
  dagPeptides,
  background = c("wholeProteome", "inputSet", "nonInputSet"),
  model = c("any", "anchored"),
  targetPosition = c("any", "Nterminus", "Cterminus"),
  uniqueSeq = FALSE,
  numSubsamples = 300L,
  rand.seed = 1,
  replacement = FALSE,
  testType = c("ztest", "fisher"),
  proteome
)

Arguments

dagPeptides

An object of dagPeptides-class containing peptide sequences as the input set.

background

A character vector with options: "wholeProteome", "inputSet", and "nonInputSet", indicating what set of peptide sequences should be considered to generate a background model.

model

A character vector with options: "any" and "anchored", indicating whether an anchoring position should be applied to generate a background model.

targetPosition

A character vector with options: "any", "Nterminus" and "Cterminus", indicating which part of protein sequences of choice should be used to generate a background model.

uniqueSeq

A logical vector indicating whether only unique peptide sequences are included in a background model for sampling.

numSubsamples

An integer, the number of random subsamples to draw.

rand.seed

An integer, the seed used for random sampling.

replacement

A logical vector of length 1, indicating whether replacement is allowed for random sampling.

testType

A character vector of length 1. Available options are "ztest" and "fisher".

proteome

An object of class Proteome, the output of prepareProteome.
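
The choice-style arguments (background, model, targetPosition, testType) list their allowed values in the Usage signature. The sketch below shows how such arguments conventionally resolve in R, assuming they are handled with match.arg(); this page does not confirm that. resolve_choices is a hypothetical helper written for illustration and is not part of dagLogo.

```r
# Hypothetical helper mirroring the choice-style arguments of
# buildBackgroundModel(); it is NOT part of dagLogo.
resolve_choices <- function(
    background = c("wholeProteome", "inputSet", "nonInputSet"),
    model = c("any", "anchored"),
    targetPosition = c("any", "Nterminus", "Cterminus"),
    testType = c("ztest", "fisher")) {
  # match.arg() returns the first option when the argument is left at its
  # default and raises an error for values outside the allowed set.
  background <- match.arg(background)
  model <- match.arg(model)
  targetPosition <- match.arg(targetPosition)
  testType <- match.arg(testType)
  list(background = background, model = model,
       targetPosition = targetPosition, testType = testType)
}

# All arguments left at their defaults: the first option of each is used.
defaults <- resolve_choices()

# Explicit choices override the defaults.
custom <- resolve_choices(background = "inputSet", testType = "fisher")

# A value outside the allowed set is rejected.
invalid <- tryCatch(resolve_choices(model = "fixed"),
                    error = function(e) "error")
```

Under this assumption, omitting background, model, targetPosition, and testType would be equivalent to passing "wholeProteome", "any", "any", and "ztest".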

Details

The background can be generated from wholeProteome, inputSet, or nonInputSet.

Case 1: If background = "wholeProteome" and model = "any": The background set is composed of randomly selected subsequences from the whole proteome, each of the same length as the input sequences.

Case 2: If background = "wholeProteome" and model = "anchored": The background set is composed of randomly selected subsequences from the whole proteome, each of the same length as the input sequences. Additionally, the amino acid at each anchoring position must be the same as that defined in the dagPeptides object, such as "K" for lysine.

Case 3: If background = "inputSet" and model = "any": Similar to Case 1, but the background model is built from the full-length protein sequences matching the protein sequence IDs in the inputSet, after excluding the subsequences specified in the inputSet from those full-length sequences.

Case 4: If background = "inputSet" and model = "anchored": Similar to Case 2, but the background model is built from the full-length protein sequences matching the protein sequence IDs in the inputSet, after excluding the subsequences specified in the inputSet from those full-length sequences.

Case 5: If background = "nonInputSet" and model = "any": The background set is composed of randomly selected subsequences from the whole proteome, excluding the sequences corresponding to the inputSet sequences, each of the same length as the input sequences.

Case 6: If background = "nonInputSet" and model = "anchored": Similar to Case 5, but the amino acid at each anchoring position must be the same as that defined in the dagPeptides object, such as "K" for lysine.
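
The sampling idea behind Cases 1 and 2 can be sketched in base R on a toy proteome. This illustrates the concept only and is not dagLogo's implementation: toy_proteome, sample_background, and all of its parameter names are hypothetical, and the real function operates on Proteome and dagPeptides objects rather than plain character vectors.

```r
# Toy "proteome": a named character vector of made-up protein sequences.
toy_proteome <- c(
  P1 = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
  P2 = "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEG"
)

# Hypothetical helper: enumerate every window of `width` residues,
# optionally keep only windows with `anchor_aa` at `anchor_pos`,
# then draw `n` of them at random.
sample_background <- function(proteome, width, n, anchor_aa = NULL,
                              anchor_pos = NULL, replacement = FALSE,
                              seed = 1) {
  windows <- unlist(lapply(proteome, function(p) {
    starts <- seq_len(nchar(p) - width + 1L)
    substring(p, starts, starts + width - 1L)
  }), use.names = FALSE)
  if (!is.null(anchor_aa)) {
    # Anchored model: the residue at the anchoring position must match.
    windows <- windows[substr(windows, anchor_pos, anchor_pos) == anchor_aa]
  }
  set.seed(seed)  # a fixed seed makes the draw reproducible
  sample(windows, size = n, replace = replacement)
}

# Analogue of Case 1: any fixed-length subsequence may be drawn.
bg_any <- sample_background(toy_proteome, width = 7L, n = 10L)

# Analogue of Case 2: only windows with "K" at position 4 are eligible.
bg_anchored <- sample_background(toy_proteome, width = 7L, n = 2L,
                                 anchor_aa = "K", anchor_pos = 4L)
```

In this sketch, drawing without replacement requires n to be no larger than the number of eligible windows, which is why the anchored call asks for fewer subsequences than the unanchored one.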

Value

An object of dagBackground-class.

Author(s)

Jianhong Ou, Haibo Liu

Examples

library(dagLogo)

## read the example input peptides shipped with dagLogo
dat <- unlist(read.delim(system.file("extdata", "grB.txt",
                                     package = "dagLogo"),
                         header = FALSE, as.is = TRUE))

## prepare an object of Proteome class from a FASTA file
proteome <- prepareProteome(fasta = system.file("extdata",
                                                "HUMAN.fasta",
                                                package = "dagLogo"),
                            species = "Homo sapiens")

## prepare an object of dagPeptides class
seq <- formatSequence(seq = dat, proteome = proteome, upstreamOffset = 14,
                      downstreamOffset = 15)

## build a whole-proteome background model for each test type
bg_fisher <- buildBackgroundModel(seq, background = "wholeProteome",
                                  proteome = proteome, testType = "fisher")
bg_ztest <- buildBackgroundModel(seq, background = "wholeProteome",
                                 proteome = proteome, testType = "ztest")

jianhong/dagLogo documentation built on Nov. 5, 2024, 7:46 a.m.