preprocess: A preprocessing method for objects of class GSCA or NWA
In HTSanalyzeR: Gene set over-representation, enrichment and network analyses for high-throughput screens


This is a generic function.

When implemented as the S4 method for objects of class GSCA or NWA, this function filters out invalid data, removes duplicated genes, converts annotations to Entrez identifiers, etc.

To use this function for objects of class GSCA:

preprocess(object, species="Dm", initialIDs="FlybaseCG", keepMultipleMappings=TRUE, duplicateRemoverMethod="max", orderAbsValue=FALSE, verbose=TRUE)

To use this function for objects of class NWA:

preprocess(object, species="Dm", initialIDs="FlybaseCG", keepMultipleMappings=TRUE, duplicateRemoverMethod="max", verbose=TRUE)

preprocess(object, ...)

`object`	an object. When this function is implemented as the S4 method of class `GSCA` or `NWA`, this argument is an object of class `GSCA` or `NWA`.
`...`	other arguments depending on class (see below for the arguments supported by class `GSCA` and/or `NWA`)

species:: a single character value specifying the species for which the data should be read. The current version supports one of the following species: "Dm" ("Drosophila_melanogaster"), "Hs" ("Homo_sapiens"), "Rn" ("Rattus_norvegicus"), "Mm" ("Mus_musculus"), "Ce" ("Caenorhabditis_elegans"). This is an optional argument here. If it is provided, then the labels of nodes of the identified subnetwork will be mapped from Entrez IDs to gene symbols; otherwise, Entrez IDs will be used as labels for those nodes.
initialIDs:: a single character value specifying the type of initial identifiers for the input 'geneList'. The current version accepts one of the following types: "Ensembl.transcript", "Ensembl.prot", "Ensembl.gene", "Entrez.gene", "RefSeq", "Symbol" and "GenBank" for all supported species; additionally "Flybase", "FlybaseCG" and "FlybaseProt" for Drosophila melanogaster; and additionally "wormbase" for Caenorhabditis elegans.
keepMultipleMappings:: a single logical value. If 'TRUE', the function keeps the entries with multiple mappings (first mapping is kept). If 'FALSE', the entries with multiple mappings will be discarded.
duplicateRemoverMethod:: a single character value specifying the method used to remove duplicates (whether the minimum, maximum or average observation for the same construct should be kept). The current version provides "min" (minimum), "max" (maximum), "average" and "fc.avg" (fold change average). The minimum and maximum are taken in terms of absolute values (i.e. min/max effect, regardless of sign). The fold change average method converts the fold changes to ratios, averages them and converts the average back to a fold change.
orderAbsValue:: a single logical value indicating whether the values should be converted to absolute values and then ordered (if TRUE), or ordered as they are (if FALSE). This argument is only for class GSCA.
verbose:: a single logical value specifying whether to display detailed messages (verbose=TRUE) or not (verbose=FALSE)
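As a rough illustration of the duplicateRemoverMethod options, the base R sketch below collapses repeated names in a named numeric vector. The helper dedupSketch and the toy vector scores are hypothetical and are not the package's duplicateRemover. In particular, the "fc.avg" branch assumes that a negative fold change such as -2 encodes the ratio 1/2, which may differ from the package's own convention.

```r
## Hypothetical helper for illustration only; not the package's duplicateRemover.
dedupSketch <- function(x, method = c("max", "min", "average", "fc.avg")) {
  method <- match.arg(method)
  collapse <- switch(method,
    ## "max"/"min": keep the observation with the largest/smallest absolute value
    max = function(v) unname(v[which.max(abs(v))]),
    min = function(v) unname(v[which.min(abs(v))]),
    average = function(v) mean(v),
    fc.avg = function(v) {
      ## Assumption: a negative fold change -k stands for the ratio 1/k
      ratios <- ifelse(v < 0, 1 / abs(v), v)
      m <- mean(ratios)
      ## Convert the averaged ratio back to a signed fold change
      if (m < 1) -1 / m else m
    })
  ## Group observations by name and collapse each group to one value
  vapply(split(x, names(x)), collapse, numeric(1))
}

## Toy data with duplicated names
scores <- c(geneA = -3, geneB = 1.5, geneA = 2, geneB = -0.5)
dedupSketch(scores, "max")
dedupSketch(scores, "min")
dedupSketch(scores, "average")
```

With "max", geneA keeps -3 rather than 2 because the comparison is made on absolute values while the original sign is retained.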

This function will do the following preprocessing steps:

1:: filter out p-values (the slot pvalues of class NWA), phenotypes (the slot phenotypes of class NWA) and data for enrichment (the slot geneList of class GSCA) with NA values or without valid names, and invalid gene names (the slot hits of class GSCA);
2:: invoke function duplicateRemover to remove duplicated genes in the slots pvalues and phenotypes of class NWA, and in the slots geneList and hits of class GSCA;
3:: invoke function annotationConvertor to convert annotations from initialIDs to Entrez identifiers. Please note that the slot hits and the names of the slot geneList of class GSCA, the names of the slot pvalues and the names of the slot phenotypes of class NWA must have the same type of gene annotation specified by initialIDs;
4:: order the data for enrichment decreasingly for objects of class GSCA.

See the function duplicateRemover for more details about how to remove duplicated genes.

See the function annotationConvertor for more details about how to convert annotations.
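The four steps above can be mimicked on a plain named vector in base R. The sketch below is only a toy stand-in: preprocessSketch, the lookup vector idMap and the identifiers in it are all hypothetical, duplicates are always collapsed with the "max" rule, and the real method operates on the slots of GSCA or NWA objects through duplicateRemover and annotationConvertor.

```r
## Toy stand-in for the preprocessing steps; every name here is hypothetical.
preprocessSketch <- function(geneList, idMap, orderAbsValue = FALSE) {
  ## Step 1: drop NA values and entries without a usable name
  nms <- names(geneList)
  keep <- !is.na(geneList) & !is.na(nms) & nms != ""
  geneList <- geneList[keep]

  ## Step 2: collapse duplicated names, keeping the largest absolute value
  geneList <- vapply(split(geneList, names(geneList)),
                     function(v) unname(v[which.max(abs(v))]),
                     numeric(1))

  ## Step 3: convert identifiers with a lookup vector; unmapped entries are dropped
  mapped <- unname(idMap[names(geneList)])
  geneList <- geneList[!is.na(mapped)]
  names(geneList) <- mapped[!is.na(mapped)]

  ## Step 4: order decreasingly, optionally on absolute values
  if (orderAbsValue) geneList <- abs(geneList)
  sort(geneList, decreasing = TRUE)
}

## Made-up input: one NA value, one unnamed entry, one duplicated name
raw <- c(CG1 = 2.5, CG2 = NA, CG3 = -4, CG1 = -3, 0.7, CG9 = 1)
## Made-up mapping from initial identifiers to target identifiers
idMap <- c(CG1 = "101", CG3 = "103")

preprocessSketch(raw, idMap)
preprocessSketch(raw, idMap, orderAbsValue = TRUE)
```

In this toy run CG2 and the unnamed entry are removed in step 1, the two CG1 observations collapse to -3 in step 2, and CG9 is dropped in step 3 because it has no entry in idMap.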

In the end, this function will return an updated object of class GSCA or NWA.

Xin Wang xw264@cam.ac.uk

duplicateRemover, annotationConvertor

## Not run: 
library(HTSanalyzeR)
library(org.Dm.eg.db)
library(KEGG.db)
##load data for enrichment analyses
data("KcViab_Data4Enrich")
##select hits: genes whose absolute score exceeds 2
hits <- names(KcViab_Data4Enrich)[which(abs(KcViab_Data4Enrich) > 2)]
##set up a list of gene set collections
PW_KEGG <- KeggGeneSets(species = "Dm")
gscList <- list(PW_KEGG = PW_KEGG)
##create an object of class 'GSCA'
gsca <- new("GSCA", listOfGeneSetCollections = gscList,
    geneList = KcViab_Data4Enrich, hits = hits)
##print a summary of the gene list and hits before preprocessing
summarize(gsca, what = c("GeneList", "Hits"))
##do preprocessing (KcViab_Data4Enrich has already been preprocessed)
gsca <- preprocess(gsca, species = "Dm", initialIDs = "Entrez.gene",
    keepMultipleMappings = TRUE, duplicateRemoverMethod = "max",
    orderAbsValue = FALSE)
##print a summary of the updated object
summarize(gsca, what = c("GeneList", "Hits"))

## End(Not run)

HTSanalyzeR documentation built on Oct. 31, 2019, 7:10 a.m.
