phenoDisco | R Documentation |
phenoDisco algorithm.
phenoDisco is a semi-supervised iterative approach to detect new protein clusters.
phenoDisco(
object,
fcol = "markers",
times = 100,
GS = 10,
allIter = FALSE,
p = 0.05,
ndims = 2,
modelNames = mclust.options("emModelNames"),
G = 1:9,
BPPARAM,
tmpfile,
seed,
verbose = TRUE,
dimred = c("PCA", "t-SNE"),
...
)
object |
An instance of class |
fcol |
A |
times |
Number of runs of tracking. Default is 100. |
GS |
Group size, i.e. how many proteins make a group. Default is 10 (the minimum group size is 4). |
allIter |
|
p |
Significance level for outlier detection. Default is 0.05. |
ndims |
Number of principal components to use as input for the discovery analysis. Default is 2. Added in version 1.3.9. |
modelNames |
A vector of characters indicating the models to
be fitted in the EM phase of clustering using
|
G |
An integer vector specifying the numbers of mixture
components (clusters) for which the BIC is to be
calculated. The default is |
BPPARAM |
Support for parallel processing using the
|
tmpfile |
An optional |
seed |
An optional |
verbose |
Logical, indicating if messages are to be printed out during execution of the algorithm. |
dimred |
A |
... |
Additional arguments passed to the dimensionality
reduction method. For both PCA and t-SNE, the data is scaled
and centred by default, and these parameters ( |
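As a rough illustration of what the ndims and dimred = "PCA" settings imply, the sketch below scales and centres a data matrix and keeps the first ndims principal components. It uses base R's prcomp as a stand-in and is not taken from the package internals; the matrix x and its row and column names are hypothetical.

```r
## Sketch only: base R stand-in for the PCA pre-processing step,
## not the pRoloc implementation.
set.seed(1)

## Hypothetical quantitation matrix: 50 proteins (rows) x 8 fractions (columns)
x <- matrix(rnorm(50 * 8), nrow = 50,
            dimnames = list(paste0("prot", 1:50), paste0("frac", 1:8)))

ndims <- 2  # same meaning as the ndims argument: components to keep

## Centre and scale the data, then project onto the principal components
pca <- prcomp(x, center = TRUE, scale. = TRUE)

## Keep only the first ndims components as input for downstream clustering
reduced <- pca$x[, seq_len(ndims), drop = FALSE]
dim(reduced)
```

Increasing ndims keeps more of the variance at the cost of fitting mixture models in a higher-dimensional space.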
The algorithm performs a phenotype discovery analysis as described in Breckels et al. With this approach one can identify putative subcellular groupings in organelle proteomics experiments in an unbiased fashion, for more comprehensive validation. The method is based on the work of Yin et al. and uses iterated rounds of Gaussian Mixture Modelling, fitted with the Expectation Maximisation algorithm, combined with a non-parametric outlier detection test to identify new phenotype clusters.
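The idea of testing whether an unlabelled protein belongs to a known class at significance level p can be sketched in base R. Note the substitution: the sketch uses a parametric chi-squared test on Mahalanobis distances, which is simpler than, and not the same as, the non-parametric test used by phenoDisco. All data and object names here are hypothetical.

```r
## Sketch only: a parametric chi-squared test on Mahalanobis distances is
## swapped in for the non-parametric outlier test used by phenoDisco.
set.seed(42)

## Hypothetical 2-D profiles of 30 marker proteins from one known class
known <- cbind(rnorm(30, mean = 0, sd = 1), rnorm(30, mean = 0, sd = 1))

## Hypothetical unlabelled proteins: two close to the class, one far away
unlabelled <- rbind(c(0.2, -0.1), c(-0.5, 0.4), c(8, 8))

p <- 0.05  # significance level, as in the p argument

## Summarise the known class by a single Gaussian (mean and covariance)
mu <- colMeans(known)
sigma <- cov(known)

## Squared Mahalanobis distance of each unlabelled protein to the class
d2 <- mahalanobis(unlabelled, center = mu, cov = sigma)

## Flag proteins whose distance exceeds the chi-squared cut-off at level p
is_outlier <- d2 > qchisq(1 - p, df = ncol(known))
is_outlier
```

In this picture, proteins flagged as outliers of every known class are the candidates from which new phenotype clusters could be formed.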
At least 2 labelled classes are required in the data, with a bare minimum of 6 markers per class, to run the algorithm. The function checks for and removes features with missing values using the filterNA method.
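Before a long run it can be worth checking these requirements by hand. The sketch below counts markers per class in a plain character vector; the vector markers is hypothetical, and it assumes unlabelled proteins are encoded as "unknown". On a real object the labels would come from the feature variable named by fcol.

```r
## Sketch only: check the "2 or more classes, 6 or more markers each"
## requirement on a hypothetical vector of marker labels.
markers <- c(rep("ER", 8), rep("Golgi", 6), rep("PM", 3), rep("unknown", 20))

min_markers <- 6  # minimum markers per class stated in the documentation
min_classes <- 2  # minimum number of labelled classes

## Count markers per labelled class, ignoring unlabelled proteins
counts <- table(markers[markers != "unknown"])

## Classes with enough markers, and whether there are enough such classes
ok_classes <- names(counts)[counts >= min_markers]
enough <- length(ok_classes) >= min_classes

counts
ok_classes
enough
```

Here the PM class has too few markers, so in practice one would add markers to it or remove its labels before running the analysis.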
A parallel implementation, relying on the BiocParallel
package, has been added in version 1.3.9. See the BPPARAM
argument for details.
Important: Prior to version 1.1.2 the row order in the output was different from the row order in the input. This has now been fixed and row ordering is now the same in both input and output objects.
An instance of class MSnSet containing the phenoDisco predictions.
Lisa M. Breckels <lms79@cam.ac.uk>
Yin Z, Zhou X, Bakal C, Li F, Sun Y, Perrimon N, Wong ST. Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens. BMC Bioinformatics. 2008 Jun 5;9:264. PubMed PMID: 18534020.
Breckels LM, Gatto L, Christoforou A, Groen AJ, Lilley KS and Trotter MWB. The Effect of Organelle Discovery upon Sub-Cellular Protein Localisation. J Proteomics. 2013 Aug 2;88:129-40. doi: 10.1016/j.jprot.2013.02.019. Epub 2013 Mar 21. PubMed PMID: 23523639.
## Not run:
library(pRolocdata)
data(tan2009r1)
pdres <- phenoDisco(tan2009r1, fcol = "PLSDA")
getPredictions(pdres, fcol = "pd", scol = NULL)
plot2D(pdres, fcol = "pd")
## to pre-process the data with t-SNE instead of PCA
pdres <- phenoDisco(tan2009r1, fcol = "PLSDA", dimred = "t-SNE")
## End(Not run)