The function performs the Rank Product (or Rank Sum) method to identify differentially expressed genes. It supports either a one-class or a two-class analysis, and it can also combine data from different studies (e.g. datasets generated by different laboratories).
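To make the idea behind the statistic concrete, here is a minimal base-R sketch of a rank product and a rank sum for a simplified one-class case. It is not the package's implementation (which, for example, handles two-class data, multiple origins and p-value estimation); the matrix `logratios` and its values are invented for illustration, and the package may scale the rank sum differently.

```r
# Toy log-ratio matrix: 5 genes (rows) x 3 replicates (columns).
# All values are made up for illustration.
logratios <- matrix(
  c( 2.1,  1.8,  2.5,
     0.1, -0.2,  0.3,
    -1.5, -1.9, -1.2,
     0.9,  1.1,  0.4,
    -0.3,  0.2, -0.1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("rep", 1:3))
)

# Rank genes within each replicate; rank 1 = most strongly up-regulated.
ranks_up <- apply(-logratios, 2, rank)

# Rank product: geometric mean of a gene's ranks across replicates.
rank_product <- exp(rowMeans(log(ranks_up)))

# Rank sum variant: plain sum of the ranks (an assumption about scaling).
rank_sum <- rowSums(ranks_up)

# Genes ordered from most to least consistently up-regulated.
ordered_genes <- names(sort(rank_product))
```

A gene that ranks near the top in every replicate gets a small rank product, which is what makes the statistic robust to a single outlying replicate.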
data |
the data set that should be analyzed. Every row of this dataset must correspond to a gene |
cl |
a vector containing the class labels of the samples. In the two class unpaired case, the label of a sample is either 0 (e.g., control group) or 1 (e.g., case group). For one class data, the label for each sample should be 1 |
origin |
a vector containing the origin labels of the samples. The label is the same for samples within one lab and different for samples from different labs. |
logged |
if "TRUE" data have been previously log transformed. Otherwise it should be set as "FALSE" |
na.rm |
if "FALSE", NA values are not used when computing ranks. If "TRUE" (default), missing values are replaced by the genewise median of the non-missing values. Genes with a number of missing values greater than "MinNumOfValidPairs" are still excluded from the analysis
gene.names |
if "NULL", no gene name will be attached to the outputs, otherwise it contains the vector of gene names |
plot |
if "TRUE", plot the estimated pfp vs the rank of each gene |
rand |
if specified, the random number generator will be put in a reproducible state |
calculateProduct |
if calculateProduct="TRUE" (default) the rank product method is performed. Otherwise the rank sum method is performed |
MinNumOfValidPairs |
a parameter that indicates the minimum number of NAs accepted for each gene. If it is set to NA (default), half the number of replicates is used
RandomPairs |
number of random pairs generated in the function. If set to NA (default), the odd integer closest to the square of the number of replicates is used
huge |
if "TRUE" not all the outputs are evaluated in order to save space |
fast |
if "FALSE" the exact p-values for the Rank Sum are evaluated for any size of the dataset. Otherwise (default), if the size of the dataset is too big, only the p-values that can be computed in "tail.time" minutes (starting from the tail) are evaluated with the exact method. The others are estimated with the Gaussian approximation. If calculateProduct="TRUE" this parameter is ignored |
tail.time |
the time (default 0.05 min) dedicated to evaluating the exact p-values for the Rank Sum. If calculateProduct="TRUE" this parameter is ignored.
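The genewise median replacement described for na.rm can be sketched in base R as follows. This only illustrates the description above, not the package's internal code: the helper `impute_genewise_median` and the matrix `expr` are hypothetical, and the package may apply the replacement per class or per origin rather than over the whole row.

```r
# Toy expression matrix with missing values (made-up numbers).
expr <- matrix(
  c(1.0,  NA, 3.0, 2.0,
    0.5, 0.7,  NA,  NA,
    2.2, 2.4, 2.0, 2.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)

# Hypothetical helper: replace each NA with the median of the
# non-missing values of the same gene (row).
impute_genewise_median <- function(x) {
  t(apply(x, 1, function(row) {
    row[is.na(row)] <- median(row, na.rm = TRUE)
    row
  }))
}

imputed <- impute_genewise_median(expr)

# Number of missing values per gene, e.g. for comparison with a threshold.
n_missing <- rowSums(is.na(expr))
```

Counting the missing values per gene before imputing, as in `n_missing`, is one way to see which genes a threshold such as MinNumOfValidPairs would affect.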
A summary of the results obtained by the Rank Product (or Rank Sum) method.
pfp |
estimated percentage of false positive predictions (pfp), for both up-regulated and down-regulated genes
pval |
estimated p-values for each gene being up- and down-regulated
RPs/RSs |
the Rank Product (or Rank Sum) statistics evaluated per each gene |
RPrank/RSrank |
rank of the Rank Product (or Rank Sum) of each gene in ascending order |
Orirank |
ranks obtained when considering each possible pairing. In this version of the package, this is not used to compute Rank Product (or Rank Sum), but it is kept for backward compatibility |
AveFC |
fold changes of average expressions (class1/class2). log fold-change if data has been log transformed, original fold change otherwise |
allrank1 |
fold change of class 1/class 2 under each origin. log fold-change if data has been log transformed, original fold change otherwise |
allrank2 |
fold change of class 2/class 1 under each origin. log fold-change if data has been log transformed, original fold change otherwise |
nrep |
total number of replicates |
groups |
vector of labels (as cl) |
RandomPairs_ranks |
a matrix containing the ranks evaluated for each RandomPair |
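As a rough illustration of how the pfp component might be used, the sketch below selects genes under a pfp cutoff. The matrix is a mock stand-in with invented values; its two column names follow the printed example output further down, and the 0.05 cutoff is an arbitrary choice rather than a package default. For real analyses the topGene function listed under See Also is the intended tool.

```r
# Mock stand-in for the pfp component of the result; values are invented.
pfp <- matrix(
  c(0.02, 0.95,
    0.40, 0.60,
    0.90, 0.01,
    0.04, 0.99),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4),
                  c("class1 < class2", "class1 > class2"))
)

cutoff <- 0.05  # illustrative threshold, not a package default

# Genes with a low estimated pfp in each direction.
up_in_class2   <- rownames(pfp)[pfp[, "class1 < class2"] <= cutoff]
down_in_class2 <- rownames(pfp)[pfp[, "class1 > class2"] <= cutoff]
```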
Francesco Del Carratore,
francesco.delcarratore@postgrad.manchester.ac.uk
Andris Jankevics, andris.jankevics@gmail.com
Breitling, R., Armengaud, P., Amtmann, A., and Herzyk, P. (2004) Rank Products: A simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments, FEBS Letters, 573:83-92
topGene
RP
RPadvance
plotRP
RankProducts
RSadvance
# Load the RankProd package and the data of Golub et al. (1999).
# data(golub) contains a 3051x38 gene expression
# matrix called golub, a vector of length 38 called golub.cl
# that consists of the 38 class labels,
# and a matrix called golub.gnames whose third column
# contains the gene names.
library(RankProd)
data(golub)
##For data with single origin
subset <- c(1:4,28:30)
origin <- rep(1,7)
#identify genes
RP.out <- RP.advance(golub[,subset],golub.cl[subset],
    origin,plot=FALSE,rand=123)
#For data from multiple origins
# Load the data arab in the package, which contains
# the expression of 22,081 genes
# of control and treatment group from the experiments
# independently conducted at two
# laboratories.
data(arab)
arab.origin #1 1 1 1 1 1 2 2 2 2
arab.cl #0 0 0 1 1 1 0 0 1 1
RP.adv.out <- RP.advance(arab,arab.cl,arab.origin,
gene.names=arab.gnames,logged=TRUE,rand=123)
attributes(RP.adv.out)
head(RP.adv.out$pfp)
head(RP.adv.out$RPs)
head(RP.adv.out$AveFC)
# Suppose we want to check the consistency of the data
# sets generated in two different
# labs. For example, we would look for genes that were
# measured to be up-regulated in
# class 2 at lab 1, but down-regulated in class 2 at lab 2.
data(arab)
arab.cl2 <- arab.cl
arab.cl2[arab.cl==0 &arab.origin==2] <- 1
arab.cl2[arab.cl==1 &arab.origin==2] <- 0
arab.cl2
##[1] 0 0 0 1 1 1 1 1 0 0
#look for genes differentially expressed
#between hypothetical class 1 and 2
arab.sub <- arab[1:500,] ## using a subset for fast computation
arab.gnames.sub <- arab.gnames[1:500]
Rsum.adv.out <- RP.advance(arab.sub,arab.cl2,arab.origin,
    calculateProduct=FALSE,logged=TRUE,gene.names=arab.gnames.sub,rand=123)
attributes(Rsum.adv.out)
Loading required package: Rmpfr
Loading required package: gmp
Attaching package: 'gmp'
The following objects are masked from 'package:base':
%*%, apply, crossprod, matrix, tcrossprod
C code of R package 'Rmpfr': GMP using 64 bits per limb
Attaching package: 'Rmpfr'
The following object is masked from 'package:gmp':
outer
The following objects are masked from 'package:stats':
dbinom, dnorm, dpois, pnorm
The following objects are masked from 'package:base':
cbind, pmax, pmin, rbind
The data is from 1 different origins
Rank Product analysis for two-class case
Rank Product analysis for unpaired case
done [1] 1 1 1 1 1 1 2 2 2 2
[1] 0 0 0 1 1 1 0 0 1 1
The data is from 2 different origins
Rank Product analysis for two-class case
Rank Product analysis for unpaired case
Rank Product analysis for unpaired case
done $names
[1] "RPs" "RPrank" "pfp"
[4] "pval" "AveFC" "groups"
[7] "RandomPairs_ranks" "nrep" "allrank1"
[10] "allrank2" "Orirank"
class1 < class2 class1 > class2
244901_at 1.0677795 1.049335
244902_at 1.0868398 1.036828
244903_at 1.1766396 1.061817
244904_at 1.0832921 1.180194
244905_at 1.0667289 1.076800
244906_at 0.9659401 1.046720
class1 < class2 class1 > class2
244901_at 248.39049 166.5156
244902_at 225.03952 247.2433
244903_at 159.07172 175.0358
244904_at 228.57455 128.4567
244905_at 247.09183 191.5251
244906_at 96.46004 362.4832
[,1]
244901_at 0.060713984
244902_at -0.005148036
244903_at -0.008684345
244904_at 0.128771916
244905_at 0.030076779
244906_at -0.217625131
[1] 0 0 0 1 1 1 1 1 0 0
The data is from 2 different origins
Rank Sum analysis for two-class case
Rank Sum analysis for unpaired case
Rank Sum analysis for unpaired case
done $names
[1] "RSs" "RSrank" "pfp"
[4] "pval" "AveFC" "groups"
[7] "RandomPairs_ranks" "nrep" "allrank1"
[10] "allrank2" "Orirank"