Description Usage Arguments Value References See Also Examples
Wrapper function for gene-by-gene association testing of RNA-seq data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | dear_seq(
exprmat = NULL,
object = NULL,
covariates = NULL,
variables2test,
sample_group = NULL,
weights_var2test_condi = TRUE,
cov_variables2test_eff = NULL,
which_test = c("permutation", "asymptotic"),
which_weights = c("loclin", "voom", "none"),
n_perm = 1000,
progressbar = TRUE,
parallel_comp = TRUE,
nb_cores = parallel::detectCores() - 1,
preprocessed = FALSE,
gene_based_weights = FALSE,
bw = "nrd",
kernel = c("gaussian", "epanechnikov", "rectangular", "triangular", "biweight",
"tricube", "cosine", "optcosine"),
exact = FALSE,
transform = TRUE,
padjust_methods = c("BH", "BY", "holm", "hochberg", "hommel", "bonferroni"),
lowess_span = 0.5,
R = NULL,
homogen_traj = FALSE,
na.rm_dearseq = TRUE
)
|
exprmat |
a numeric matrix of size |
object |
an object that can be either a
|
covariates |
If |
variables2test |
|
sample_group |
a vector of length |
weights_var2test_condi |
a logical flag indicating whether
heteroscedasticity weights computation should be conditional on both the
variables to be tested |
cov_variables2test_eff |
a matrix of size |
which_test |
a character string indicating which method to use to
approximate the variance component score test, either |
which_weights |
a character string indicating which method to use to
estimate the mean-variance relationship weights. Possibilities are
|
n_perm |
the number of perturbations. Default is |
progressbar |
logical indicating wether a progressBar should be displayed when computing permutations (only in interactive mode). |
parallel_comp |
a logical flag indicating whether parallel computation
should be enabled. Only Linux and MacOS are supported, this is ignored on
Windows. Default is |
nb_cores |
an integer indicating the number of cores to be used when
|
preprocessed |
a logical flag indicating whether the expression data have
already been preprocessed (e.g. log2 transformed). Default is |
gene_based_weights |
a logical flag used for |
bw |
a character string indicating the smoothing bandwidth selection
method to use. See |
kernel |
a character string indicating which kernel should be used.
Possibilities are |
exact |
a logical flag indicating whether the non-parametric weights
accounting for the mean-variance relationship should be computed exactly or
extrapolated from the interpolation of local regression of the mean against
the variance. Default is |
transform |
a logical flag used for |
padjust_methods |
multiple testing correction method used if
|
lowess_span |
smoother span for the lowess function, between 0 and 1.
This gives the proportion of points in the plot which influence the smooth at
each value. Larger values give more smoothness. Only used if
|
R |
library.size (optional, important to provide if
|
homogen_traj |
a logical flag indicating whether trajectories should be
considered homogeneous. Default is |
na.rm_dearseq |
logical: should missing values in |
A list with the following elements:
which_test
: a character string carrying forward the value of
the 'which_test
' argument indicating which test was perform (either
'asymptotic' or 'permutation').
preprocessed
: a logical flag carrying forward the value of the
'preprocessed
' argument indicating whether the expression data were
already preprocessed, or were provided as raw counts and transformed into
log-counts per million.
n_perm
: an integer carrying forward the value of the
'n_perm
' argument indicating the number of perturbations performed
(NA
if asymptotic test was performed).
genesets
: carrying forward the value of the 'genesets
'
argument defining the gene sets of interest (NULL
for gene-wise
testing).
pval
: computed p-values. A data.frame
with one raw for
each each gene set, or for each gene if genesets
argument is
NULL
, and with 2 columns: the first one 'rawPval
' contains
the raw p-values, the second one contains the FDR adjusted p-values
(according to the 'padjust_methods
' argument) and is named
'adjPval
'.
Gauthier M, Agniel D, ThiƩbaut R & Hejblum BP (2019). dearseq: a variance component score test for RNA-Seq differential analysis that effectively controls the false discovery rate, *bioRxiv* 635714. [DOI: 10.1101/635714v1](https://www.biorxiv.org/content/10.1101/635714v1).
sp_weights
vc_test_perm
vc_test_asym
p.adjust
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | #Monte-Carlo estimation of the proportion of DE genes over `nsims` simulations under the null
#number of runs
nsims <- 2 #100
res <- numeric(nsims)
for(i in 1:nsims){
n <- 1000 #number of genes
nr=5 #number of measurements per subject (grouped data)
ni=50 #number of subjects
r <- nr*ni #number of measurements
t <- matrix(rep(1:nr), ni, ncol=1, nrow=r) # the variable to be tested
sigma <- 0.5
b0 <- 1
#under the null:
b1 <- 0
#create the matrix of gene expression
y.tilde <- b0 + b1*t + rnorm(r, sd = sigma)
y <- t(matrix(rnorm(n*r, sd = sqrt(sigma*abs(y.tilde))), ncol=n, nrow=r) +
matrix(rep(y.tilde, n), ncol=n, nrow=r))
#no covariates
x <- matrix(1, ncol=1, nrow=r)
#run test
#asymptotic test with preprocessed grouped data
res_genes <- dear_seq(exprmat=y, covariates=x, variables2test=t,
sample_group=rep(1:ni, each=nr),
which_test='asymptotic',
which_weights='none', preprocessed=TRUE)
#proportion of raw p-values>0.05
mean(res_genes$pvals[, 'rawPval']>0.05)
#quantiles of raw p-values
quantile(res_genes$pvals[, 'rawPval'])
#proportion of raw p-values<0.05 i.e. proportion of DE genes
res[i] <- mean(res_genes$pvals[, 'rawPval']<0.05)
message(i)
}
#results
mean(res)
if(interactive()){
b0 <- 1
#under the null:
b1 <- 0
#create the matrix of gene expression
y.tilde <- b0 + b1*t + rnorm(r, sd = sigma)
y <- t(matrix(rnorm(n*r, sd = sqrt(sigma*abs(y.tilde))), ncol=n, nrow=r) +
matrix(rep(y.tilde, n), ncol=n, nrow=r))
#run test
#asymptotic test with preprocessed grouped data
res_genes <- dear_seq(exprmat=y, covariates=x, variables2test=t,
sample_group=rep(1:ni, each=nr),
which_weights='none', preprocessed=TRUE)
#results
summary(res_genes$pvals)
}
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.