View source: R/stratified_model.R
Should be used after fitting interaction_model, and only for triplet data with a significant TF*DNAm interaction. This analysis examines in more detail how TF activities differ between samples with high DNAm values and samples with low DNAm values.
stratified_model(
  triplet,
  dnam,
  exp,
  cores = 1,
  tf.activity.es = NULL,
  tf.dnam.classifier.pval.thld = 0.001
)
triplet |
Data frame with columns for DNA methylation region (regionID), TF (TF), and target gene (target) |
dnam |
DNA methylation matrix or SummarizedExperiment
(columns: samples in the same order as |
exp |
A matrix or SummarizedExperiment
(columns: samples in the same order as |
cores |
Number of CPU cores to be used. Default 1. |
tf.activity.es |
A matrix with normalized enrichment scores for each TF across all samples to be used in linear models instead of TF gene expression. |
tf.dnam.classifier.pval.thld |
P-value threshold used to decide whether a linear model is significant or not. Default 0.001. This will be used to classify the TF role and DNAm effect. |
This function fits the linear model
log2(RNA target) = log2(TF)
to samples with the highest DNAm values (top 25 percent) and to samples with the lowest DNAm values (bottom 25 percent), separately.
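The stratified fit described above can be sketched in base R. This is only an illustration on simulated data, not the package's implementation: the object names, the use of ordinary lm, and the inclusive quartile boundaries are all assumptions.

```r
set.seed(1)
n      <- 40
dnam   <- runif(n)                     # simulated DNAm values for one region
tf     <- runif(n, min = 1, max = 10)  # simulated TF expression
target <- runif(n, min = 1, max = 10)  # simulated target gene expression

# Quartile cut-offs of DNAm; inclusive boundaries are an assumption
q    <- quantile(dnam, probs = c(0.25, 0.75))
low  <- dnam <= q[1]   # bottom 25 percent of samples
high <- dnam >= q[2]   # top 25 percent of samples

# Fit log2(target) ~ log2(TF) within one group and return slope and p-value
fit_group <- function(keep) {
  d   <- data.frame(target = target[keep], tf = tf[keep])
  fit <- lm(log2(target) ~ log2(tf), data = d)
  coef(summary(fit))["log2(tf)", c("Estimate", "Pr(>|t|)")]
}

res <- rbind(DNAm.low = fit_group(low), DNAm.high = fit_group(high))
res
```

Each row of res holds the TF slope and its p-value for one DNAm group, which is the kind of pair the function reports per triplet.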
There are two implementations of these models, depending on whether there is an excessive number (i.e. more than 25 percent) of samples with zero counts in the RNA-seq data:
When the percentage of zeros in the RNA-seq data is less than 25 percent, robust linear models are implemented using the rlm function from the MASS package. This gives outlier gene expression values reduced weight. We used the "psi.bisquare" option in rlm (bisquare weighting, https://stats.idre.ucla.edu/r/dae/robust-regression/).
When the percentage of zeros in the RNA-seq data is more than 25 percent, zero-inflated negative binomial models are implemented using the zeroinfl function from the pscl package. This assumes that two processes generated the zeros: (1) one where the counts are always zero, and (2) another where the counts follow a negative binomial distribution.
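The choice between the two implementations can be sketched as follows. This is a rough illustration, not the package's code: the helper name fit_tf_model, the pseudo-count of 1 inside log2, the rounding of expression to counts, the intercept-only zero component, and the handling of exactly 25 percent zeros are all assumptions. The zeroinfl branch runs only if the pscl package is installed.

```r
library(MASS)  # provides rlm() and psi.bisquare

# Hypothetical helper: pick the model from the fraction of zeros in the target
fit_tf_model <- function(target, tf, zero.thld = 0.25) {
  pct.zero <- mean(target == 0)
  if (pct.zero < zero.thld) {
    # Robust linear model with bisquare weighting
    fit <- rlm(log2(target + 1) ~ log2(tf + 1), psi = psi.bisquare, maxit = 100)
    list(model = "rlm", estimate = unname(coef(fit)[2]))
  } else if (requireNamespace("pscl", quietly = TRUE)) {
    # Zero-inflated negative binomial model on integer counts
    d   <- data.frame(target = round(target), tf = tf)
    fit <- pscl::zeroinfl(target ~ log2(tf + 1) | 1, data = d, dist = "negbin")
    list(model = "zeroinfl", estimate = unname(fit$coefficients$count[2]))
  } else {
    list(model = "zeroinfl", estimate = NA_real_)
  }
}

set.seed(3)
tf     <- runif(50, min = 1, max = 10)
target <- 2 * tf + runif(50, min = 0, max = 2)  # simulated data without zeros
res    <- fit_tf_model(target, tf)
res$model
```

With no zeros in the simulated target, the helper takes the rlm branch; a target with more than a quarter of zero counts would be routed to zeroinfl instead.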
To account for confounding effects from covariates, first use the get_residuals function to obtain RNA residual values with covariate effects removed, then fit the interaction model. Note that no log2 transformation is needed when interaction_model is applied to residual data.
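The idea of removing covariate effects before modeling can be illustrated with a plain linear regression. This is a conceptual sketch only and does not reproduce get_residuals or its arguments; the covariates age and sex and the simulated expression values are hypothetical.

```r
set.seed(2)
n   <- 30
age <- runif(n, min = 40, max = 80)               # hypothetical covariate
sex <- factor(rep(c("F", "M"), length.out = n))   # hypothetical covariate

# Simulated log2 expression of one gene, partly driven by age
log.expr <- 5 + 0.03 * age + rnorm(n, sd = 0.5)

# Regress expression on the covariates and keep what they do not explain
fit        <- lm(log.expr ~ age + sex)
expr.resid <- residuals(fit)

summary(expr.resid)
```

The residuals are already on the scale of the adjusted data and can be negative, which is why no further log2 transformation is applied to them.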
This function also provides annotations for TFs. A TF is annotated as activator if an increasing amount of TF (higher TF gene expression) corresponds to increased target gene expression. A TF is annotated as repressor if an increasing amount of TF (higher TF gene expression) corresponds to decreased target gene expression. A TF is annotated as dual if, in the Q1 methylation group, an increasing amount of TF (higher TF gene expression) corresponds to an increase in target gene expression, while in the Q4 methylation group an increasing amount of TF corresponds to a decrease in target gene expression (or the same with Q1 and Q4 swapped).
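The TF annotation rules above can be written as a small decision function. This is a sketch under assumptions: classify_tf_role is a hypothetical helper, not a function exported by the package, and it assumes that only slopes with a p-value below the threshold are taken into account.

```r
# Hypothetical helper: label the TF role from the slopes and p-values
# obtained in the low (Q1) and high (Q4) DNAm groups
classify_tf_role <- function(est.low, pval.low, est.high, pval.high,
                             thld = 0.001) {
  sig.low  <- !is.na(pval.low)  && pval.low  < thld
  sig.high <- !is.na(pval.high) && pval.high < thld

  # Signs of the significant slopes only
  signs <- c(if (sig.low) sign(est.low), if (sig.high) sign(est.high))

  if (length(signs) == 0) return(NA_character_)            # nothing significant
  if (length(signs) == 2 && signs[1] != signs[2]) return("dual")
  if (signs[1] > 0) "activator" else "repressor"
}

classify_tf_role(est.low = 1.2, pval.low = 1e-5, est.high = 0.8, pval.high = 1e-4)
classify_tf_role(est.low = 1.0, pval.low = 1e-5, est.high = -1.0, pval.high = 1e-6)
```

The first call has positive significant slopes in both groups, and the second has significant slopes of opposite sign.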
In addition, a region/CpG is annotated as enhancing if more TF regulation on gene transcription is observed in samples with high DNAm. That is, DNA methylation enhances TF regulation on target gene expression. On the other hand, a region/CpG is annotated as attenuating if more TF regulation on gene transcription is observed in samples with low DNAm. That is, DNA methylation reduces TF regulation on target gene expression.
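One possible way to turn the enhancing/attenuating definition into code is sketched below. The text does not state how "more TF regulation" is measured, so this sketch assumes it means a significant slope in only one DNAm group or, when both are significant, the larger absolute slope. classify_dnam_effect is a hypothetical helper, not a package function.

```r
# Hypothetical helper: label the DNAm effect from the slopes and p-values
# obtained in the low and high DNAm groups
classify_dnam_effect <- function(est.low, pval.low, est.high, pval.high,
                                 thld = 0.001) {
  sig.low  <- !is.na(pval.low)  && pval.low  < thld
  sig.high <- !is.na(pval.high) && pval.high < thld

  if (!sig.low && !sig.high) return(NA_character_)  # no TF effect in either group
  if (sig.high && !sig.low)  return("enhancing")    # effect only with high DNAm
  if (sig.low && !sig.high)  return("attenuating")  # effect only with low DNAm

  # Both significant: compare effect sizes (an assumption of this sketch)
  if (abs(est.high) > abs(est.low)) "enhancing" else "attenuating"
}

classify_dnam_effect(est.low = 0.1, pval.low = 0.4, est.high = 1.5, pval.high = 1e-6)
classify_dnam_effect(est.low = 1.5, pval.low = 1e-6, est.high = 0.1, pval.high = 0.4)
```

The first call has a TF effect only in the high DNAm group, and the second only in the low DNAm group.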
A data frame with Region, TF, target, TF_symbol, target_symbol, results for fitting linear models to samples with low methylation (DNAmlow_pval_rna.tf, DNAmlow_estimate_rna.tf) or samples with high methylation (DNAmhigh_pval_rna.tf, DNAmhigh_pval_rna.tf.1), and annotations for the TF role (class.TF) and the DNAm effect (class.TF.DNAm).
library(dplyr)

# Simulated DNA methylation for one region across 20 samples
dnam <- runif(20, min = 0, max = 1) %>%
  matrix(ncol = 1) %>% t()
rownames(dnam) <- c("chr3:203727581-203728580")
colnames(dnam) <- paste0("Samples", 1:20)

# Simulated expression of the target gene
exp.target <- runif(20, min = 0, max = 10) %>%
  matrix(ncol = 1) %>% t()
rownames(exp.target) <- c("ENSG00000232886")
colnames(exp.target) <- paste0("Samples", 1:20)

# Simulated expression of the TF
exp.tf <- runif(20, min = 0, max = 10) %>%
  matrix(ncol = 1) %>% t()
rownames(exp.tf) <- c("ENSG00000232888")
colnames(exp.tf) <- paste0("Samples", 1:20)

exp <- rbind(exp.tf, exp.target)

triplet <- data.frame(
  "regionID" = c("chr3:203727581-203728580"),
  "target" = "ENSG00000232886",
  "TF" = "ENSG00000232888"
)

results <- stratified_model(
  triplet = triplet,
  dnam = dnam,
  exp = exp
)