methReg_analysis: Wrapper for MethReg functions
In TransBioInfoLab/MethReg: Assessing the regulatory potential of DNA methylation regions or sites on gene transcription

methReg_analysis

R Documentation

Wrapper for MethReg functions

Description

Wrapper for the following MethReg functions: 1) DNAm vs. target gene Spearman correlation, 2) TF vs. target gene Spearman correlation, 3) interaction_model, and 4) stratified model.
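The correlation steps listed above can be illustrated with base R alone. The sketch below is not MethReg code: the data are simulated, and the rule that combines the p-value and estimate cut-offs (absolute value of rho, strict inequalities) is an assumption made for illustration, using the default cut-offs shown in the Usage section.

```r
# Toy data for 20 samples, invented purely for illustration.
set.seed(1)
dnam_values   <- runif(20, min = 0, max = 1)                 # methylation levels
target_values <- 5 - 3 * dnam_values + rnorm(20, sd = 0.3)   # target gene expression

# Spearman rank correlation between DNAm and target gene expression.
res <- cor.test(dnam_values, target_values, method = "spearman")
rho <- unname(res$estimate)
p   <- res$p.value

# Assumption: a pair is kept when p is below the p-value cut-off
# and |rho| is above the estimate cut-off.
keep <- p < 0.01 && abs(rho) > 0.2
```

The same call with TF expression in place of DNAm gives the TF vs. target gene correlation.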

Usage

methReg_analysis(
  triplet,
  dnam,
  exp,
  tf.activity.es = NULL,
  dnam.group.percent.threshold = 0.25,
  perform.correlation.analaysis = TRUE,
  remove.nonsig.correlated.dnam.target.gene = FALSE,
  remove.nonsig.correlated.dnam.target.gene.threshold.pvalue = 0.01,
  remove.nonsig.correlated.dnam.target.gene.threshold.estimate = 0.2,
  remove.sig.correlated.tf.exp.dnam = TRUE,
  filter.triplet.by.sig.term = TRUE,
  filter.triplet.by.sig.term.using.fdr = TRUE,
  filter.triplet.by.sig.term.pvalue.threshold = 0.05,
  multiple.correction.by.stage.wise.analysis = TRUE,
  tf.dnam.classifier.pval.threshold = 0.001,
  verbose = FALSE,
  cores = 1
)

Arguments

`triplet`	Data frame with columns for DNA methylation region (regionID), TF (TF), and target gene (target)
`dnam`	DNA methylation matrix or SummarizedExperiment object (columns: samples in the same order as `exp` matrix, rows: regions/probes)
`exp`	A matrix or SummarizedExperiment object (columns: samples in the same order as `dnam`, rows: genes represented by Ensembl IDs, e.g. ENSG00000239415)
`tf.activity.es`	A matrix with normalized enrichment scores for each TF across all samples to be used in linear models instead of TF gene expression. See `get_tf_ES`.
`dnam.group.percent.threshold`	DNA methylation threshold percentage to define samples in the low methylated group and high methylated group. For example, setting the threshold to 0.3 (30%) will assign samples with the lowest 30% methylation in the low group and the highest 30% methylation in the high group. Default is 0.25 (25%), accepted threshold range (0.0,0.5].
`perform.correlation.analaysis`	Whether to perform the correlation analysis.
`remove.nonsig.correlated.dnam.target.gene`	If the Spearman correlation of target expression and DNAm across all samples is not significant (p-value > 0.05), the triplet will be removed. If the Wilcoxon test of target expression between Q1 and Q4 is not significant (p-value > 0.05), the triplet will be removed.
`remove.nonsig.correlated.dnam.target.gene.threshold.pvalue`	P-value cut-off for remove.nonsig.correlated.dnam.target.gene in the Spearman test.
`remove.nonsig.correlated.dnam.target.gene.threshold.estimate`	Correlation estimate cut-off for remove.nonsig.correlated.dnam.target.gene in the Spearman test.
`remove.sig.correlated.tf.exp.dnam`	If the Wilcoxon test of TF expression between Q1 and Q4 is significant (p-value < 0.05), the triplet will be removed.
`filter.triplet.by.sig.term`	Whether to filter for significant triplets. A triplet is selected if any term is significant in the binary model: 1) interaction (TF x DNAm) p-value < 0.05, 2) DNAm p-value < 0.05, or 3) TF p-value < 0.05.
`filter.triplet.by.sig.term.using.fdr`	Use FDR instead of p-value when applying filter.triplet.by.sig.term.
`filter.triplet.by.sig.term.pvalue.threshold`	P-value/FDR threshold used to filter significant triplets.
`multiple.correction.by.stage.wise.analysis`	A boolean indicating if stagewise analysis should be performed to correct for multiple comparisons. If set to FALSE then FDR analysis is performed.
`tf.dnam.classifier.pval.threshold`	P-value threshold to consider a linear model significant or not. Default 0.001. This will be used to classify the TF role and DNAm effect.
`verbose`	A logical argument indicating if messages output should be provided.
`cores`	Number of CPU cores to be used. Default 1.
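As a rough illustration of the input shapes described for `triplet`, `dnam`, and `exp`, the sketch below builds toy objects and checks that they line up. All values, the region name, and the second gene ID are invented for illustration. It also assumes that the TF and target columns hold the same Ensembl IDs used as row names of the expression matrix, which this page does not state explicitly.

```r
set.seed(42)
samples <- paste0("Sample", 1:8)

# DNA methylation matrix: rows are regions/probes, columns are samples.
# The region name is a hypothetical placeholder.
dnam <- matrix(runif(8), nrow = 1,
               dimnames = list("region_1", samples))

# Expression matrix: rows are Ensembl gene IDs, columns are samples.
# "ENSG00000000001" is a made-up ID used only for this sketch.
exp_mat <- matrix(rexp(16, rate = 0.2), nrow = 2,
                  dimnames = list(c("ENSG00000239415", "ENSG00000000001"),
                                  samples))

# One row per (region, TF, target) combination to test.
triplet <- data.frame(
  regionID = "region_1",
  TF       = "ENSG00000239415",
  target   = "ENSG00000000001",
  stringsAsFactors = FALSE
)

# Sanity checks before running any analysis.
same_order    <- identical(colnames(dnam), colnames(exp_mat))
regions_found <- all(triplet$regionID %in% rownames(dnam))
genes_found   <- all(c(triplet$TF, triplet$target) %in% rownames(exp_mat))
stopifnot(same_order, regions_found, genes_found)
```

Checking the sample order up front matters because both matrices are expected to have samples in the same column order.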

Details

This function fits the linear model

log2(RNA target) ~ log2(TF) + DNAm + log2(TF) * DNAm

to triplet data as follows:

The model treats DNAm as a binary variable: we define a binary group for DNA methylation values (high = 1, low = 0). That is, samples with the highest DNAm levels (top 25 percent) have high = 1, and samples with the lowest DNAm levels (bottom 25 percent) have high = 0. Note that in this implementation, only samples with DNAm values in the first and last quartiles are considered.
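The grouping described above can be sketched in base R as follows. This is an illustration on simulated data, not the package's internal code. The variable names are hypothetical, and whether the quantile boundaries are inclusive is an assumption.

```r
set.seed(7)
dnam_values <- runif(40)   # simulated methylation levels for 40 samples
threshold   <- 0.25        # plays the role of dnam.group.percent.threshold

# Cut points at the lower and upper quantiles.
cuts <- quantile(dnam_values, probs = c(threshold, 1 - threshold))
low  <- dnam_values <= cuts[1]
high <- dnam_values >= cuts[2]

# 1 = high methylation group, 0 = low group, NA = middle samples (not used).
met_grp <- ifelse(high, 1, ifelse(low, 0, NA))
table(met_grp, useNA = "ifany")
```

With a threshold of 0.25, half of the samples fall in the middle and are left out of the model fit.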

In these models, the term log2(TF) evaluates the direct effect of the TF on target gene expression, DNAm evaluates the direct effect of DNAm on target gene expression, and log2(TF)*DNAm evaluates the synergistic effect of DNAm and the TF, that is, whether TF regulatory activity is modified by DNAm.
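To make the three terms concrete, the sketch below fits the same formula with ordinary least squares (`lm`) on simulated data. Note that `lm` is a stand-in chosen for simplicity, not the fitting method this function uses. The data are simulated so that the TF activates the target only in the low methylation group.

```r
set.seed(123)
n <- 40
met_grp <- rep(c(0, 1), each = n / 2)   # 0 = low DNAm, 1 = high DNAm
log2_tf <- rnorm(n, mean = 5)

# Simulated truth: TF slope is 1.5 when DNAm is low and 0 when DNAm is high.
log2_target <- 2 + 1.5 * log2_tf * (1 - met_grp) + rnorm(n, sd = 0.3)

fit <- lm(log2_target ~ log2_tf + met_grp + log2_tf:met_grp)

# One row per term: TF effect, DNAm effect, and TF x DNAm interaction.
coef(summary(fit))

interaction_estimate <- unname(coef(fit)["log2_tf:met_grp"])
```

Here a clearly negative interaction estimate indicates that high methylation weakens the TF effect on the target.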

There are two implementations of these models, depending on whether an excessive proportion (i.e. more than 25 percent) of samples have zero counts in the RNA-seq data:

When the percentage of zeros in the RNA-seq data is less than 25 percent, robust linear models are implemented using the rlm function from the MASS package. This gives outlier gene expression values reduced weight. We used the "psi.bisquare" option in function rlm (bisquare weighting, https://stats.idre.ucla.edu/r/dae/robust-regression/).
When the percentage of zeros in the RNA-seq data is more than 25 percent, zero-inflated negative binomial models are implemented using the zeroinfl function from the pscl package. This assumes that two processes generated the zeros: (1) one where the counts are always zero, and (2) another where the counts follow a negative binomial distribution.
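The choice between the two implementations, and the effect of bisquare weighting, can be sketched as below. `choose_model` is a hypothetical helper written for this illustration and is not part of MethReg. How the case of exactly 25 percent zeros is handled is an assumption. Only the robust branch is fitted here, on simulated data with one planted outlier.

```r
library(MASS)  # provides rlm() and psi.bisquare

# Hypothetical helper: pick a model from the fraction of zero counts.
# Assumption: exactly 25 percent zeros goes to the zero-inflated branch.
choose_model <- function(counts, max_zero_fraction = 0.25) {
  if (mean(counts == 0) < max_zero_fraction) "rlm" else "zeroinfl"
}

set.seed(99)
n <- 40
met_grp <- rep(c(0, 1), each = n / 2)
log2_tf <- rnorm(n, mean = 5)
log2_target <- 1 + 0.8 * log2_tf + rnorm(n, sd = 0.3)
log2_target[1] <- 30   # planted outlier

# Robust fit with bisquare weighting.
fit <- rlm(log2_target ~ log2_tf + met_grp + log2_tf:met_grp,
           psi = psi.bisquare, maxit = 50)

# Final weight of the outlying sample; values near 0 mean it was down-weighted.
outlier_weight <- fit$w[1]
```

The zero-inflated branch would use `zeroinfl` from pscl with a negative binomial count distribution. It is not run here so that the sketch depends only on MASS.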

To account for confounding effects from covariates, first use the get_residuals function to obtain RNA or DNAm residual values with covariate effects removed, then fit the interaction model. Note that no log2 transformation is needed when interaction_model is applied to residuals data.
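The idea behind residual-based adjustment can be shown with base R. This sketch does not call get_residuals and makes no claim about its arguments. It only illustrates what "covariate effects removed" means, using simulated age and sex covariates that are assumptions of the example.

```r
set.seed(5)
n   <- 30
age <- runif(n, min = 40, max = 80)
sex <- factor(sample(c("F", "M"), n, replace = TRUE))

# Simulated expression that depends on age.
log2_expr <- 3 + 0.05 * age + rnorm(n, sd = 0.4)

# Regress expression on the covariates and keep what is left over.
expr_resid <- residuals(lm(log2_expr ~ age + sex))

# The residuals are uncorrelated with the covariate by construction.
resid_age_cor <- cor(expr_resid, age)
```

Residuals like these are centered at zero and can be negative, which is why no further log2 transformation applies to them.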

Note that only triplets with TF expression not significantly different in high vs. low methylation groups will be evaluated (Wilcoxon test, p > 0.05).
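A minimal sketch of this check, using base R on simulated data, is shown below. The variable names are hypothetical, and the grouping uses the default 25 percent threshold.

```r
set.seed(11)
n <- 40
dnam_values <- runif(n)
log2_tf     <- rnorm(n, mean = 5)   # simulated independently of DNAm

# Low and high methylation groups (bottom and top 25 percent).
cuts    <- quantile(dnam_values, probs = c(0.25, 0.75))
tf_low  <- log2_tf[dnam_values <= cuts[1]]
tf_high <- log2_tf[dnam_values >= cuts[2]]

# Wilcoxon rank-sum test of TF expression between the two groups.
p <- wilcox.test(tf_high, tf_low)$p.value

# The triplet is evaluated only if TF expression does not differ significantly.
keep_triplet <- p > 0.05
```

The purpose of the check is to avoid attributing a change in target expression to the TF x DNAm interaction when TF expression itself differs between the methylation groups.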

Value

A data frame with Region, TF, target, TF_symbol, target_symbol, estimates and P-values, after fitting robust linear models or zero-inflated negative binomial models (see Details above).

The model considering DNAm values as a binary variable generates quant_pval_metGrp, quant_pval_rna.tf, quant_pval_metGrp.rna.tf, quant_estimates_metGrp, quant_estimates_rna.tf, quant_estimates_metGrp.rna.tf.

Model.interaction indicates which model (robust linear model or zero-inflated model) was used to fit Model 1, and Model.quantile indicates which model (robust linear model or zero-inflated model) was used to fit Model 2.
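As an illustration of working with the returned columns, the sketch below applies an "any term significant" filter to a toy results table. The three rows and their p-values are invented. Only the column names come from this page, and the filter is a simplified stand-in for what filter.triplet.by.sig.term does.

```r
# Toy results table; values are made up for illustration.
results <- data.frame(
  regionID                 = c("region_1", "region_2", "region_3"),
  quant_pval_metGrp        = c(0.20, 0.03, 0.60),
  quant_pval_rna.tf        = c(0.001, 0.40, 0.70),
  quant_pval_metGrp.rna.tf = c(0.30, 0.50, 0.90),
  stringsAsFactors = FALSE
)

pval_cols <- c("quant_pval_metGrp", "quant_pval_rna.tf",
               "quant_pval_metGrp.rna.tf")

# Keep a triplet if any of the three terms has p-value < 0.05.
any_sig     <- apply(results[, pval_cols] < 0.05, 1, any)
significant <- results[any_sig, ]
```

For an FDR-based filter, the p-value columns could first be adjusted with `p.adjust(..., method = "fdr")`. How the package groups p-values for adjustment is not described on this page, so that step is left out.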

TransBioInfoLab/MethReg documentation built on July 28, 2023, 9:17 p.m.
