GEM_GxEmodel: GEM_GxEmodel


View source: R/GEM_model.R

Description

GEM_GxEmodel tests the ability of the interaction of gene and environmental factor to predict DNA methylation level.

Usage

GEM_GxEmodel(snp_file_name, covariate_file_name, methylation_file_name,
  GxEmodel_pv, output_file_name, topKplot = 10)

Arguments

snp_file_name

Text file with rows representing SNPs and columns representing samples. Genotypes are encoded as 1, 2, 3 (or any three distinct values) for major allele homozygote (AA), heterozygote (AB) and minor allele homozygote (BB). See the example data file "snp.txt".

covariate_file_name

Text file with rows representing covariate factors and the environment value, and columns representing samples. The environment value must be placed in the last row. See the example data file "gxe.txt".

methylation_file_name

Text file with rows representing methylation profiles for CpGs, and columns representing samples, such as the example data file "methylation.txt".

GxEmodel_pv

The p-value cutoff. Associations significant at the GxEmodel_pv level or below are saved to output_file_name, with the corresponding estimate of effect size (slope coefficient), test statistic and p-value. Default value is 5.0E-08.

output_file_name

The result file. Each row presents a CpG and its association with SNPxEnv; the columns contain the CpGID, SNPID, estimate of effect size (slope coefficient), test statistic, p-value and FDR.

topKplot

The number of top CpG-SNP-Env triplets to plot. Each chart shows how environment values, segregated by SNP genotype group, can explain methylation. Default value is 10.
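The row and column layout described in the arguments above can be sketched with toy matrices. This is only an illustration: the sample names, SNP names and values are hypothetical, and the tab-delimited format with a header row and a leading ID column is an assumption; the example files shipped with the package are the authoritative reference. The check that samples appear in the same column order in every file is also an assumption about what a matrix-based analysis needs.

```r
# Hypothetical toy data for 4 samples (names and values are made up).
samples <- paste0("S", 1:4)

# Covariate matrix: one row per covariate, environment value in the LAST row.
covt <- rbind(
  age = c(34, 51, 47, 29),
  sex = c(1, 2, 2, 1),
  env = c(0.8, 1.5, 0.2, 1.1)
)
colnames(covt) <- samples

# Genotype matrix: one row per SNP, coded 1 (AA), 2 (AB), 3 (BB).
snp <- rbind(
  snp_demo1 = c(1, 2, 3, 1),
  snp_demo2 = c(2, 2, 1, 3)
)
colnames(snp) <- samples

# Sanity checks (assumed requirements, not taken from the package docs).
stopifnot(identical(colnames(snp), colnames(covt)))
stopifnot(all(snp %in% 1:3))

# Write the covariate matrix as tab-delimited text (assumed file format).
cov_path <- file.path(tempdir(), "toy_gxe.txt")
write.table(
  data.frame(ID = rownames(covt), covt, check.names = FALSE),
  cov_path, sep = "\t", quote = FALSE, row.names = FALSE
)
```

Reading the file back with `read.delim(cov_path, row.names = 1)` is a quick way to confirm that the environment row is still last before passing the path as covariate_file_name.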

Details

GEM_GxEmodel explores how genotype can interact with environment (GxE) to influence a specific DNA methylation level. It performs matrix-based iterative correlation and memory-efficient data analysis across methylation, genotype and environment data. This greatly reduces the computational burden of a GxE study, which would otherwise require billions of separate linear regressions (N = number_of_CpGs x number_of_SNPs x number_of_environment), and makes the analysis feasible in an efficient way. The linear regression is lm (M ~ G x E + covt), where M is a matrix with methylation data, G is a matrix with genotype data, E is the environment data and covt is the covariate matrix. The E values are stored in the covariate file as the last row, and all data are read from the formatted text files. The output of GEM_GxEmodel is a list of CpG-SNP-Env triplets in which the environment factor, segregated by genotype group, fits to explain the particular CpG. A significant association suggests that the association between methylation and environment is better explained by segregation into genotype groups (GxE).
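To make the regression concrete, the following is a minimal sketch of the model for a single CpG-SNP-Env triplet using base R's lm on simulated data. It is not the package's matrix-based implementation. The variable names, the single `age` covariate and the simulated effect sizes are all hypothetical, and reading "G x E" as the R formula term `g * e` (main effects plus interaction) is an assumption.

```r
set.seed(1)
n <- 200

g   <- sample(1:3, n, replace = TRUE)   # genotype coded 1, 2, 3
e   <- rnorm(n)                         # environment value
age <- rnorm(n, mean = 40, sd = 10)     # one hypothetical covariate

# Simulated methylation signal with a true interaction slope of 0.3.
m <- 0.5 + 0.3 * g * e + 0.002 * age + rnorm(n, sd = 0.1)

# g * e expands to g + e + g:e; the g:e row is the GxE term.
fit <- lm(m ~ g * e + age)
gxe <- coef(summary(fit))["g:e", ]
print(gxe)  # estimate, standard error, t value and p-value
```

Repeating this fit for every CpG, SNP and environment combination is the brute-force approach whose cost the matrix-based method is described as avoiding.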

Value

Results are saved automatically to output_file_name.

Examples

DATADIR = system.file('extdata',package='GEM')
RESULTDIR = getwd()
snp_file = paste(DATADIR, "snp.txt", sep = .Platform$file.sep)
covariate_file = paste(DATADIR, "gxe.txt", sep = .Platform$file.sep)
methylation_file = paste(DATADIR, "methylation.txt", sep = .Platform$file.sep)
GxEmodel_pv = 1e-4
output_file = paste(RESULTDIR, "Result_GxEmodel.txt", sep = .Platform$file.sep)
GEM_GxEmodel(snp_file, covariate_file, methylation_file, GxEmodel_pv, output_file)
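After the call above finishes, the saved result file can be inspected from R. The helper below is a hypothetical convenience, not part of GEM, and it assumes the result file is tab-delimited with a header row; adjust the reader if the actual file differs.

```r
# Hypothetical helper: read a result file if it exists, else return NULL.
# Assumes tab-delimited text with a header row.
read_gxe_results <- function(path) {
  if (!file.exists(path)) {
    return(NULL)
  }
  read.delim(path, stringsAsFactors = FALSE)
}

res <- read_gxe_results(file.path(getwd(), "Result_GxEmodel.txt"))
if (!is.null(res)) {
  print(head(res))  # first few reported associations
}
```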

panhongNTU/GEM documentation built on May 24, 2019, 6:14 p.m.