View source: R/get_genome_build.R
get_genome_build | R Documentation |
Infers the genome build (GRCh37 or GRCh38) of a summary statistics file from its data, using the SNP (RSID), CHR and BP columns.
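As a rough illustration of the idea described above, the following base R sketch compares SNP, CHR and BP triples against two small reference tables and picks the build with the higher match rate. This is a toy sketch under stated assumptions, not the package's implementation: the function `infer_build_sketch`, the reference tables and all SNP IDs and positions are invented for illustration.

```r
# Toy sketch only: hypothetical helper and made-up data, not real dbSNP content.
infer_build_sketch <- function(sumstats, ref_grch37, ref_grch38) {
  # Fraction of sumstats rows whose (SNP, CHR, BP) triple appears in a reference
  match_rate <- function(ref) {
    key_sum <- paste(sumstats$SNP, sumstats$CHR, sumstats$BP, sep = ":")
    key_ref <- paste(ref$SNP, ref$CHR, ref$BP, sep = ":")
    mean(key_sum %in% key_ref)
  }
  rates <- c(GRCh37 = match_rate(ref_grch37), GRCh38 = match_rate(ref_grch38))
  # Return the name of the build with the highest match rate
  names(rates)[which.max(rates)]
}

# Invented positions for three invented SNP IDs in each build
ref_grch37 <- data.frame(SNP = c("rs_a", "rs_b", "rs_c"),
                         CHR = c("1", "1", "2"),
                         BP  = c(1000L, 2000L, 3000L))
ref_grch38 <- data.frame(SNP = c("rs_a", "rs_b", "rs_c"),
                         CHR = c("1", "1", "2"),
                         BP  = c(1100L, 2150L, 3300L))

# Toy summary statistics whose positions agree with the GRCh38 table
sumstats <- data.frame(SNP = c("rs_a", "rs_b", "rs_c"),
                       CHR = c("1", "1", "2"),
                       BP  = c(1100L, 2150L, 3300L))

build <- infer_build_sketch(sumstats, ref_grch37, ref_grch38)
print(build)
```

With real data the reference tables are far larger, which is presumably why the `sampled_snps` argument exists to limit how many SNPs are compared.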
get_genome_build(
sumstats,
nThread = 1,
sampled_snps = 10000,
standardise_headers = TRUE,
mapping_file = sumstatsColHeaders,
dbSNP = 155,
header_only = FALSE,
allele_match_ref = FALSE,
ref_genome = NULL,
chr_filt = NULL
)
sumstats |
Data table/data frame object of the summary statistics file for the GWAS, or file path to the summary statistics file. |
nThread |
Number of threads to use for parallel processes. |
sampled_snps |
Downsample the number of SNPs used when inferring genome build to save time. |
standardise_headers |
Run
|
mapping_file |
MungeSumstats has a pre-defined
column-name mapping file
which should cover the most common column headers and their interpretations.
However, if a column header that is in your file is missing or the mapping we
give is incorrect, you can supply your own mapping file. It must be a 2-column
data frame with column names "Uncorrected" and "Corrected". See
|
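A custom mapping of the shape described for `mapping_file` could be built as in the sketch below. The header names in the `Uncorrected` column are hypothetical, chosen only for illustration; the `Corrected` values reuse the SNP and BP column names mentioned in the description.

```r
# Hypothetical input headers: "RSID_COL" and "POS_HG19" are invented examples.
custom_mapping <- data.frame(
  Uncorrected = c("RSID_COL", "POS_HG19"),  # headers as they appear in your file
  Corrected   = c("SNP", "BP"),             # standardised names they should map to
  stringsAsFactors = FALSE
)

print(custom_mapping)
```

A data frame like this would then be supplied as `mapping_file = custom_mapping` in place of the default `sumstatsColHeaders`.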
dbSNP |
Version of dbSNP to be used (144 or 155). Default is 155. |
header_only |
Instead of reading in the entire |
allele_match_ref |
Instead of returning the genome_build, this will return the proportion of matches to each genome build for each allele (A1, A2). |
ref_genome |
Name of the reference genome used for the GWAS ("GRCh37" or "GRCh38"). Argument is case-insensitive. Default is NULL, which infers the reference genome from the data. |
chr_filt |
Internal, for testing only: filters the reference genomes and sumstats to specific chromosomes. Pass the chromosomes in the format c("1","2"). Default is NULL, i.e. no filtering. |
ref_genome: the genome build of the data.