check_n_num: Ensure all SNPs have N less than X std dev above mean

View source: R/check_n_num.R

check_n_num R Documentation

Ensure all SNPs have N less than X std dev above mean

Description

Some SNPs may have been genotyped on a specialized genotyping array and therefore have substantially more samples than others. SNPs whose N is more than N_std standard deviations above the mean N are removed.
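The filter described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the toy data frame, the variable names (`sumstats`, `n_std`, `upper`, `keep`), and the choice to retain rows with a missing N are all hypothetical. The threshold is set to 2 rather than the documented default of 5 only because, with ten rows, no single value can lie 5 standard deviations above the mean.

```r
# Toy summary statistics: nine SNPs with N = 1000 and one with a much larger N.
sumstats <- data.frame(
  SNP = paste0("rs", 1:10),
  N   = c(rep(1000, 9), 50000)
)

# Hypothetical threshold for this toy example (the documented default is 5).
n_std <- 2

# Upper bound: mean(N) + n_std * sd(N), ignoring missing values.
upper <- mean(sumstats$N, na.rm = TRUE) + n_std * sd(sumstats$N, na.rm = TRUE)

# Keep SNPs whose N does not exceed the bound.
# Assumption: rows with a missing N are retained in this sketch.
keep <- is.na(sumstats$N) | sumstats$N <= upper

filtered <- sumstats[keep, ]   # SNPs that pass the filter
removed  <- sumstats[!keep, ]  # SNPs that would go to a log file
```

Here `removed` holds the single SNP with N = 50000 and `filtered` holds the other nine. Whether the package compares with a strict or non-strict inequality is not stated in this page, so treat the `<=` as an assumption.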

Usage

check_n_num(
  sumstats_dt,
  path,
  N_std,
  N_dropNA = FALSE,
  log_folder_ind,
  check_save_out,
  tabix_index,
  nThread,
  log_files
)

Arguments

path

Filepath for the summary statistics file to be formatted. A dataframe or datatable of the summary statistics file can also be passed directly to MungeSumstats using the path parameter.

N_std

Numeric. The number of standard deviations above the mean that a SNP's N must exceed for the SNP to be removed. Default is 5.

N_dropNA

Drop rows where N is missing. The usage signature above defaults to FALSE.

log_folder_ind

Binary. Whether to store log files containing all filtered-out SNPs (a separate file per filter). The data are written in the same format specified for the resulting sumstats file. The only exception is VCF output, in which case the log file is saved as .tsv.gz. Default is FALSE.

tabix_index

Index the formatted summary statistics with tabix for fast querying.

nThread

Number of threads to use for parallel processes.

log_files

List of log file locations.

Value

A list containing sumstats_dt (the modified summary statistics data table object) and the log file list.


neurogenomics/MungeSumstats documentation built on Aug. 10, 2024, 5:59 a.m.