seqBED2GDS: Conversion between PLINK BED and SeqArray GDS

seqBED2GDS R Documentation

Conversion between PLINK BED and SeqArray GDS

Description

Conversion between PLINK BED format and SeqArray GDS format.

Usage

seqBED2GDS(bed.fn, fam.fn, bim.fn, out.gdsfn, compress.geno="LZMA_RA",
    compress.annotation="LZMA_RA", chr.conv=TRUE, include.pheno=TRUE,
    optimize=TRUE, digest=TRUE, parallel=FALSE, verbose=TRUE)
seqGDS2BED(gdsfile, out.fn, write.rsid=c("auto", "annot_id", "chr_pos_ref_alt"),
    multi.row=FALSE, verbose=TRUE)

Arguments

bed.fn

the file name of the PLINK binary (".bed") file, which holds the genotype information

fam.fn

the file name of the PLINK FAM file (the first six columns of ".ped"), which holds the sample or family information; if missing, the file name is derived from bed.fn

bim.fn

the file name of the PLINK BIM file (an extended MAP file with 6 columns), which holds the variant information; if missing, the file name is derived from bed.fn

gdsfile

character (a GDS file name), or a SeqVarGDSClass object

out.gdsfn

the file name of the output file in SeqArray GDS format

out.fn

the file name prefix of the output PLINK binary files, without a file extension (".bed", ".bim" and ".fam" are appended)

compress.geno

the compression method for "genotype"; the available values are listed in the documentation of the function add.gdsn

compress.annotation

the compression method for all GDS variables except "genotype"; the available values are listed in the documentation of the function add.gdsn

chr.conv

if TRUE, convert numeric chromosome codes 23 to X, 24 to Y, 25 to XY, and 26 to MT

include.pheno

if TRUE, add 'family', 'father', 'mother', 'sex' and 'phenotype' from the FAM file to the output GDS file; if FALSE, add no phenotype data; or a character vector specifying which of the family, father, mother, sex and phenotype variables to add

optimize

if TRUE, optimize the access efficiency by calling cleanup.gds

digest

a logical value (TRUE/FALSE) or a character string ("md5", "sha1", "sha256", "sha384" or "sha512"); hash codes are added to the GDS file if TRUE or if a digest algorithm is specified

parallel

FALSE (serial processing), TRUE (parallel processing), a numeric value indicating the number of cores, or a cluster object for parallel processing; parallel is passed to the argument cl in seqParallel, see seqParallel for more details

write.rsid

"annot_id": use the node "annotation/id" for the variant IDs; "chr_pos_ref_alt": use the format "chrom_position_ref_alt"; "auto": use "annotation/id" for the variant IDs if it is not a blank string or ".", otherwise use "chrom_position_ref_alt"

multi.row

if TRUE, a multiallelic site is converted to multiple rows in PLINK bim and bed files

verbose

if TRUE, show progress information
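
The chr.conv and write.rsid rules described above can be sketched in base R. This is an illustration only, not SeqArray's implementation: the helpers recode_plink_chr() and choose_variant_id() are hypothetical names introduced here, and the sketch assumes chromosome codes and variant IDs are handled as character strings.

```r
# Hypothetical helper (not part of SeqArray): recode PLINK numeric chromosome
# codes as described for chr.conv=TRUE (23 -> X, 24 -> Y, 25 -> XY, 26 -> MT).
recode_plink_chr <- function(chr) {
    chr <- as.character(chr)
    map <- c("23"="X", "24"="Y", "25"="XY", "26"="MT")
    hit <- chr %in% names(map)
    chr[hit] <- unname(map[chr[hit]])
    chr
}

# Hypothetical helper (not part of SeqArray): pick variant IDs following the
# write.rsid description. With "auto", an ID that is blank or "." falls back
# to the "chrom_position_ref_alt" form.
choose_variant_id <- function(annot_id, chrom, pos, ref, alt,
        write.rsid=c("auto", "annot_id", "chr_pos_ref_alt")) {
    write.rsid <- match.arg(write.rsid)
    fallback <- paste(chrom, pos, ref, alt, sep="_")
    switch(write.rsid,
        annot_id = annot_id,
        chr_pos_ref_alt = fallback,
        auto = ifelse(annot_id %in% c("", "."), fallback, annot_id))
}

chr <- recode_plink_chr(c(1, 22, 23, 24, 25, 26))
print(chr)
# "1" "22" "X" "Y" "XY" "MT"

ids <- choose_variant_id(
    annot_id = c("rs123", ".", ""),
    chrom = c("1", "1", "X"),
    pos = c(100L, 200L, 300L),
    ref = c("A", "G", "T"),
    alt = c("C", "T", "G"))
print(ids)
# "rs123" "1_200_G_T" "X_300_T_G"
```

Integer positions are used in the sketch so that paste() does not switch to scientific notation for large coordinates.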

Value

Returns the file name of the SeqArray file with an absolute path.

Author(s)

Xiuwen Zheng

See Also

seqSNP2GDS, seqVCF2GDS

Examples

library(SeqArray)
library(SNPRelate)   # provides the example PLINK files used below

# PLINK BED files
bed.fn <- system.file("extdata", "plinkhapmap.bed.gz", package="SNPRelate")
fam.fn <- system.file("extdata", "plinkhapmap.fam.gz", package="SNPRelate")
bim.fn <- system.file("extdata", "plinkhapmap.bim.gz", package="SNPRelate")

# convert bed to gds
seqBED2GDS(bed.fn, fam.fn, bim.fn, "tmp.gds")

seqSummary("tmp.gds")


# convert gds to bed
gdsfn <- seqExampleFileName("gds")
seqGDS2BED(gdsfn, "plink")


# remove the temporary files
unlink(c("tmp.gds", "plink.fam", "plink.bim", "plink.bed"), force=TRUE)
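
A further sketch exercising the non-default arguments listed under Arguments. It uses only the arguments shown in Usage, writes to temporary paths (out.gds and out.bed are names chosen here for illustration), and assumes the SeqArray and SNPRelate packages are installed; if either is missing, the conversion steps are skipped.

```r
# Temporary output locations, so nothing is left in the working directory
out.gds <- tempfile(fileext=".gds")
out.bed <- tempfile("plink")   # prefix only, PLINK extensions are appended

if (requireNamespace("SeqArray", quietly=TRUE) &&
        requireNamespace("SNPRelate", quietly=TRUE)) {
    bed.fn <- system.file("extdata", "plinkhapmap.bed.gz", package="SNPRelate")
    fam.fn <- system.file("extdata", "plinkhapmap.fam.gz", package="SNPRelate")
    bim.fn <- system.file("extdata", "plinkhapmap.bim.gz", package="SNPRelate")

    # keep the numeric chromosome codes and import only two FAM variables
    SeqArray::seqBED2GDS(bed.fn, fam.fn, bim.fn, out.gds,
        chr.conv=FALSE, include.pheno=c("sex", "phenotype"), verbose=FALSE)
    SeqArray::seqSummary(out.gds)

    # name variants as chrom_position_ref_alt and split multiallelic sites
    SeqArray::seqGDS2BED(SeqArray::seqExampleFileName("gds"), out.bed,
        write.rsid="chr_pos_ref_alt", multi.row=TRUE, verbose=FALSE)
}

# remove the temporary files
unlink(c(out.gds, paste0(out.bed, c(".fam", ".bim", ".bed"))), force=TRUE)
```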

zhengxwen/SeqArray documentation built on Jan. 10, 2025, 9:09 p.m.