gbsrGDS2CSV: Write a CSV file based on data in a GDS file

gbsrGDS2CSV R Documentation

Write a CSV file based on data in a GDS file

Description

Write out a CSV file with raw, filtered, corrected genotype data or estimated haplotype data stored in a GDS file.

Usage

gbsrGDS2CSV(
  object,
  out_fn,
  node = "raw",
  incl_parents = TRUE,
  bp2cm = NULL,
  format = "",
  read = FALSE,
  ...
)

## S4 method for signature 'GbsrGenotypeData'
gbsrGDS2CSV(object, out_fn, node, incl_parents, bp2cm, format, read)

Arguments

object

A GbsrGenotypeData object.

out_fn

A string to specify the path to an output CSV file.

node

One of "raw", "filt", "cor", "hap", or "dosage" to output raw genotype data, filtered genotype data, corrected genotype data, estimated haplotype data, or estimated allele dosage data, respectively.

incl_parents

A logical value to specify whether parental samples should be included in the output CSV file or not.

bp2cm

A numeric value to convert positions in base pairs (bp) to centimorgans (cM). The position values are multiplied by the value specified here. The default is NULL, in which case bp2cm = 4e-06 is set internally when format = "qtl". For any other format, bp2cm defaults to 1.

format

A string to indicate the output format. See details.

read

A logical value to indicate whether read counts should be output with genotype data or not. See details.

...

Unused.
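The bp2cm argument described above acts as a plain scaling factor. The following base R sketch illustrates the arithmetic only; the marker positions are hypothetical values chosen for illustration, and the code does not call gbsrGDS2CSV itself.

```r
# Hypothetical marker positions in base pairs (illustrative values only).
pos_bp <- c(250000, 1000000, 2500000)

# Scaling factor that is set internally when format = "qtl" and bp2cm = NULL.
bp2cm <- 4e-06

# The documented conversion multiplies the position values by bp2cm.
pos_cm <- pos_bp * bp2cm
print(pos_cm)
```

With this factor, 1,000,000 bp corresponds to 4 cM. Pass a different bp2cm if your population has a different average recombination rate.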

Details

Create a CSV file at the location specified by out_fn.

Setting format = "qtl" makes the function export the data in the R/qtl format, which can be loaded with read.cross using format = "csvs" together with a phenotype data file. If you have executed estGeno() and your population is biparental, set node = "dosage" to export an R/qtl format CSV in which homozygotes for the alleles of Parent 1 and Parent 2, as specified by setParents(), are represented by A and B, respectively. If node = "raw", node = "filt", or node = "cor", A and B in the R/qtl format CSV indicate homozygotes for the reference and alternative alleles shown in the given VCF file. This means that if Parent 1 is homozygous for the alternative allele at Marker 1 and Offspring 1 has the same genotype as Parent 1, the genotype of Offspring 1 at Marker 1 will be B in the R/qtl format CSV. If you set node = "dosage" instead, the genotype of Offspring 1 at Marker 1 will be A.

The output CSV file has rows indicating the chromosome IDs and positions of markers, followed by rows indicating the genotype or haplotype data of samples. If read = TRUE, each genotype call is output in the form GT:ADR,ADA, where GT, ADR, and ADA represent the genotype, reference read count, and alternative read count, respectively. If format = "qtl", read = TRUE is ignored.
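When read = TRUE, downstream code may need to split each cell back into its genotype and read counts. The base R sketch below shows one way to do so. The cell values and the helper name parse_call are hypothetical, and the way GT is encoded here is an assumption rather than something taken from actual output of gbsrGDS2CSV.

```r
# Hypothetical cell values in the GT:ADR,ADA layout (not real output;
# the GT encoding shown here is assumed for illustration).
calls <- c("0:12,0", "1:7,5", "2:0,9")

# Hypothetical helper: split one cell into genotype and read counts.
parse_call <- function(x) {
  # Separate the genotype from the read-count pair.
  gt_ad <- strsplit(x, ":", fixed = TRUE)[[1]]
  # Separate the reference and alternative read counts.
  ad <- as.integer(strsplit(gt_ad[2], ",", fixed = TRUE)[[1]])
  data.frame(GT = gt_ad[1], ADR = ad[1], ADA = ad[2],
             stringsAsFactors = FALSE)
}

# Parse every cell and stack the results into one data frame.
parsed <- do.call(rbind, lapply(calls, parse_call))
print(parsed)
```

The sketch keeps GT as a character string so that it makes no assumption about how genotypes are coded. Missing calls are not handled here and would need a check suited to the actual file contents.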

Value

The path to the CSV file.

Examples

# Load data in the GDS file and instantiate a [GbsrGenotypeData] object.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)

# Create a CSV file with data from the GDS file
# connected to the [GbsrGenotypeData] object.
out_fn <- tempfile("sample_out", fileext = ".csv")
gbsrGDS2CSV(gds, out_fn)

# Close the connection to the GDS file.
closeGDS(gds)


tomoyukif/GBScleanR documentation built on Oct. 31, 2024, 2:43 a.m.