writeVcf-methods: Write VCF files

writeVcf    R Documentation

Write VCF files

Description

Write Variant Call Format (VCF) files to disk.

Usage

## S4 method for signature 'VCF,character'
writeVcf(obj, filename, index = FALSE, ...)
## S4 method for signature 'VCF,connection'
writeVcf(obj, filename, index = FALSE, ...)

Arguments

obj

Object containing the data to be written out. At present, only VCF objects are accepted.

filename

The character() name of the VCF file, or a connection (e.g., file()), to be written out. A connection opened with open = "a" will have header information written only if the file does not already exist.

index

Logical indicating whether to bgzip the output file and generate a tabix index. Defaults to FALSE.

...

Additional arguments, passed to methods.

  • nchunk: Integer or NA. When provided, this argument overrides the default chunking behavior of writeVcf (see the Details section). An integer value specifies the number of records in each chunk; NA disables chunking.
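The effect of index = TRUE can be explored with a minimal sketch like the one below. It assumes the ex2.vcf file shipped with VariantAnnotation, and the names vcf, out, res, and created are illustrative only. This page does not state what the compressed file and its index are called, so the sketch lists the files that appear rather than assuming their names.

```r
library(VariantAnnotation)

fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

## index = TRUE: bgzip the output and generate a tabix index
out <- tempfile(fileext = ".vcf")
res <- writeVcf(vcf, out, index = TRUE)

## List the files created alongside 'out'; names are inspected, not assumed
created <- grep(basename(out), list.files(dirname(out)),
                fixed = TRUE, value = TRUE)
created
```

Tabix indexing generally expects records sorted by position, so an unsorted VCF object may need to be sorted before it is written with index = TRUE.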

Details

A VCF file can be written out from data in a VCF object. More general methods to write out from other objects may be added in the future.

writeVcf writes out the header fields in a VCF object 'as-is' with the exception of these key-value pairs:

  • fileformat: When missing, a line is added at the top of the file with the currently supported version. VariantAnnotation >= 1.27.6 supports VCFv4.3.

  • fileDate: When missing, a line is added with today's date. If the key-value pair exists, the date is overwritten with today's date.

  • contig: When missing, VariantAnnotation attempts to use the Seqinfo of the VCF object to determine the contig information.
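One way to see how these header fields end up in the output is to read the written file back as plain text. The following is a minimal sketch, assuming the ex2.vcf file shipped with VariantAnnotation; vcf, out, and hdr are illustrative names.

```r
library(VariantAnnotation)

fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

out <- tempfile()
writeVcf(vcf, out)

## Meta-information (header) lines in a VCF file start with "##"
hdr <- grep("^##", readLines(out), value = TRUE)

## Show the fileformat and fileDate lines as written by writeVcf
grep("^##(fileformat|fileDate)", hdr, value = TRUE)

## Show any contig lines present in the written header
grep("^##contig", hdr, value = TRUE)
```

Comparing these lines with the header of the input file shows which key-value pairs were carried over unchanged and which were added or rewritten.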

Large VCF files (i.e., > 1e5 records) are written out in chunks; VCF files with < 1e5 records are not chunked. The optimal number of records per chunk depends on both the number of records and the complexity of the data. Currently, writeVcf determines the records per chunk from the total number of records only. To override this behavior or experiment with other values, supply nchunk as an integer or NA. An integer value sets the number of records per chunk regardless of the size of the VCF; NA disables all chunking.

  • writeVcf(vcf, tempfile()) ## default chunking

  • writeVcf(vcf, tempfile(), nchunk = 1e6) ## chunk by 1e6

  • writeVcf(vcf, tempfile(), nchunk = NA) ## no chunking

Value

VCF file

Note

VariantAnnotation >= 1.27.6 supports VCFv4.3. See the NOTE on the ?VCFHeader man page under the meta() extractor for a description of how header parsing has changed to accommodate the new header lines with a key name of 'META'.

Author(s)

Valerie Obenchain and Michael Lawrence

References

http://vcftools.sourceforge.net/specs.html outlines the VCF specification.

http://samtools.sourceforge.net/mpileup.shtml contains information on the portion of the specification implemented by bcftools.

http://samtools.sourceforge.net/ provides information on samtools.

See Also

readVcf

Examples

  library(VariantAnnotation)

  fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")

  ## round trip: read, write, and re-read
  out1.vcf <- tempfile()
  out2.vcf <- tempfile()
  in1 <- readVcf(fl, "hg19")
  writeVcf(in1, out1.vcf)
  in2 <- readVcf(out1.vcf, "hg19")
  writeVcf(in2, out2.vcf)
  in3 <- readVcf(out2.vcf, "hg19")
  stopifnot(all(in2 == in3))

  ## write incrementally
  out3.vcf <- tempfile()
  con <- file(out3.vcf, open="a")
  writeVcf(in1[1:2,], con)
  writeVcf(in1[-(1:2),], con)
  close(con)
  readVcf(out3.vcf, "hg19")

Bioconductor/VariantAnnotation documentation built on Jan. 9, 2025, 12:03 a.m.