Reads SNP data organized in free format as one call per line. Other
than the one-call-per-line requirement, there is considerable
flexibility: multiple input files can be read, the input fields can be
in any order on the line, and irrelevant fields can be skipped. The
samples and SNPs to be read must be pre-specified; they define the rows
and columns of an output object of class "SnpMatrix". This function has
been replaced in versions 1.3 and later by the more flexible function
read.long.
files |
A character vector giving the names of the input files |
sample.id |
A character vector giving the identifiers of the samples to be read |
snp.id |
A character vector giving the names of the SNPs to be read |
diploid |
A logical array of the same length as
|
fields |
An integer vector with named elements specifying the
positions of the required fields in the input record. The fields are
identified by the names |
codes |
Either the single string |
threshold |
A numerical value for the calling threshold on the confidence score |
lower |
If |
sep |
The delimiting character separating fields in the input record |
comment |
A character denoting that any remaining input on a line is to be ignored |
skip |
An integer value specifying how many lines are to be skipped at the beginning of each data file |
simplify |
If |
verbose |
If |
in.order |
If |
every |
See |
If nucleotide coding is not used, the codes argument should be a
character array giving the valid codes. For genotype coding of
autosomal SNPs, this should be an array of length 3 giving the codes
for the three genotypes, in the order homozygous (AA), heterozygous
(AB), homozygous (BB). All other codes will be treated as "no call".
The default codes are "0", "1", "2". For X SNPs, males are assumed to
be coded as homozygous, unless an additional two codes are supplied
(representing the AY and BY genotypes). For allele coding, the codes
array should be of length 2 and should specify the codes for the two
alleles. Again, any other code is treated as "missing" and, for X SNPs,
males should be coded either as homozygous or by omission of the second
allele.
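The genotype-coding rule above can be illustrated in base R. This is only a sketch of the mapping (codes taken in the order AA, AB, BB, everything else a no call), not the package's own code; the `raw` vector is made-up example input.

```r
# Valid genotype codes, in the order homozygous AA, heterozygous AB, homozygous BB
codes <- c("0", "1", "2")

# Made-up calls as they might appear in the genotype field of an input file
raw <- c("0", "2", "1", "N", "")

# match() returns NA for anything not listed in codes, i.e. a "no call"
geno <- c("AA", "AB", "BB")[match(raw, codes)]
print(geno)
```

Any code not listed in `codes` (here "N" and the empty string) ends up as `NA`.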
For nucleotide coding, nucleotides are assigned to the nominal alleles in alphabetic order. Thus, for a SNP with "T" and "A" nucleotides in the variant position, the nominal genotypes AA, AB and BB will refer to A/A, A/T and T/T respectively.
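As a rough illustration of the alphabetic assignment, the base R sketch below maps a two-letter nucleotide call to a nominal genotype. The helper name `nominal_genotype` is hypothetical and is not part of the package; it only mirrors the rule stated above.

```r
# Hypothetical helper: map a two-letter nucleotide call to a nominal genotype.
# The alphabetically first nucleotide becomes allele "A", the second allele "B".
nominal_genotype <- function(call, nucleotides) {
  alleles <- sort(unique(nucleotides))
  stopifnot(length(alleles) == 2)
  obs <- strsplit(call, "")[[1]]
  # Anything that is not two known nucleotides is treated as missing
  if (length(obs) != 2 || !all(obs %in% alleles)) return(NA_character_)
  paste(sort(c("A", "B")[match(obs, alleles)]), collapse = "")
}

nominal_genotype("TA", c("T", "A"))  # heterozygous A/T
nominal_genotype("TT", c("T", "A"))  # homozygous T/T
```

Under these assumptions the heterozygous call is reported as "AB" whichever way round the nucleotides are written, and T/T is reported as "BB".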
Although the function allows for reading into an object of class
"XSnpMatrix" directly, it is usually preferable to read such data as a
"SnpMatrix" (i.e. as autosomal) and to coerce it to an object of class
"XSnpMatrix" later using as(..., "XSnpMatrix") or
new("XSnpMatrix", ..., diploid=...). If diploid is coded NA for any
subject, the latter course must be followed, since NAs are not accepted
in the diploid argument.
If the in.order argument is set to TRUE, then the vectors sample.id and
snp.id must be in the same order as they vary in the input file(s), and
this ordering must be consistent. However, there is no requirement that
either SNP or sample should vary fastest, as this is detected from the
input. If in.order is FALSE, then no assumptions are made about the
ordering of the input file, and SNP and sample identifiers are looked
up in hash tables as they are read. This option must be expected,
therefore, to be somewhat slower.
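The lookup-based approach described for in.order = FALSE can be sketched in base R as follows. This is an illustration of the idea only, not the package's implementation: the record layout (sample, SNP, genotype code separated by spaces) and all identifiers are made up, and match() stands in for the hash-table lookup.

```r
sample.id <- c("s1", "s2")
snp.id <- c("rs1", "rs2", "rs3")

# Made-up long-format records in arbitrary order: sample, SNP, genotype code
records <- c("s2 rs3 2", "s1 rs1 0", "s2 rs1 1", "s9 rs1 2", "s1 rs3 1")

# Output matrix: one row per sample, one column per SNP, NA meaning "no call"
calls <- matrix(NA_integer_, nrow = length(sample.id), ncol = length(snp.id),
                dimnames = list(sample.id, snp.id))

for (rec in records) {
  f <- strsplit(rec, " ", fixed = TRUE)[[1]]
  i <- match(f[1], sample.id)  # look up the row for this sample
  j <- match(f[2], snp.id)     # look up the column for this SNP
  if (is.na(i) || is.na(j)) next  # record was not requested, so skip it
  calls[i, j] <- as.integer(f[3])
}
print(calls)
```

Because every record is located by identifier, the input order does not matter; the cost is one lookup per field per line, which is where the slowdown comes from.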
Each file may represent a separate sample or SNP, in which case the
appropriate .id argument can be omitted; row or column names are then
taken from the file names.
An object of class "SnpMatrix" or "XSnpMatrix".
The function will read gzipped files.
If in.order is TRUE, every combination of sample and SNP listed in the
sample.id and snp.id arguments must be present in the input file(s).
Otherwise the function will search for any missing observation until
reaching the end of the data, ignoring everything else on the way.
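One way to guard against this is to check the input for completeness before reading with in.order = TRUE. The base R sketch below assumes the sample and SNP columns have already been extracted into character vectors; the helper name `missing_pairs` is hypothetical and not part of the package.

```r
# Hypothetical helper: list requested sample/SNP combinations absent from the input
missing_pairs <- function(sample, snp, sample.id, snp.id) {
  wanted <- expand.grid(sample = sample.id, snp = snp.id,
                        stringsAsFactors = FALSE)
  seen <- paste(sample, snp, sep = "\t")
  absent <- !(paste(wanted$sample, wanted$snp, sep = "\t") %in% seen)
  wanted[absent, , drop = FALSE]
}

# Made-up input in which the call for sample s2 at SNP rs2 is absent
gaps <- missing_pairs(sample = c("s1", "s1", "s2"),
                      snp = c("rs1", "rs2", "rs1"),
                      sample.id = c("s1", "s2"),
                      snp.id = c("rs1", "rs2"))
print(gaps)
```

An empty result suggests every requested combination is present; any rows returned are the observations the reader would otherwise scan for until the end of the data.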
David Clayton dc208@cam.ac.uk
read.plink, SnpMatrix-class, XSnpMatrix-class