DataFrame-class | R Documentation |
The DataFrame
class extends the RectangularData virtual
class supports the storage of any type of object (with length
and [
methods) as columns.
On the whole, the DataFrame
behaves very similarly to
data.frame
, in terms of construction, subsetting, splitting,
combining, etc. The most notable exceptions have to do with handling
of the row names:
The row names are optional. This means calling rownames(x)
will return NULL
if there are no row names. Of course, it
could return seq_len(nrow(x))
, but returning NULL
informs, for example, combination functions that no row names are
desired (they are often a luxury when dealing with large data).
The row names are not required to be unique.
Subsetting by row names does not use partial matching.
As DataFrame
derives from Vector
, it is
possible to set an annotation
string. Also, another
DataFrame
can hold metadata on the columns.
For a class to be supported as a column, it must have length
and [
methods, where [
supports subsetting only by
i
and respects drop=FALSE
. Optionally, a method may be
defined for the showAsCell
generic, which should return a
vector of the same length as the subset of the column passed to
it. This vector is then placed into a data.frame
and converted
to text with format
. Thus, each element of the vector should be
some simple, usually character, representation of the corresponding
element in the column.
DataFrame(..., row.names = NULL, check.names = TRUE,
stringsAsFactors)
:Constructs a DataFrame
in similar fashion to
data.frame
. Each argument in ...
is coerced to
a DataFrame
and combined column-wise.
The row names should be given in
row.names
; otherwise, they are inherited from the
arguments, as in data.frame
. Explicitly passing
NULL
to row.names
ensures that there are no rownames.
If check.names
is TRUE
, the column names will
be checked for syntactic validity and made unique, if necessary.
To store an object of a class that does not support coercion to
DataFrame
, wrap it in I()
. The class must still have
methods for length
and [
.
The stringsAsFactors
argument is ignored. The coercion of
column arguments to DataFrame determines whether strings
become factors.
make_zero_col_DFrame(nrow)
:Constructs a zero-column DFrame object with nrow
rows.
Intended for developers to use in other packages and typically
not needed by the end user.
In the following code snippets, x
is a DataFrame
.
dim(x)
:Get the length two integer vector indicating in the first and second element the number of rows and columns, respectively.
dimnames(x)
, dimnames(x) <- value
:Get and set the two element list containing the row names
(character vector of length nrow(x)
or NULL
)
and the column names (character vector of length ncol(x)
).
as(from, "DataFrame")
:By default, constructs a new DataFrame
with from
as
its only column. If from
is a matrix
or
data.frame
, all of its columns become columns in the new
DataFrame
. If from
is a list, each element becomes a
column, recycling as necessary. Note that for the DataFrame
to behave correctly, each column object must support element-wise
subsetting via the [
method and return the number of elements with
length
. It is recommended to use the DataFrame
constructor, rather than this interface.
as.list(x)
: Coerces x
, a DataFrame
,
to a list
.
as.data.frame(x, row.names=NULL, optional=FALSE,
make.names=TRUE)
:Coerces x
, a DataFrame
, to a data.frame
.
Each column is coerced to a data.frame
and then column
bound together. If row.names
is NULL
, they are
propagated from x
, if it has any. Otherwise, they are
inferred by the data.frame
constructor.
Like the as.data.frame()
method for class matrix
,
the method for class DataFrame
supports the make.names
argument. make.names
can be set to TRUE
or FALSE
to indicate what should happen if the row names of x
(or the
row names supplied via the row.names
argument) are invalid
(e.g. contain duplicates). If they are invalid, and make.names
is TRUE
(the default), they get "fixed" by going thru
make.names(*, unique=TRUE)
. Otherwise (i.e. if make.names
is FALSE
), an error is raised. Note that unlike the method
for class matrix
, make.names=NA
is not supported.
NOTE: Conversion of x
to a data.frame
is not
supported if x
contains any list
, SimpleList
,
or CompressedList
columns.
as(from, "data.frame")
: Coerces a DataFrame
to a data.frame
by calling as.data.frame(from)
.
as.matrix(x)
: Coerces the DataFrame
to a
matrix
, if possible.
as.env(x, enclos = parent.frame())
:Creates an environment from x
with a symbol for each
colnames(x)
. The values are not actually copied into the
environment. Rather, they are dynamically bound using
makeActiveBinding
. This prevents unnecessary copying
of the data from the external vectors into R vectors. The values
are cached, so that the data is not copied every time the symbol
is accessed.
In the following code snippets, x
is a DataFrame
.
x[i,j,drop]
: Behaves very similarly to the
[.data.frame
method, except i
can be a
logical Rle
object and subsetting by matrix
indices
is not supported. Indices containing NA
's are also not
supported.
x[i,j] <- value
: Behaves very similarly to the
[<-.data.frame
method.
x[[i]]
: Behaves very similarly to the
[[.data.frame
method, except arguments j
and exact
are not supported. Column name matching is
always exact. Subsetting by matrices is not supported.
x[[i]] <- value
: Behaves very similarly to the
[[<-.data.frame
method, except argument j
is not supported.
The show()
method for DataFrame objects obeys global options
showHeadLines
and showTailLines
for controlling the number
of head and tail rows to display.
See ?get_showHeadLines
for more information.
Michael Lawrence
DataFrame-combine for combining DataFrame objects.
DataFrame-utils for other common operations on DataFrame objects.
TransposedDataFrame objects.
RectangularData and SimpleList which DataFrame extends directly.
get_showHeadLines
for controlling the number of
DataFrame rows to display.
score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) #with row names
rownames(df)
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]
# arrays
ary <- array(1:4, c(2,1,2))
sw <- DataFrame(I(ary))
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss
# subsetting
sw[] # identity subset
sw[,] # same
sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows
## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]
## select rows and columns
sw[4:5, 1:3]
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # a (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]
sw[["Fert"]] # should return 'NULL'
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list
## duplicate row, unique row names are created
sw[c(1, 1:2),]
## indexing by row names
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # no partial match (unlike with data.frame)
## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq(nrow(sw))
rownames(sw) <- rn
rownames(sw)
## column replacement
df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.