knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
date: "last edit: 9/30/2019"
Load packages into R session. It will automatically load the package
of dplyr
and dbplyr
.
library(SQLDataFrame) library(DBI)
dbfile <- system.file("extdata/test.db", package = "SQLDataFrame") conn <- DBI::dbConnect(dbDriver("SQLite"), dbname = dbfile) obj <- SQLDataFrame( conn = conn, dbtable = "state", dbkey = "state") obj
To make the SQLDataFrame
object as light and compact as possible,
there are only 5 slots contained in the object: tblData
, dbkey
,
dbnrows
, dbconcatKey
, indexes
. Metadata information could be
returned through these 5 slots using slot accessors or other utility
functions.
slotNames(obj) dbtable(obj) dbkey(obj)
tblData
slotThe tblData
slot saves the dbplyr::tbl_dbi
version of the database
table, which is a light-weight representation of the database table in
R. Of note is that this lazy tbl only contains unique rows. It could
also be sorted by the dbkey(obj)
if the SQLDataFrame
object was
generated from union
or rbind
. So when the saveSQLDataFrame()
function was called, a database table will be written into a physical
disk space and have the unique records.
Accessor function is made avaible for this slot:
tblData(obj)
dbnrows
and dbconcatKey
The dbnrows
slot saves the number of rows corresponding to the
tblData
, and dbconcatKey
saves the realized (concatenated if
multiple) key columns corresponding to the tblData
. Accessor
functions are also available for these 2 slots:
dbnrows(obj) dbconcatKey(obj)
indexes
slotThe indexes
slots is an unnamed list saving the row and column
indexes respectively corresponding to the tblData
slot, so that the
SQLDataFrame
could possibly have duplicate rows or only a subset of
data records from the tblData
, while the tblData
slot doesn't need
to be changed. To be consistent, the slots of dbnrows
and
dbconcatKey
will also remain unchanged.
obj@indexes obj_sub <- obj[sample(5, 3, replace = TRUE), 2:3] obj_sub obj_sub@indexes identical(tblData(obj), tblData(obj_sub))
With a filter
or select
function (which is similar to [i, ]
subsetting), only the indexes
slot will be updated for the row or
column index pointing to the tblData
.
obj_filter <- obj %>% filter(division == "South Atlantic" & size == "medium") obj_filter@indexes identical(tblData(obj), tblData(obj_filter)) obj_select <- obj %>% select(division, size) obj_select@indexes identical(tblData(obj), tblData(obj_select))
The ROWNAMES,SQLDataFrame
method was defined to return the
(concatenated if multiple) key column(s) value, so that the row
subsetting with character vector works for the SQLDataFrame
objects.
rnms <- ROWNAMES(obj) obj[sample(rnms, 3), ]
For SQLDataFrame
object with composite keys:
obj1 <- SQLDataFrame(conn = conn, dbtable = "state", dbkey = c("region", "population")) ROWNAMES(obj1[1:10,]) obj1[c("South:3615.0", "West:365.0"), ]
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.