SQLDataFrame | R Documentation |
Create a SQL-backed DataFrame, where the data are kept on disk until requested. Direct extension classes are SQLiteDataFrame and DuckDBDataFrame.
SQLDataFrame(path, dbtype = NULL, table = NULL, columns = NULL, nrows = NULL)
path |
String containing a path to a SQL file. |
dbtype |
String containing the SQL database type (case insensitive). Supported types are "SQLite" and "DuckDB". |
table |
String containing the name of the SQL table. |
columns |
Character vector containing the names of columns in
a SQL table. If |
nrows |
Integer scalar specifying the number of rows in a SQL
table. If |
The SQLDataFrame is essentially just a DataFrame of SQLColumnVector objects. It is primarily useful for indicating that the in-memory representation is consistent with the underlying SQL file (e.g., no delayed filter/mutate operations have been applied, no data has been added from other files). Thus, users can specialize code paths for a SQLDataFrame to operate directly on the underlying SQL table.
In that vein, operations on a SQLDataFrame may return another SQLDataFrame if the operation does not introduce inconsistencies with the file-backed data. For example, slicing or combining by column will return a SQLDataFrame as the contents of the retained columns are unchanged. In other cases, the SQLDataFrame will collapse to a regular DFrame of SQLColumnVector objects before applying the operation; these are still file-backed but lack the guarantee of file consistency.
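As a rough illustration of specializing a code path, the sketch below computes a column mean inside the database when given a SQLite-backed SQLDataFrame and falls back to an in-memory computation otherwise. The helper column_mean() is hypothetical and not part of the package; it assumes the path(), dbtype() and sqltable() accessors behave as in the examples, and that the requested column is numeric.

```r
# Hypothetical helper (not part of the package): mean of a single column.
column_mean <- function(x, column) {
    if (methods::is(x, "SQLDataFrame") && tolower(dbtype(x)) == "sqlite") {
        # Specialized path: the object is consistent with the file,
        # so the aggregation can run inside the database.
        con <- DBI::dbConnect(RSQLite::SQLite(), path(x))
        on.exit(DBI::dbDisconnect(con))
        query <- paste0(
            "SELECT AVG(", as.character(DBI::dbQuoteIdentifier(con, column)),
            ") AS m FROM ", as.character(DBI::dbQuoteIdentifier(con, sqltable(x)))
        )
        DBI::dbGetQuery(con, query)$m
    } else {
        # Generic path: realize the data in memory first.
        mean(as.data.frame(x)[[column]])
    }
}
```

Calling column_mean(df, "mpg") on the SQLite-backed object from the examples would take the first branch, while an ordinary data.frame or a collapsed DFrame would take the second.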
A SQLDataFrame where each column is a SQLColumnVector.
Qian Liu
## Mocking up a file:
### SQLite
tf <- tempfile()
on.exit(unlink(tf))
con <- DBI::dbConnect(RSQLite::SQLite(), tf)
DBI::dbWriteTable(con, "mtcars", mtcars)
DBI::dbDisconnect(con)
### DuckDB
tf1 <- tempfile()
on.exit(unlink(tf1), add = TRUE)
con <- DBI::dbConnect(duckdb::duckdb(), tf1)
DBI::dbWriteTable(con, "mtcars", mtcars)
DBI::dbDisconnect(con)
## Creating a SQLite-backed data frame:
df <- SQLDataFrame(tf, dbtype = "SQLite", table = "mtcars")
df1 <- SQLiteDataFrame(tf, "mtcars")
identical(df, df1)
## DuckDB-backed data frame:
df2 <- SQLDataFrame(tf1, dbtype = "duckdb", table = "mtcars")
df3 <- DuckDBDataFrame(tf1, "mtcars")
identical(df2, df3)
## Extraction yields a SQLiteColumnVector:
df$carb
## Some operations preserve the SQLDataFrame:
df[,1:5]
combined <- cbind(df, df)
class(combined)
## ... but most operations collapse to a regular DFrame:
df[1:5,]
combined2 <- cbind(df, some_new_name=df[,1])
class(combined2)
df1 <- df
rownames(df1) <- paste0("row", seq_len(nrow(df1)))
class(df1)
df2 <- df
colnames(df2) <- letters[seq_len(ncol(df2))]
class(df2)
df3 <- df
df3$carb <- mtcars$carb
class(df3)
## Utility functions
path(df)
dbtype(df)
sqltable(df)
dim(df)
names(df)
as.data.frame(df)
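The column names and row count reported by names() and dim() can also be read directly from the underlying table. The following is a minimal sketch using only DBI and RSQLite on a freshly mocked-up file; the variable names cols and n are illustrative, and the result is the kind of information that the columns and nrows arguments describe.

```r
# Self-contained mock-up, mirroring the SQLite example above.
tf <- tempfile(fileext = ".sqlite")
con <- DBI::dbConnect(RSQLite::SQLite(), tf)
DBI::dbWriteTable(con, "mtcars", mtcars)

# Column names and row count, read straight from the table.
cols <- DBI::dbListFields(con, "mtcars")
n <- DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM mtcars")$n

# Clean up the connection and the temporary file.
DBI::dbDisconnect(con)
unlink(tf)
```

Values obtained this way could be supplied as columns = cols and nrows = n when calling SQLDataFrame().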