Introduction

ImmPort study data is currently available for download in two formats: MySQL and TSV (Tab). The RImmPort workflow depends on the format:

1) MySQL-formatted study data: the user downloads one or more studies as MySQL zip files, unzips them, loads the data into a local MySQL database instance, connects to that database, sets the ImmPort data source to the connection handle, and invokes RImmPort functions.

2) Tab-formatted study data: the user downloads one or more studies in Tab format, passes the folder containing the zip files to an RImmPort function that builds a SQLite database, connects to that database, sets the ImmPort data source to the connection handle, and invokes RImmPort functions.

The user downloads the study data of interest from the ImmPort website ( http://www.immport.org ) **. Depending on the file format, MySQL or Tab, the data is loaded into a local MySQL or SQLite database, respectively. The user then installs the RImmPort package, loads the RImmPort library, connects to the ImmPort database, and calls RImmPort methods to load study data from the database into R. Please refer to RImmPort_Article.pdf for a detailed discussion of RImmPort.

** Users need to register on the ImmPort website to download the datasets.

Initial Steps

Load the RImmPort library

library(RImmPort)
library(DBI)
library(sqldf)
library(plyr)

Setup ImmPort data source that all RImmPort functions will use

Option 1: ImmPort MySQL database

Download zip files of ImmPort study data in MySQL format, e.g. 'SDY139' and 'SDY208'.

Load the data into a local MySQL database

Connect to the ImmPort MySQL database.

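# MySQL() is provided by the RMySQL package, which must be installed and loaded
library(RMySQL)
# provide appropriate connection parameters
mysql_conn <- dbConnect(MySQL(), user = "username", password = "password",
                        dbname = "database", host = "host")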

Set the data source as the ImmPort MySQL database.

setImmPortDataSource(mysql_conn)

Option 2: ImmPort SQLite database

Download zip files of ImmPort data in Tab format, e.g. 'SDY139' and 'SDY208'.

# get the directory where ImmPort sample data is stored in the directory structure of RImmPort package
studies_dir <- system.file("extdata", "ImmPortStudies", package = "RImmPort")

# set tab_dir to the folder where the zip files are located
tab_dir <- file.path(studies_dir, "Tab")
list.files(tab_dir)

Build a local SQLite ImmPort database instance.

# set db_dir to the folder where the database file 'ImmPort.sqlite' should be stored
db_dir <- file.path(studies_dir, "Db")
# build a new ImmPort SQLite database with the data in the downloaded zip files
buildNewSqliteDb(tab_dir, db_dir) 
list.files(db_dir)

Connect to the ImmPort SQLite database

# get the directory of a sample SQLite database that has been bundled into the RImmPort package
db_dir <- system.file("extdata", "ImmPortStudies", "Db", package = "RImmPort")

# connect to the private instance of the ImmPort database
sqlite_conn <- dbConnect(SQLite(), dbname=file.path(db_dir, "ImmPort.sqlite"))

Set the data source to the ImmPort SQLite DB

setImmPortDataSource(sqlite_conn)

NOTE: In the rest of this script, all RImmPort functions use the SQLite ImmPort database as the data source.
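
Before calling further RImmPort functions, it can help to confirm that the connection handle is usable. The following is a minimal sketch using only DBI and RSQLite calls. It uses an in-memory database and a made-up table name `study` as stand-ins, so substitute `sqlite_conn` (or `mysql_conn`) when running it against a real ImmPort database.

```r
library(DBI)
library(RSQLite)

# stand-in connection: an in-memory SQLite database (assumption for illustration);
# in the workflow above this would be sqlite_conn
demo_conn <- dbConnect(SQLite(), dbname = ":memory:")

# a toy table so there is something to list; the table name 'study' and its
# column are hypothetical and not taken from the ImmPort schema
dbWriteTable(demo_conn, "study",
             data.frame(study_accession = c("SDY139", "SDY208"),
                        stringsAsFactors = FALSE))

# check that the handle is still open and see which tables it exposes
conn_ok <- dbIsValid(demo_conn)
tables <- dbListTables(demo_conn)
conn_ok
tables

# close the connection once all queries are done
dbDisconnect(demo_conn)
```

Closing the connection with `dbDisconnect` belongs at the very end of a session, after the last RImmPort call that relies on the data source.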

Get all study ids

getListOfStudies()

Get all data of a specific study

The getStudy function queries the ImmPort database for the entire dataset of a specific study and instantiates the Study reference class with that data.

?Study

# load all the data of study: `SDY139`
study_id <- 'SDY139'
sdy139 <- getStudy(study_id)

# access Demographics data of SDY139
dm_df <- sdy139$special_purpose$dm_l$dm_df
head(dm_df)

# access Concomitant Medications data of SDY139
cm_df <- sdy139$interventions$cm_l$cm_df
head(cm_df)

# get Trial Title from Trial Summary
ts_df <- sdy139$trial_design$ts_l$ts_df
title <- ts_df$TSVAL[ts_df$TSPARMCD == "TITLE"]
title
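
The title lookup above is an ordinary logical-index filter on the Trial Summary data frame. A small sketch on a toy data frame shows the pattern. The rows below are invented for illustration, the helper `get_ts_value` is hypothetical, and only the column names `TSPARMCD` and `TSVAL` come from the code above.

```r
# toy Trial Summary table; all values are invented for illustration
toy_ts_df <- data.frame(
  TSPARMCD = c("TITLE", "TPHASE", "INDIC"),
  TSVAL    = c("Example study title", "Phase 1", "Example indication"),
  stringsAsFactors = FALSE
)

# hypothetical helper: return the value(s) stored for one parameter code
get_ts_value <- function(ts_df, parm_code) {
  ts_df$TSVAL[ts_df$TSPARMCD == parm_code]
}

toy_title <- get_ts_value(toy_ts_df, "TITLE")
toy_title

# a code that is not present yields a zero-length vector rather than an error
missing_val <- get_ts_value(toy_ts_df, "NOSUCHCODE")
length(missing_val)
```

Because an absent code returns an empty vector silently, checking `length()` on the result is a cheap way to notice a mistyped parameter code.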

Get the list of Domain names.

Note that some RImmPort functions take a domain name as input.

# get the list of names of all supported Domains
getListOfDomains()

?"Demographics Domain"

Get list of studies with specific domain data

The Domain name must exactly match an entry in the list of Domain names.

# get list of studies with Cellular Quantification data
domain_name <- "Cellular Quantification"
study_ids_l <- getStudiesWithSpecificDomainData(domain_name)
study_ids_l
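
Because the domain name has to match exactly, a small check before the call can catch typos early. The sketch below is a hypothetical helper, not part of RImmPort. It takes the valid names as a plain character vector, since the exact structure returned by `getListOfDomains()` is not shown here.

```r
# hypothetical helper (not an RImmPort function): stop early on an unknown name
check_domain_name <- function(domain_name, valid_names) {
  if (!(domain_name %in% valid_names)) {
    stop("Unknown domain name: '", domain_name, "'. Valid names: ",
         paste(valid_names, collapse = ", "))
  }
  invisible(domain_name)
}

# illustrative subset of names; in practice derive these from getListOfDomains()
valid_names <- c("Demographics", "Concomitant Medications",
                 "Cellular Quantification")

# an exact match passes silently
check_domain_name("Cellular Quantification", valid_names)

# matching is case-sensitive, so this variant is rejected
result <- tryCatch(
  check_domain_name("cellular quantification", valid_names),
  error = function(e) conditionMessage(e)
)
result
```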

Get specific domain data of one or more studies

The Domain name must exactly match an entry in the list of Domain names.

# get Cellular Quantification data of studies `SDY139` and `SDY208`

# get domain code of Cellular Quantification domain
domain_name <- "Cellular Quantification"
getDomainCode(domain_name)

study_ids <- c("SDY139", "SDY208")
domain_name <- "Cellular Quantification"
zb_l <- getDomainDataOfStudies(domain_name, study_ids)
if (length(zb_l) > 0) 
  names(zb_l)
head(zb_l$zb_df)
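
When data from several studies comes back in one data frame, it is often useful to see how many rows each study contributed, or to pull out a single study. The following sketch does this on a toy data frame. The column names `STUDYID` and `USUBJID` are assumptions made for illustration and should be checked against `names(zb_l$zb_df)`.

```r
# toy stand-in for a multi-study domain data frame; the column names are
# assumed for illustration and the values are invented
toy_zb_df <- data.frame(
  STUDYID = c("SDY139", "SDY139", "SDY208"),
  USUBJID = c("SUB_1", "SUB_2", "SUB_3"),
  stringsAsFactors = FALSE
)

# number of rows contributed by each study
rows_per_study <- table(toy_zb_df$STUDYID)
rows_per_study

# keep only the rows of one study
sdy208_df <- toy_zb_df[toy_zb_df$STUDYID == "SDY208", ]
nrow(sdy208_df)
```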

Get the list of assay types from ImmPort studies

getListOfAssayTypes()

Get specific assay data of one or more ImmPort studies

The assay type must exactly match an entry in the list of supported assay types.

# get 'ELISPOT' data of study `SDY139`
assay_type <- "ELISPOT"
study_id <- "SDY139"
elispot_l <- getAssayDataOfStudies(study_id, assay_type)
if (length(elispot_l) > 0)
  names(elispot_l)
head(elispot_l$zb_df)

Serialize RImmPort-formatted study data as .rds files

# serialize all of the data of studies `SDY139` and `SDY208`
study_ids <- c('SDY139', 'SDY208')

# the folder where the .rds files will be stored
rds_dir <- file.path(studies_dir, "Rds")

serialzeStudyData(study_ids, rds_dir)
list.files(rds_dir)

Load the serialized data (.rds) files of a specific domain of a study from the directory where the files are located

# get the directory where ImmPort sample data is stored in the directory structure of RImmPort package
studies_dir <- system.file("extdata", "ImmPortStudies", package = "RImmPort")

# the folder where the .rds files are stored
rds_dir <- file.path(studies_dir, "Rds")

# list the studies that have been serialized
list.files(rds_dir)

# load the serialized data of study `SDY208` 
study_id <- 'SDY208'
dm_l <- loadSerializedStudyData(rds_dir, study_id, "Demographics")
head(dm_l[[1]])
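
The `.rds` format is R's native single-object serialization, so the idea behind these files can be illustrated with base R alone. The following sketch round-trips a toy data frame through a temporary file with `saveRDS` and `readRDS`. It does not assume anything about how RImmPort names or lays out its own `.rds` files, and the column names and values are invented.

```r
# toy data frame standing in for one domain's data; values are invented
toy_dm_df <- data.frame(
  USUBJID = c("SUB_1", "SUB_2"),
  AGE     = c(34, 51),
  stringsAsFactors = FALSE
)

# write the object to a temporary .rds file and read it back
rds_file <- file.path(tempdir(), "toy_dm.rds")
saveRDS(toy_dm_df, rds_file)
restored_df <- readRDS(rds_file)

# the restored object should match the original
round_trip_ok <- identical(toy_dm_df, restored_df)
round_trip_ok

# remove the temporary file
unlink(rds_file)
```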


hzc363/RImmPort documentation built on May 17, 2019, 7:06 p.m.