The `muscData` package contains a set of publicly available single-cell RNA sequencing (scRNA-seq) datasets with complex experimental designs, i.e., datasets that contain multiple samples (e.g., individuals) measured across multiple experimental conditions (e.g., treatments), formatted into `SingleCellExperiment` (SCE) Bioconductor objects. Data objects are hosted through Bioconductor's ExperimentHub web resource.

The table below gives an overview of currently available datasets, including a unique identifier (ID) that can be used to load the data (see next section), a brief description, the original data source, and a reference. Dataset descriptions may also be viewed from within R via `?ID` (e.g., `?Kang18_8vs8`).

ID | Description | Availability | Reference
---|-------------|--------------|----------
`Kang18_8vs8` | 10x droplet-based scRNA-seq PBMC data from 8 Lupus patients before and after 6h-treatment with INF-beta (16 samples in total) | Gene Expression Omnibus (GEO) accession GSE96583 | @Kang2018
`Crowell19_4vs4` | Single-nuclei RNA-seq data of 8 CD-1 male mice, split into 2 groups with 4 animals each: vehicle and peripherally lipopolysaccharide (LPS) treated mice | Figshare DOI:10.6084/m9.figshare.8976473.v1 | @Crowell2019-muscat

All datasets available within `muscData` may be loaded either via named functions that directly refer to the object names, or by using the `ExperimentHub` interface. Both methods are demonstrated below.
The datasets listed above may be loaded into R by their ID. All provided SCEs contain unfiltered raw counts in their `assay` slot, and any available gene and cell metadata in the `rowData` and `colData` slots, respectively.
```r
library(muscData)
Kang18_8vs8()
```
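To check what a loaded object contains, the slots mentioned above can be inspected with the standard accessors. The following is a minimal sketch. It assumes that the `SingleCellExperiment` package is installed and that the counts are stored as the first assay; the assay is accessed by position so that no particular assay name is assumed.

```r
library(muscData)
library(SingleCellExperiment)

# load the dataset via its accessor function (downloaded on first use)
sce <- Kang18_8vs8()

# number of genes (rows) and cells (columns)
dim(sce)

# names of the available assays
assayNames(sce)

# peek at the raw counts; accessed by position rather than by name
assay(sce, 1)[1:5, 1:5]

# gene- and cell-level metadata
head(rowData(sce))
head(colData(sce))
```

Since the counts are unfiltered, any quality control (e.g., removal of low-quality cells) is left to the user.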
ExperimentHub
Besides using an accession function as demonstrated above, we can browse ExperimentHub records (using `query`) or package-specific records (using `listResources`), and then load the data of interest. The key difference between these approaches is that `query` searches all of ExperimentHub, while `listResources` facilitates data discovery within the specified package (here, `muscData`).
query
We first initialize a Hub instance with the `ExperimentHub` function, which lets us search for and load available data, and store the complete list of records (>2000) in the variable `eh`. Using `query`, we then identify any records made available by `muscData`, as well as their accession IDs (of the form EH1234). Finally, we can load the data into R via `eh[[id]]`.
```r
# create Hub instance
library(ExperimentHub)
eh <- ExperimentHub()
(q <- query(eh, "muscData"))
```

```r
# load data via accession ID
eh[["EH2259"]]
```
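Rather than copying an accession ID by hand, the ID can also be looked up from the query result. The sketch below assumes that each record's `title` metadata field matches the dataset ID used throughout this document (e.g., `"Kang18_8vs8"`); if the titles differ, the lookup returns no match and the check stops before loading.

```r
library(ExperimentHub)

eh <- ExperimentHub()
q <- query(eh, "muscData")

# accession IDs are the names of the query result
names(q)

# record metadata; row names are the accession IDs
md <- mcols(q)
md$title

# look up the accession ID of one dataset by its title
# (assumption: the title equals the dataset ID)
id <- rownames(md)[md$title == "Kang18_8vs8"]
stopifnot(length(id) == 1)

# load the corresponding object
sce <- eh[[id]]
```

Looking the ID up by title avoids hard-coding an accession number in analysis scripts.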
list/loadResources
Alternatively, available records may be viewed via `listResources`. To then load a specific dataset or subset thereof using `loadResources`, we require a character vector of metadata search terms to filter by.
Available metadata can be accessed from the ExperimentHub records found by `query` via `mcols()`, or viewed using the accessors shown above with option `metadata = TRUE`. In the example below, we use `"PBMC"` and `"INF-beta"` to select the `Kang18_8vs8` dataset. However, note that any metadata keyword(s) that uniquely identify the data of interest could be used (e.g., `"Lupus"` or `"GSE96583"`).
```r
listResources(eh, "muscData")
```

```r
# view metadata
mcols(q)
Kang18_8vs8(metadata = TRUE)

# load data using metadata search terms
loadResources(eh, "muscData", c("PBMC", "INF-beta"))
```
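To work with the loaded data, the result of `loadResources` can be stored in a variable. The sketch below assumes that `loadResources` returns a list with one element per matching record, so the object of interest has to be extracted from that list; this return type is our understanding of the hub interface and is worth verifying against `?loadResources`.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# load all records matching the search terms
# (assumption: the result is a list, one element per match)
res <- loadResources(eh, "muscData", c("PBMC", "INF-beta"))
length(res)

# extract the object itself when exactly one record matched
stopifnot(length(res) == 1)
sce <- res[[1]]
sce
```

If more than one record matches, additional search terms can be added to the character vector until the selection is unique.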
The `r Biocpkg("scater")` package [@McCarthy2017] provides an easy-to-use set of visualization tools for scRNA-seq data.

For interactive visualization, we recommend the `r Biocpkg("iSEE")` (interactive SummarizedExperiment Explorer) package [@Albrecht2018], which provides a Shiny-based graphical user interface for exploration of single-cell data in `SummarizedExperiment` format (installation instructions and user guides are available here).

When dimension-reduced embeddings are available, `r CRANpkg("sleepwalk")` [@Ovchinnikova2018] is a great tool for exploring and comparing them interactively.

sessionInfo()