Zanotelli_2020_Spheroids | R Documentation |
Obtain the Zanotelli_2020_Spheroids dataset, which consists of three data objects: single cell data, multichannel images and cell segmentation masks. The data were obtained by imaging mass cytometry (IMC) of sections of 3D spheroids generated from different cell lines.
Zanotelli_2020_Spheroids(
data_type = c("sce", "spe", "images", "masks"),
version = "latest",
metadata = FALSE,
on_disk = FALSE,
h5FilesPath = NULL,
force = FALSE
)
data_type |
type of object to load: 'images' for multichannel images or 'masks' for cell segmentation masks. Single cell data are retrieved using either 'sce' for the SingleCellExperiment format or 'spe' for the SpatialExperiment format. |
version |
dataset version. By default, the latest version is returned. |
metadata |
if FALSE (default), the data object selected in data_type is returned. If TRUE, only the metadata associated with this object is returned. |
on_disk |
logical indicating if images in the form of HDF5Array objects (as .h5 files) should be stored on disk rather than in memory. This setting is valid when downloading images and masks. |
h5FilesPath |
path to where the .h5 files for on disk representation are stored. This path needs to be defined when on_disk = TRUE. |
force |
logical indicating if images should be overwritten when files with the same name already exist on disk. |
This is an Imaging Mass Cytometry (IMC) dataset from Zanotelli et al. (2020), consisting of three data objects:
images: contains 517 multichannel images, each containing 51 channels, in the form of a CytoImageList class object.
masks: contains the cell segmentation masks associated with the images, in the form of a CytoImageList class object.
sce: contains the single cell data extracted from the multichannel images using the cell segmentation masks, as well as the associated metadata, in the form of a SingleCellExperiment. This represents a total of 229,047 cells x 51 channels.
spe: same single cell data as for sce, but in the SpatialExperiment format.
All data are downloaded from ExperimentHub and cached for local re-use.
Mapping between the three data objects is performed via variables located in their metadata columns: mcols() for the CytoImageList objects and colData() for the SingleCellExperiment and SpatialExperiment objects. Mapping at the image level can be performed with the image_name or image_number variables. Mapping between cell segmentation masks and single cell data is performed with the cell_number variable, the values of which correspond to the intensity values of the masks object. For practical examples, please refer to the "Accessing IMC datasets" vignette.
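The link between mask pixel values and cell_number can be illustrated with a small base R sketch. The mask matrix, the per-cell table, and the marker column below are toy values invented for illustration, not data from this dataset; only the idea that mask intensities equal cell_number comes from the description above.

```r
# Toy segmentation mask: 0 = background, positive integers = cell IDs
mask <- matrix(c(0, 1, 1,
                 0, 1, 2,
                 0, 2, 2), nrow = 3, byrow = TRUE)

# Toy per-cell table; cell_number matches the pixel values of the mask
cells <- data.frame(cell_number = c(1, 2),
                    marker = c(0.2, 0.9))  # hypothetical marker values

# Look up, for every pixel, the row of the cell it belongs to
idx <- match(mask, cells$cell_number)

# Paint each cell's marker value onto its pixels (background becomes NA)
marker_image <- matrix(cells$marker[idx], nrow = nrow(mask))

# Number of pixels covered by each cell
pixel_counts <- table(mask[mask > 0])
```

On the real objects, the same lookup would first be restricted to one image by matching image_name or image_number between mcols() of the masks and colData() of the single cell object.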
This dataset was obtained as follows (the names of the experimental variables, located in the colData of the SingleCellExperiment and SpatialExperiment objects, are indicated in parentheses): i) Cells from four different cell lines (cell_line) were seeded at three different densities (treatment_concentration, relative densities) and grown for either 72 or 96 hours (treatment_time_point, duration in hours). In the appropriate experimental conditions (see the paper for details), the cells aggregate into 3D spheroids. ii) Cells were harvested and pooled into 60-well barcoding plates. iii) A pellet of each spheroid pool was generated and cut into several 6 um-thick sections. iv) A subset of these sections (site_id) was stained with an IMC panel and acquired as one or more acquisitions (acquisition_id) containing multiple spheres each. v) Spheres in these acquisitions were identified by computer vision and cropped into individual images (image_number).
Other relevant cell metadata include:
treatment_name: experimental conditions in the format: "Cell line name"_c"seeding density"_tp"time point".
cell_x/cell_y: cell centroid position in the image.
cell_area: area of the cell (um^2).
distance_rim: estimated distance to spheroid border.
distance_sphere: distance to spheroid section border.
distance_other_sphere: distance to the closest of the other spheroid sections in the same image (if there is any).
distance_background: distance to background pixels.
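Because treatment_name packs three experimental variables into one string, it can be split back into its parts with a regular expression. The following is a minimal sketch in base R; the two example strings ("LineA", "LineB" and their values) are hypothetical and only follow the format described above.

```r
# Hypothetical treatment_name values following the documented format:
# "Cell line name"_c"seeding density"_tp"time point"
treatment_name <- c("LineA_c1_tp72", "LineB_c3_tp96")

# Three capture groups: cell line, seeding density, time point
pattern <- "^(.+)_c([^_]+)_tp([^_]+)$"

parsed <- data.frame(
  cell_line = sub(pattern, "\\1", treatment_name),
  concentration = sub(pattern, "\\2", treatment_name),
  time_point = as.numeric(sub(pattern, "\\3", treatment_name))
)
```

In practice the same information is already available in the cell_line, treatment_concentration and treatment_time_point columns, so parsing is mainly useful as a consistency check.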
For a full description of the other experimental variables, please refer to the publication (https://doi.org/10.15252/msb.20209798) and to the original dataset repository (https://doi.org/10.5281/zenodo.4271910).
The marker-associated metadata, including antibody information and metal tags, are stored in the rowData of the SingleCellExperiment and SpatialExperiment objects. The channels with names starting with "BC_" are the channels used for barcoding. Post-translational modifications of the protein targets are indicated in brackets.
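Barcoding channels are usually excluded before downstream analysis. A minimal sketch of selecting them by the "BC_" prefix is shown below; the channel names are hypothetical placeholders, not the actual panel.

```r
# Hypothetical channel names; only the "BC_" prefix convention is from the docs
channel_names <- c("BC_1", "BC_2", "MarkerA", "MarkerB")

# TRUE for barcoding channels, FALSE for all other channels
is_barcode <- grepl("^BC_", channel_names)

# Channels kept for analysis
analysis_channels <- channel_names[!is_barcode]
```

Assuming the channel names are the row names of the single cell object, the same logical vector computed on rownames(sce) could be used to subset its rows.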
The assay slots of the SingleCellExperiment and SpatialExperiment objects contain three assays:
counts: contains raw mean ion counts per cell.
exprs: contains arsinh-transformed counts, with cofactor 1.
quant_norm: contains counts censored at the 99th percentile and scaled 0-1.
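The relationship between the three assays can be sketched on a toy matrix in base R. The counts below are simulated, and two details are assumptions rather than documented facts: that censoring and scaling are applied per channel to the raw counts, and that "scaled 0-1" means min-max scaling.

```r
set.seed(1)

# Simulated raw counts: 5 channels (rows) x 200 cells (columns)
counts <- matrix(rexp(5 * 200, rate = 0.5), nrow = 5,
                 dimnames = list(paste0("ch", 1:5), NULL))

# exprs: inverse hyperbolic sine transform with cofactor 1
cofactor <- 1
exprs <- asinh(counts / cofactor)

# quant_norm (assumed procedure): cap at the 99th percentile, then min-max scale
censor_scale <- function(x, prob = 0.99) {
  cap <- quantile(x, probs = prob, names = FALSE)
  x <- pmin(x, cap)
  rng <- range(x)
  if (rng[1] == rng[2]) return(rep(0, length(x)))  # constant channel
  (x - rng[1]) / (rng[2] - rng[1])
}

# apply() returns cells x channels, so transpose back to channels x cells
quant_norm <- t(apply(counts, 1, censor_scale))
```

With cofactor 1 the arsinh transform behaves almost linearly for small counts and logarithmically for large ones, which is why no pseudocount is needed.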
In addition, the altExp slot of the SingleCellExperiment object contains another SingleCellExperiment object where the counts matrix represents raw mean ion counts for cells neighboring the current cell.
Neighborhood information, defined here as cells that are localized next to each other, is stored as a SelfHits object in the colPairs slot of the SingleCellExperiment and SpatialExperiment objects. Cells in the SelfHits object are represented by unique integers that map to the cell_number_absolute column of colData(sce).
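How a neighbor pair list relates to per-cell neighborhood averages can be illustrated with a toy example in base R. The counts and the from/to pairs are invented, and the averaging shown is only an assumption about how "mean ion counts for neighboring cells" could be derived; it is not taken from the package code.

```r
# Toy counts: 2 channels x 4 cells
counts <- matrix(c(1, 10, 3, 30, 5, 50, 7, 70), nrow = 2,
                 dimnames = list(c("chA", "chB"), paste0("cell", 1:4)))

# Toy neighbor pairs (both directions listed, as in a SelfHits-like edge list)
pairs <- data.frame(from = c(1, 2, 2, 3, 3, 4),
                    to   = c(2, 1, 3, 2, 4, 3))

# For each cell, average the counts of its neighbors, channel by channel
neighbor_mean <- sapply(seq_len(ncol(counts)), function(i) {
  nb <- pairs$to[pairs$from == i]
  rowMeans(counts[, nb, drop = FALSE])
})
colnames(neighbor_mean) <- colnames(counts)
```

On the real object, the integers in the pair list map to cell_number_absolute, so they would be matched against that column before indexing cells.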
Dataset versions: a version argument can be passed to the function to specify which dataset version should be retrieved.
`v0`: original version (Bioconductor <= 3.15).
`v1`: consistent object formatting across datasets.
File sizes:
`images`: size in memory = 21.2 Gb, size on disk = 860 Mb.
`masks`: size in memory = 426 Mb, size on disk = 12 Mb.
`sce`: size in memory = 564 Mb, size on disk = 319 Mb.
`spe`: size in memory = 596 Mb, size on disk = 320 Mb.
When storing images on disk, they first need to be fully read into memory before being written to disk. Downloading the data this way is therefore slower than keeping them directly in memory. In return, downstream analysis requires far less memory when the images are stored on disk.
Original source: Zanotelli et al. (2020): https://doi.org/10.15252/msb.20209798
Original link to raw data, also containing the entire dataset: https://doi.org/10.5281/zenodo.4271910
A SingleCellExperiment object with single cell data, a SpatialExperiment object with single cell data, a CytoImageList object containing multichannel images, or a CytoImageList object containing cell segmentation masks.
Nicolas Damond
Zanotelli VRT et al. (2020). A quantitative analysis of the interplay of environment, neighborhood, and cell state in 3D spheroids. Mol Syst Biol 16(12), e9798.
# Load single cell data
sce <- Zanotelli_2020_Spheroids(data_type = "sce")
print(sce)
# Display metadata
Zanotelli_2020_Spheroids(data_type = "sce", metadata = TRUE)
# Load masks on disk
library(HDF5Array)
masks <- Zanotelli_2020_Spheroids(data_type = "masks", on_disk = TRUE,
h5FilesPath = getHDF5DumpDir())
print(head(masks))