knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE ) library(BiocStyle)
if (!require("BiocManager")) { install.packages("BiocManager") } BiocManager::install("spicyR")
# load required packages library(spicyR) library(ggplot2) library(SpatialExperiment) library(SpatialDatasets) library(imcRtools) library(dplyr) library(survival)
This guide provides step-by-step instructions on how to apply a linear model to multiple segmented and labelled images to assess how the localisation of different cell types changes across different disease conditions.
We use the @keren2018 breast cancer dataset to compare the spatial distribution of immune cells in individuals with different levels of tumour infiltration (cold and compartmentalised).
The data is stored as a SpatialExperiment
object and contains single-cell spatial data from 41 images.
kerenSPE <- SpatialDatasets::spe_Keren_2018()
The cell types in this dataset includes 11 immune cell types (double negative CD3 T cells, CD4 T cells, B cells, monocytes, macrophages, CD8 T cells, neutrophils, natural killer cells, dendritic cells, regulatory T cells), 2 structural cell types (endothelial, mesenchymal), 2 tumour cell types (keratin+ tumour, tumour) and one unidentified category.
To investigate changes in localisation between two different cell types, we measure the level of localisation between two cell types by modelling with the L-function. The L-function is a variance-stabilised K-function given by the equation
$$ \widehat{L_{ij}} (r) = \sqrt{\frac{\widehat{K_{ij}}(r)}{\pi}} $$
with $\widehat{K_{ij}}$ defined as
$$ \widehat{K_{ij}} (r) = \frac{|W|}{n_i n_j} \sum_{n_i} \sum_{n_j} 1 {d_{ij} \leq r } e_{ij} (r) $$
where $\widehat{K_{ij}}$ summarises the degree of co-localisation of cell type $j$ with cell type $i$, $n_i$ and $n_j$ are the number of cells of type $i$ and $j$, $|W|$ is the image area, $d_{ij}$ is the distance between two cells and $e_{ij} (r)$ is an edge correcting factor.
Specifically, the mean difference between the experimental function and the theoretical function is used as a measure for the level of localisation, defined as
$$ u = \sum_{r' = r_{\text{min}}}^{r_{\text{max}}} \widehat L_{ij, \text{Experimental}} (r') - \widehat L_{ij, \text{Poisson}} (r') $$
where $u$ is the sum is taken over a discrete range of $r$ between $r_{\text{min}}$ and $r_{\text{max}}$. Differences of the statistic $u$ between two conditions is modelled using a weighted linear model.
Firstly, we can test whether one cell type tends to be more localised with another cell type in one condition compared to the other. This can be done using the spicy()
function, where we specify the condition
parameter.
In this example, we want to see whether or not neutrophils (to
) tend to be found around CD8 T cells (from
) in compartmentalised tumours compared to cold tumours. Given that there are 3 conditions, we can specify the desired conditions by setting the order of our condition
factor. spicy
will choose the first level of the factor as the base condition and the second level as the comparison condition. spicy
will also naturally coerce the condition
column into a factor if it is not already a factor. The column containing cell type annotations can be specified using the cellType
argument. By default, spicy
uses the column named cellType
in the SpatialExperiment
object.
spicyTestPair <- spicy( kerenSPE, condition = "tumour_type", from = "CD8_T_cell", to = "Neutrophils" ) topPairs(spicyTestPair)
We obtain a spicy
object which details the results of the
modelling performed. The topPairs()
function can be used to obtain the associated coefficients and p-value.
As the coefficient
in spicyTestPair
is positive, we find that neutrophils are significantly more likely to be found near CD8 T cells in the compartmentalised tumours group compared to the cold tumour group.
We can perform what we did above for all pairwise combinations of cell
types by excluding the from
and to
parameters in spicy()
.
spicyTest <- spicy( kerenSPE, condition = "tumour_type" ) topPairs(spicyTest)
Again, we obtain a spicy
object which outlines the result of the linear
models performed for each pairwise combination of cell types.
We can also examine the L-function metrics of individual images by using the
convenient bind()
function on our spicyTest
results object.
bind(spicyTest)[1:5, 1:5]
The results can be represented as a bubble plot using the signifPlot()
function.
signifPlot( spicyTest, breaks = c(-3, 3, 1), marksToPlot = c("Macrophages", "DC_or_Mono", "dn_T_CD3", "Neutrophils", "CD8_T_cell", "Keratin_Tumour") )
Here, we can observe that the most significant relationships occur between macrophages and double negative CD3 T cells, suggesting that the two cell types are far more dispersed in compartmentalised tumours compared to cold tumours.
To examine a specific cell type-cell type relationship in more detail, we can use spicyBoxplot()
and specify either from = "Macrophages"
and to = "dn_T_CD3"
or rank = 1
.
spicyBoxPlot(results = spicyTest, # from = "Macrophages", # to = "dn_T_CD3" rank = 1)
spicyR
can also be applied to custom distance or abundance metrics. A kNN interactions graph can be generated with the function buildSpatialGraph
from the imcRtools
package. This generates a colPairs
object inside of the SpatialExperiment
object.
spicyR
provides the function convPairs
for converting a colPairs
object into an abundance matrix by calculating the average number of nearby cells types for every cell type for a given k
. For example, if there exists on average 5 neutrophils for every macrophage in image 1, the column Neutrophil__Macrophage
would have a value of 5 for image 1.
kerenSPE <- imcRtools::buildSpatialGraph(kerenSPE, img_id = "imageID", type = "knn", k = 20, coords = c("x", "y")) pairAbundances <- convPairs(kerenSPE, colPair = "knn_interaction_graph") head(pairAbundances["B_cell__B_cell"])
The custom distance or abundance metrics can then be included in the analysis with the alternateResult
parameter. The Statial
package contains other custom distance metrics which can be used with spicy
.
spicyTestColPairs <- spicy( kerenSPE, condition = "tumour_type", alternateResult = pairAbundances, weights = FALSE ) topPairs(spicyTestColPairs)
signifPlot( spicyTestColPairs, breaks = c(-3, 3, 1), marksToPlot = c("Macrophages", "dn_T_CD3", "CD4_T_cell", "B_cell", "DC_or_Mono", "Neutrophils", "CD8_T_cell") )
spicy
can also be used to perform survival analysis to asses whether changes in co-localisation between cell types are associated with survival probability. spicy
requires the SingleCellExperiment
object being used to contain a column called survival
as a Surv
object.
kerenSPE$event = 1 - kerenSPE$Censored kerenSPE$survival = Surv(kerenSPE$`Survival_days_capped*`, kerenSPE$event)
We can then perform survival analysis using the spicy
function by specifying condition = "survival"
. We can then access the corresponding coefficients and p-values by accessing the survivalResults
slot in the spicy
results object.
# Running survival analysis spicySurvival = spicy(kerenSPE, condition = "survival") # top 10 significant pairs head(spicySurvival$survivalResults, 10)
The spicy
function can also account for tissue inhomogeneity to avoid false positives or negatives. This can be done by setting the sigma =
parameter within the spicy function. By default, sigma
is set to NULL
, and spicy
assumes a homogeneous tissue structure.
For example, when we examine the L-function for Keratin_Tumour__Neutrophils
when sigma = NULL
and Rs = 100
, the value is positive, indicating attraction between the two cell types.
# filter SPE object to obtain image 24 data kerenSubset = kerenSPE[, colData(kerenSPE)$imageID == "24"] pairwiseAssoc = getPairwise(kerenSubset, sigma = NULL, Rs = 100) |> as.data.frame() pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
When we specify sigma = 20
and re-calculate the L-function, it indicates that there is no relationship between Keratin_Tumour
and Neutrophils
, i.e., there is no major attraction or dispersion, as it now takes into account tissue inhomogeneity.
pairwiseAssoc = getPairwise(kerenSubset, sigma = 20, Rs = 100) |> as.data.frame() pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
We can use the plotImage
function to plot any pair of cell types for a specific image and visually inspect cellular relationships.
plotImage(kerenSPE, "24", from = "Keratin_Tumour", to = "Neutrophils")
Plotting image 24 shows that the supposed co-localisation occurs due to the dense cluster of cells near the bottom of the image.
spicyR
supports mixed effects modelling when multiple images are obtained for each subject. In this case, subject
is treated as a random effect and condition
is treated as a fixed effect. To perform mixed effects modelling, we can specify the subject
parameter in the spicy
function.
spicyMixedTest <- spicy( diabetesData, condition = "stage", subject = "case" )
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.