BiocStyle::markdown()
library("knitr") opts_chunk$set(stop_on_error = 1L) suppressPackageStartupMessages(library("apoptosisQuantification"))
apoptosisQuantification
is currently under active development. If you
discover any bugs, typos or develop ideas of improving
apoptosisQuantification
feel free to raise an issue via
GitHub or
send a mail to the developer.
To install apoptosisQuantification
enter the following to the R
console
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") if (!requireNamespace("devtools", quietly = TRUE)) BiocManager::install("devtools") devtools::install("tnaake/apoptosisQuantification")
When performing cell line experiments it is of outermost interest to capture variation originating from the treatment, while controlling external effects. One such effect is the onset of apoptosis that results in cell death and might overlay the biological effects of interest.
We present here the apoptosisQuantification
package that allows to quantify
the level and effects of apoptosis (and necroptosis). We use here a list of
markers that are deregulated in apoptosis and necroptosis.
Classical necrotic markers include lactate deydrogenase (LDH),high-mobility group B1 (HMGB1), myoglobin, enolase, and 14-3-3 proteins. Commonly, a protein release is observed during apoptosis and necroptosis for human myeloid/lymphoma cell lines and human primary macrophage cells as observed time course experiments (cf. @Tanzer2020). The work of @Tanzer2020 looked at differential expression of proteins during a time-course experiments where histiocytic lympoma cell lines (U937) were induced to apoptosis (TNF+SM treatment, SM: birinapant, TNF: tumor necrosis factor) and necroptosis (TNF+SM+Idun-6556, Idun-6556: caspase inhibitor IDUN-6556).
The readMarkers
function loads a list of apoptosis or necroptosis markers
taken from the time-course experiment of @Tanzer2020.
We first load a tibble
that stores the marker proteins together with the
observed maximum absolute fold change for significant time points and the
number of time points when the difference between treated and control cell lines
were significant.
markers_apoptosis <- readMarkers(type = "apoptosis", fc = 2, n = 1) markers_necroptosis <- readMarkers(type = "necroptosis", fc = 2, n = 1)
The returned tibble
contains three columns:
feature
: name of the marker, change
: maximum/minimum fold change from the time-course experiment
(absolute value is taken to define the maximum change), significant_n
: number of significant time-points.Alternatively, we load a list of hallmark genes that are changed in
programmed cell death. The list is taken from @Liberzon2015
(accessed via
GSEA).
In the following, we create a tibble containing these proteins in the
feature
column.
## truncate the markers with f <- system.file("protein_markers/geneset_HALLMARK_APOPTOSIS.RDS", package = "apoptosisQuantification") hallmark_genes <- readRDS(f) hallmark_genes <- tibble(feature = hallmark_genes)
This approach looks at the distribution of the loadings (from PCA) from markers and non-markers. If there are effects between the different samples caused by apoptosis/necroptosis we expect that the distribution of marker loadings is higher than the distribution of non-marker loadings.
We first load the data set.
f <- system.file("protein_datasets/tanzer2020.RDS", package = "apoptosisQuantification") tanzer2020 <- readRDS(f) ## obtain the assay slot (contains the transformed intensities) library(SummarizedExperiment) prot <- assay(tanzer2020)
We then calculate the loadings of markers and non-markers. We will look here at loadings of principal component 1 and visualize the results (loadings of markers vs. loadings of non-markers).
## plot the distribution as ECDF plotECDF(values = prot, markers = markers_apoptosis, PC = "PC1") ## histogram (distribution of loadings) plotHistogram(values = prot, markers = markers_apoptosis, PC = "PC1") ## violin plot (distribution of loadings) plotViolin(values = prot, markers = markers_apoptosis, PC = "PC1")
We perform a Wilcoxon rank sum test. If significant, it is indicated that the mean of the marker loadings are higher than the mean of the non-marker loadings.
performWilcoxonTest(values = prot, markers = markers_apoptosis, PC = "PC1")
Apoptotic cells show an increase in the proportion of gene expression which originates from the mitochondria, which occurs after the opening of the MPT pore as apoptosis begins. This gradually results in a decrease in the total expression of chromosomal genes as the DNA is degraded. In late apoptosis, there is typically a drastically reduced total count of genes in a cell, with a large proportion of the gene expression stemming from mitochondria.
Note, that the proportions of proteomics datasets might not directly interpreted as done in transcriptomics due to the specificities of proteomics data acquisition.
The function \code{calculateProportionOfMitochondrialProteins} calculates proportions of the intensities of mitochondrial proteins (i.e. proteins that are encoded in the mitochondrial DNA) to all intensities. The function returns the proportions in percent.
Mitochondrial proteins are defined by the Human Protein Atlas
(1139 proteins found by searching the Human Protein Atlas database for
mitochondrial sublocalization). These proteins are matched against the
proteins found in prot
. Currently, two types of ids are
encoded: either rownames(prot)
have to be "Symbol" or "Ensembl" ids.
Protein intensities have to in untransformed format such that proportions can be calculated as are.
## first back-transform to raw-values prot_raw <- exp(prot)
The protein ids of tanzer2020
are SYMBOL. We specify the id
argument
as "Symbol"
.
prop_mito <- calculateProportionOfMitochondrialProteins(values = prot_raw, id = "Symbol")
Using the function plotProportionOfMitochondrialProteins
we visualize the
proportion of mitochondrial proteins.
plotProportionOfMitochondrialProteins(proportion = prop_mito)
Another approach is to calculate contamination scores using the decoupleR
package [@Badia2021]. decouple
calculates the source activity per sample by
coupling a regulatory network with the weighted mean (run_wmean
) statistic.
The marker proteins correspond to the target nodes, while the contamination
("apoptosis"
or "necroptosis"
) corresponds to the source nodes. The
fold changes are taken as the mode of regulation (mor
).
Signature scores represent the number of standard deviations away from the
mean of an empirical null distribution of scores.
The contamination scores are calculated per sample. Per default, the markers
are taken from readMarkers
when there is no signatures
argument passed
to scoreSamples
. Alternatively, a tibble
can be passed to the argument
signatures
that contains the columns:
feature
: name of marker proteins,change
: associated negative or positive fold change associated with
the marker.If signatures
is passed as an argument to scoreSamples
the contamination
scores are calculated based on the markers defined in this object.
scores <- scoreSamples(values = prot, contamination = "apoptosis", n_perm = 100)
We visualize the scores using the plotSampleScores
function.
scores$treatment <- tanzer2020$treatment plotSampleScores(scores = scores) + ggplot2::facet_wrap(~ treatment, scales = "free_x") + ggplot2::theme(legend.position = "none")
Alternatively, the contaminations can also be assessed in a relative manner
by looking at the distributions of differences between the marker
proteins and a random subset (possibly including the marker proteins).
To do this, a signatures
object (tibble
) has to be passed that
does not contain the column change
(i.e. whenenver there is the column
change
in signatures
the decoupleR
approach will be taken).
As an example we use here the hallmark_genes
object that contains a list
of proteins (stored in the column feature
).
hallmark_genes
We then call the function scoreSamples
. We specify that we draw 100 random
proteins from prot
(n_perm
) and specify the marker gene set by
the signatures
argument.
scores <- scoreSamples(values = prot, contamination = "apoptosis", n_perm = 100, signatures = hallmark_genes)
Finally, we plot the scores.
plotSampleScores(scores = scores)
All software and respective versions to build this vignette are listed here:
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.