all_times <- list()  # store the time for each chunk
knitr::knit_hooks$set(time_it = local({
  now <- NULL
  function(before, options) {
    if (before) {
      now <<- Sys.time()
    } else {
      res <- difftime(Sys.time(), now, units = "secs")
      all_times[[options$label]] <<- res
    }
  }
}))
knitr::opts_chunk$set(
  tidy = TRUE,
  tidy.opts = list(width.cutoff = 95),
  message = FALSE,
  warning = FALSE,
  time_it = TRUE
)

suppressMessages(library(scRepertoire))
data("contig_list") 
combined.TCR <- combineTCR(contig_list, 
                           samples = c("P17B", "P17L", "P18B", "P18L", 
                                            "P19B","P19L", "P20B", "P20L"))

clonalDiversity

Diversity can also be measured for samples or by other variables. Diversity metrics calculated, include: "shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", and "ACE". Please see the manual for more information on each metric and the underlying calculations.

Inherent in diversity calculations is a bias for increasing diversity with increasing repertoire size. clonalDiversity() will automatically downsample to the smallest repertoire size and perform bootstrapping to return the mean diversity estimates. If the output of diversity values are strange or minimally variable, it is likely due to a sample with small repertoire size.

n.boots
The number of calculations to perform (default = 100).

return.boots

skip.boots
Skip the bootstrapping calculations.

clonalDiversity(combined.TCR, 
                cloneCall = "gene")

There are 2 options for grouping in clonalDiversity() and it is important to understand the difference.

group.by

x.axis

combined.TCR <- addVariable(combined.TCR, 
                            variable.name = "Patient", 
                             variables = c("P17", "P17", "P18", "P18", 
"P19","P19", "P20", "P20"))

clonalDiversity(combined.TCR, 
                cloneCall = "gene", 
                group.by = "Patient")

clonalDiversity(combined.TCR, 
                cloneCall = "gene", 
                x.axis = "Patient")

As a default, clonalDiversity() will return all the metrics calculated - "shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", and "ACE". Selecting a single or a subset of these methods using the metrics parameter.

#Return only a subset of metrics
clonalDiversity(combined.TCR, 
                metrics = c("shannon", "ACE"),
                cloneCall = "gene", 
                n.boots = 20)

clonaRarefaction

We can also use Hill numbers to estimate the rarefaction, or estimating species richness, using the the abundance of clones across groupings. Underlying the rarefaction calculation is the use of observed receptor of abundance to compute diversity.

hill.numbers

plot.type

This function relies on the iNEXT with the accompanying manuscript. Like the other wrapping functions in scRepertoire, please cite the original work. The sample completeness curve (plot.type = 2), may not show full sample coverage due to the size/diversity of the input data.

Rarefaction using Species Richness (q = 0)

clonalRarefaction(combined.TCR,
                  plot.type = 1,
                  hill.numbers = 0,
                  n.boots = 2)

clonalRarefaction(combined.TCR,
                  plot.type = 2,
                  hill.numbers = 0,
                  n.boots = 2)

clonalRarefaction(combined.TCR,
                  plot.type = 3,
                  hill.numbers = 0,
                  n.boots = 2)

Rarefaction using Shannon Diversity (q = 1)

clonalRarefaction(combined.TCR,
                  plot.type = 1,
                  hill.numbers = 1,
                  n.boots = 2)

clonalRarefaction(combined.TCR,
                  plot.type = 2,
                  hill.numbers = 1,
                  n.boots = 2)

clonalRarefaction(combined.TCR,
                  plot.type = 3,
                  hill.numbers = 1,
                  n.boots = 2)

clonalSizeDistribution

Another method for modeling the repertoire distribution is a discrete gamma-GPD spliced threshold model, proposed by Koch et al. The spliced model models the repertoire and allows for the application of a power law distribution for larger clonal-expanded sequences and a Poisson distribution for smaller clones. After fitting the models, repertoires can be compared using Euclidean distance.

If using this function, please read/cite Koch et al. and check out the powerTCR R package.

clonalSizeDistribution(combined.TCR, 
                       cloneCall = "aa", 
                       method= "ward.D2")

clonalOverlap

If you are interested in measures of similarity between the samples loaded into scRepertoire, using clonalOverlap() can assist in the visualization.

The underlying clonalOverlap() calculation varies by the method parameter, more information on the exact calculations are available in the manual.

method

clonalOverlap(combined.TCR, 
              cloneCall = "strict", 
              method = "morisita")

clonalOverlap(combined.TCR, 
              cloneCall = "strict", 
              method = "raw")


ncborcherding/scRepertoire documentation built on Jan. 4, 2025, 3:23 a.m.