Microsatellite instability (MSI) is a feature of tumor genomes that has long been studied. A basic source of interest in MSI is the observation that tumors exhibiting MSI are often less aggressive. One theory is that these tumors produce neoantigens that are more likely to stimulate immune responses that reduce risk of metastasis.
This package provides data structures based on tables published with recent studies of MSI in TCGA samples.
Three supplemental tables have been transformed to SummarizedExperiment
instances. These are devoted to gene-centric enumeration of MSI
events in CDS (molpo_CDS
), 3'UTR (molpo_3utr
),
and 5'UTR (molpo_5utr
).
suppressPackageStartupMessages({ library(curatedMSIData) library(ggplot2) }) molpo_CDS
With the following code we approximately reproduce Figure 2a of the paper. We sorted by 'locus-specific' MSI recurrence, then grouped by gene symbol. CASP5 occurs twice in Figure 2a; the figure below sums all the events tabulated for CASP5. How to dissect the annotation of MSI enumeration for this gene is not immediately clear. The same situation arises with gene MAK16 for 3'UTR.
sev = apply(assay(molpo_CDS),1,sum) litass = assay(molpo_CDS[names(sort(sev,decreasing=TRUE)[1:50]),]) dd = data.frame(t(litass), tumor=molpo_CDS$tumor_type) library(dplyr) library(magrittr) names(dd) = gsub("_.*", "", names(dd)) library(reshape2) ddm = melt(dd) names(ddm)[2] = "gene" names(ddm)[3] = "msi_cds" ggplot(ddm %>% filter(msi_cds>0), aes(x=gene,fill=tumor)) + geom_bar() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
Figure 3a of the paper illustrates variation within and between tumor types in total numbers of MSI events detected in whole genome sequences. The gestalt of this figure can be obtained through the following code:
lit = molpo_WGS[1,molpo_WGS$tumor_type %in% c("UCEC", "BRCA", "KIRP")] myd = data.frame(totmsi=t(assay(lit[1,])), tumor=lit$tumor_type) ss = split(myd, myd$tumor) ss[[1]]$x = 1:nrow(ss[[1]]) ss[[2]]$x = 1:nrow(ss[[2]]) ss[[3]]$x = 1:nrow(ss[[3]]) ssd = do.call(rbind,ss) ggplot(ssd, aes(y=Total_number_MSI_events+1,x=x)) + geom_point() + facet_grid(.~tumor, scales="free_x")+ scale_y_log10()
The MSIsensor scores from the paper of Ding et al. are provided in a simple data.frame. The gestalt of their Figure 3C can be obtained via:
data(MSIsensor.10k) data(patient_to_tumor_code) names(MSIsensor.10k)[1] = "patient_barcode" mm = merge(MSIsensor.10k, patient_to_tumor_code) ggplot(mm, aes(y=MSIsensor.score+1, x=tumor_code)) + geom_boxplot() + coord_flip() + scale_y_log10()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.